
Federal Reserve Economic Data Estimation and Projection

ESE 499 Proposal
February 10, 2014 – April 9, 2014

Submitted to:
Professor Dennis Mell
Systems Science and Engineering
Washington University
March 23, 2014

Submitted by:

Sean Feng
Senior in Systems Engineering and Finance
665 South Skinker Boulevard
St. Louis, MO 63105
(989) 488-3991
sean.feng@wustl.edu

Peter Suyderhoud
Senior in Systems Engineering and Economics
7116 Forsyth Boulevard
St. Louis, MO 63105
(703) 405-5340
petersuyderhoud@gmail.com

Under the direction of:

Mr. Keith Taylor
Data Desk, Research Division
Federal Reserve Bank of St. Louis
One Federal Reserve Bank Plaza
Broadway and Locust Streets
St. Louis, MO 63101
(314) 444-4211
Keith.g.taylor@stls.frb.org

Abstract of Federal Reserve Economic Data Estimation and Projection
Sean Feng and Peter Suyderhoud

The aim of this project is to apply systems engineering techniques to accurately estimate and predict economic metrics, including Gross Domestic Product (GDP), as reported by the Federal Reserve Bank of St. Louis (St. Louis Fed). As the St. Louis Fed undertakes initiatives to expand the number of series it tracks and reports, its Data Desk is facing difficulties in confirming data estimates and generating new projections. Our work will serve as a preliminary foundation for the St. Louis Fed to more effectively validate data for its vintages. Using both archival and real-time data from Archival Federal Reserve Economic Data (ALFRED), we will apply concepts such as random processes and statistical regression to update existing estimates and generate new predictions for the coming period. We will implement our algorithms in Matlab and similar coding platforms in order to provide an initial framework that the St. Louis Fed can adapt for its proprietary software.

Resource Requirements

In order to complete our project, we will need computer time in the lab, access to Matlab and other similar platforms, advising time with ESE and economics professors, and client time.

1. Historical Economic Data from ALFRED
   a. Client has provided us with sufficient data
2. Faculty Advising Time
   a. Randall Hoven (Electrical and Systems Engineering)
   b. Two hours/month
3. Client Meeting Time
   a. Keith Taylor
   b. Five hours/month
4. Computer Access
   a. Urbauer 214 computer lab
   b. Matlab, Stata
   c. Estimated 50 hours of usage


Table of Contents

Abstract
Resource Requirements
Introduction
Problem Statement
Proposed Technical Approaches
Proposed Specific Tasks
Expected Solutions and Recommendations
Deliverables
Schedule


Introduction

Federal Reserve Economic Data (FRED) is an online database maintained by the Research Division of the Federal Reserve Bank of St. Louis, containing more than 140,000 economic time series from approximately 60 sources. Examples of these time series include consumer price indexes, employment, population, exchange rates, interest rates, and gross domestic product. The series are collected from agencies such as the U.S. Census Bureau and the Bureau of Labor Statistics and compiled by the Data Desk, the team responsible for all data found in FRED. In addition to tracking the real-time observations of each time series, the Data Desk also keeps vintages of these series for revision purposes. On its initial release date, each observation in a time series is assigned a preliminary value, which is subsequently revised over the following period. Each observation is revised twice in total; the third value becomes the final observation.
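To make the vintage structure concrete, the following is a minimal Python sketch (the proposal targets Matlab, but the structure translates directly). The release dates and values shown are hypothetical examples, not actual FRED data:

```python
# Minimal sketch of a vintage record for one quarterly GDP observation.
# Each vintage date maps to the value as reported on that date; the dates
# and numbers below are hypothetical.
from collections import OrderedDict

gdp_2013q4 = OrderedDict([
    ("2014-01-30", 17102.5),  # initial (preliminary) estimate
    ("2014-02-28", 17089.6),  # first revision
    ("2014-03-27", 17080.7),  # second revision -> treated as final
])

final_value = list(gdp_2013q4.values())[-1]
print(f"Final observation for 2013Q4 GDP: {final_value}")
```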

Problem Statement

Last year, the Data Desk increased the number of series in FRED by 100%, and it plans to increase it by the same amount for the year ending December 2014. As the size of the database grows, the Data Desk is having difficulty adequately validating the data before writing it to the database. The current validation method requires an excess of time and resources, as it involves error checks on each individual data entry. As the St. Louis Fed rapidly scales up FRED, it faces the challenge of validating new data quickly enough to keep up with new entries.

We aim to create an algorithm that uses existing vintages to generate an estimate for the coming period, which will then be compared to the actual vintage when it is released. We will need to differentiate pure economic noise from human error and methodology revisions. We intend to develop this algorithm specifically for GDP data, since it is the most complete and least noisy data series. This estimation will allow the Data Desk to quickly validate a new vintage entry by comparing its values to the generated estimate. If the values fall within a predetermined confidence interval, the Data Desk can validate the entire vintage without reviewing each individual data point in the series.
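As an illustration of the intended validation step, here is a minimal Python sketch; the function name, variable names, and numbers are our own hypothetical choices, and the real implementation would live in Matlab per the proposal:

```python
import numpy as np

def validate_vintage(actual, estimate, stderr, z=1.96):
    """Flag vintage entries that fall outside the estimate's confidence band.

    actual, estimate, and stderr are arrays of equal length; z = 1.96 gives
    an approximate 95% interval. Returns indices needing manual review.
    """
    actual, estimate, stderr = map(np.asarray, (actual, estimate, stderr))
    outside = np.abs(actual - estimate) > z * stderr
    return np.flatnonzero(outside)

# Hypothetical example: one entry deviates far beyond its error band.
flags = validate_vintage(
    actual=[100.2, 101.0, 250.0],
    estimate=[100.0, 101.1, 102.0],
    stderr=[0.5, 0.5, 0.5],
)
print("Entries to review:", flags)  # -> [2]
```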

Proposed Technical Approaches

Our project will rely primarily on statistical approaches highlighted in the systems engineering and economics curricula at Washington University in St. Louis. The main focus of this project will be on developing statistical models that make the best use of the extensive historical data to revise and predict new data. Potential topics required for the project include, but are not limited to: Random Processes and Kalman Filtering, Probability and Statistics, Operations Research, Optimization, Algorithms and Data Structures, and Econometrics.

Random Processes. The input data itself contains considerable random noise. This noise can be divided into three categories: economic noise, methodology revisions, and human error. Our algorithm will primarily address the effects of economic noise and methodology revisions, while disregarding human error, since raw data with this error already edited out is available. In addition, we will need to set acceptable error margins for our output, given the level of input error and our tolerance for error.
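As a concrete illustration of the filtering idea, here is a minimal Python sketch of a scalar Kalman filter for a local-level (random walk plus noise) model. The noise variances and the test series are hypothetical; the actual model choice will come out of our analysis:

```python
import numpy as np

def kalman_filter(y, q=0.5, r=1.0):
    """Scalar Kalman filter for a local-level model.

    y: observed series; q: process-noise variance; r: measurement-noise
    variance. Returns the filtered state estimates.
    """
    n = len(y)
    x = np.zeros(n)   # filtered state estimates
    p = 1e6           # initial state variance (diffuse prior)
    x_prev = y[0]
    for k in range(n):
        # Predict: state carries over, uncertainty grows by q.
        x_pred, p_pred = x_prev, p + q
        # Update: blend prediction with observation via the Kalman gain.
        gain = p_pred / (p_pred + r)
        x[k] = x_pred + gain * (y[k] - x_pred)
        p = (1 - gain) * p_pred
        x_prev = x[k]
    return x

# Hypothetical noisy level series.
rng = np.random.default_rng(0)
y = np.cumsum(rng.normal(0, 0.5, 40)) + rng.normal(0, 1.0, 40)
print(kalman_filter(y)[-5:])
```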

Statistical Model Identification. In order to properly regress the data to provide revisions and predictions, we will need to identify an appropriate mathematical model that relates the historical data to the newest vintage. We will explore a variety of models, including linear, exponential, logarithmic, and polynomial regression techniques. We must also determine the limits for the scope of the input, in order to remove irrelevant or outdated data from our model.

Algorithms and Data Analysis. After distinguishing the input error and identifying a model, we will use Matlab or another platform to develop code that takes historical data as input and generates revisions and predictions. This process will require us to explore different implementation techniques, since not all methods will provide equally accurate or plausible results. A brief sketch combining these two steps follows below.
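The following minimal Python sketch fits the candidate model families and compares their in-sample fit; the model set, the degree-2 polynomial, and the test series are illustrative only, and real model selection would also use out-of-sample validation:

```python
import numpy as np

def compare_models(t, y):
    """Fit candidate trend models and report in-sample RMSE for each.

    Candidates mirror the families under consideration: linear, polynomial,
    exponential, and logarithmic.
    """
    results = {}
    # Linear and quadratic fits via least squares.
    for name, deg in [("linear", 1), ("quadratic", 2)]:
        coef = np.polyfit(t, y, deg)
        results[name] = np.sqrt(np.mean((np.polyval(coef, t) - y) ** 2))
    # Exponential: fit a line to log(y), then map back (requires y > 0).
    coef = np.polyfit(t, np.log(y), 1)
    results["exponential"] = np.sqrt(
        np.mean((np.exp(np.polyval(coef, t)) - y) ** 2))
    # Logarithmic: regress y on log(t) (requires t > 0).
    coef = np.polyfit(np.log(t), y, 1)
    results["logarithmic"] = np.sqrt(
        np.mean((np.polyval(coef, np.log(t)) - y) ** 2))
    return results

# Hypothetical upward-trending series.
t = np.arange(1, 41, dtype=float)
y = 100 * np.exp(0.01 * t) + np.random.default_rng(1).normal(0, 0.5, 40)
print(compare_models(t, y))
```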

Proposed Specific Tasks

First, we must categorize the error in the input data. We know that the GDP data improves over time as more data becomes available. Moreover, we know that predictions are typically made quarterly and then revised monthly, such that there is one initial estimate followed by two revisions. By the third month of the quarter, the value tends to stay constant until there is a fundamental methodology change. Our task with respect to the input error is to estimate the changes in the revisions while filtering out the effect of methodology changes, since these are infrequent and announced ahead of time.

Second, we must identify the dynamics of GDP. For example, we should determine whether there is any inertia to GDP, or whether we can determine a probability distribution on x(k+1) given good values for x(k-n) through x(k). After answering such questions, we will be able to determine which mathematical model to use for our statistical regression. Then, we will use a program such as Matlab or Stata to implement our statistical model and produce the necessary revisions and predictions for the newest vintage. Finally, we will backtest our algorithm against the historical data to assess how valid the revisions and predictions are.

Afterwards, we will make the final code as user friendly as possible so that the client will be able to interpret the results. We also aim to add a place to input the new vintage so that the client can easily compare the estimate with the actual observation. We will prepare our findings as a deliverable to the client and explain ways to apply our findings for GDP to other data sets, then post our findings to the project website and submit a final report to the client.
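As an illustration of the second task, a simple autoregressive fit can test whether x(k+1) is predictable from x(k-n) through x(k). This Python sketch is our own construction; the order p = 2 and the sample series are hypothetical, and identifying the right order (or whether GDP has such inertia at all) is part of the project:

```python
import numpy as np

def fit_ar(y, p=2):
    """Fit an AR(p) model y[k] = c + a1*y[k-1] + ... + ap*y[k-p] by least squares."""
    y = np.asarray(y, dtype=float)
    # Build the lag matrix: column i holds the (i+1)-step lagged values.
    X = np.column_stack([y[p - i - 1 : len(y) - i - 1] for i in range(p)])
    X = np.column_stack([np.ones(len(X)), X])  # intercept column
    coef, *_ = np.linalg.lstsq(X, y[p:], rcond=None)
    return coef  # [c, a1, ..., ap]

def predict_next(y, coef):
    """One-step-ahead prediction x(k+1) from the last p observations."""
    p = len(coef) - 1
    return coef[0] + np.dot(coef[1:], np.asarray(y)[-1 : -p - 1 : -1])

# Hypothetical backtest: fit on all but the last point, then predict it.
y = [100.0, 100.8, 101.5, 102.1, 102.9, 103.4, 104.2, 104.9]
coef = fit_ar(y[:-1], p=2)
print(predict_next(y[:-1], coef), "vs actual", y[-1])
```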

Expected Solutions and Recommendations

We expect to provide the client with a coding foundation that outputs an estimated vintage with a given error margin, which the client can use to quickly validate new vintages. Since GDP is the most standardized data set within ALFRED, we expect that extending this foundation to less established economic metrics may add further challenges beyond the scope of the project. However, we will aim to provide the client with direction on how to apply the code to other data sets, including actionable steps to replicate the process.

Deliverables Our primary deliverable will be the algorithm foundation for the statistical regression. This algorithm will take a data set as an input, as well as initial parameters such as acceptable error margins, and output a vintage estimation. In addition, we will provide a memorandum on how the client can use this algorithm for estimating other data sets.
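To indicate the shape of this deliverable, here is a minimal Python sketch of the intended interface. All names, the default margin, and the placeholder last-vintage estimate are hypothetical; the delivered algorithm would substitute the regression model described above:

```python
import numpy as np

def estimate_vintage(history, error_margin=0.02):
    """Sketch of the deliverable's interface.

    history: 2-D array of past vintages (one row per vintage);
    error_margin: acceptable relative deviation. As a placeholder, the
    estimate is simply the latest vintage. Returns the estimate and the
    lower/upper bounds the Data Desk would check a new release against.
    """
    history = np.asarray(history, dtype=float)
    estimate = history[-1]
    band = error_margin * np.abs(estimate)
    return estimate, estimate - band, estimate + band

# Hypothetical usage with three past vintages of a two-point series.
est, lo, hi = estimate_vintage([[1.0, 2.0], [1.1, 2.1], [1.05, 2.15]])
print(est, lo, hi)
```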

Schedule

February 10 – 14, 2014: Write and finalize proposal for client
February 15 – 22, 2014: Meet with professors to discuss solutions
February 23 – March 1, 2014: Meet with client to revise proposal
March 2 – 8, 2014: Outline solution and structure algorithm
March 9 – 15, 2014: Spring break
March 16 – 22, 2014: Write algorithm code
March 23 – 29, 2014: Test code with vintage data
March 30 – April 5, 2014: Draft and finish final paper
April 6 – 9, 2014: Submit final paper; publish models and findings via website

