Unless otherwise indicated, all years referred to in this report are calendar years. The figures in this report use shaded vertical bars to indicate periods of recession. Those bars extend from the peak to the trough of the recession. Numbers in the text and tables may not add up to totals because of rounding.
This report supplements the Congressional Budget Office's (CBO's) The Economic and Budget Outlook: An Update (July 1, 1999). In accordance with CBO's mandate to provide objective and impartial analysis, it contains no recommendations.
The analysis was prepared by Matthew Salomon under the direction of
Robert Dennis, Kim J. Kowalewski, and John F. Peterson. Ezra Finkin and
Michael Simpson provided research assistance. Leah Mazade edited the report,
and Sherry Snyder proofread it. Kathryn Quattrone prepared the report for
final publication, and Laurie Brown prepared the electronic versions for
CBO's World Wide Web site (www.cbo.gov).
Dan L. Crippen
Director
July 30, 1999
MEASURING THE QUALITY OF FORECASTS
APPENDIX: SOURCES OF DATA FOR THE EVALUATION
TABLES
1. Summary Measures of CBO, Administration, and Blue Chip Forecasting Performance
A-1. Comparison of CBO, Administration, and Blue Chip Forecasts of Two-Year Average Growth Rates for Nominal Output
A-2. Comparison of CBO, Administration, and Blue Chip Forecasts of Two-Year Average Growth Rates for Real Output
A-3. Comparison of CBO, Administration, and Blue Chip Forecasts of Two-Year Average Inflation Rates in the Consumer Price Index
A-4. Comparison of CBO, Administration, and Blue Chip Forecasts of Two-Year Average Nominal Interest Rates on Three-Month Treasury Bills
A-5. Comparison of CBO, Administration, and Blue Chip Forecasts of Two-Year Average Interest Rates on Ten-Year Treasury Notes
A-6. Comparison of CBO, Administration, and Blue Chip Forecasts of Two-Year Average Real Interest Rates on Three-Month Treasury Bills
A-7. Comparison of CBO and Administration Forecasts of the Two-Year Change in Wage and Salary Disbursements Plus Corporate Book Profits as a Share of Output
A-8. Comparison of CBO, Administration, and Blue Chip Projections of Five-Year Average Growth Rates for Nominal Output
A-9. Comparison of CBO, Administration, and Blue Chip Projections of Five-Year Average Growth Rates for Real Output
FIGURES
1. Actual and Forecast Two-Year Average Rates of Growth for Nominal Output
2. Distribution of Errors by Blue Chip Forecasters in Forecasting Two-Year Average Growth of GDP
3. Wage and Salary Disbursements Plus Corporate Book Profits
4. Distribution of Errors by Blue Chip Forecasters in Forecasting Two-Year Average Growth in Corporate Profits
5. Statistical Discrepancy
6. Wage and Salary Disbursements
Since publishing its first macroeconomic forecast in 1976, the Congressional Budget Office (CBO) has compiled a forecasting track record comparable in quality with those of a sizable sample of private-sector forecasters as well as five Administrations. CBO's errors for forecasts looking two years ahead that were made between 1982 and 1997 did not differ markedly from either those of the Administration or the central tendency of the 50 or so forecasts that have made up the Blue Chip survey over the years. Comparing CBO's forecasts with that survey suggests that when CBO's economic predictions missed the mark by a margin wide enough to contribute to sizable misestimates of the deficit or surplus, those errors probably reflected limitations that confronted all forecasters. That result is not surprising because all forecasters, when making their predictions, have the same basic information available about the state of the economy, which they may then interpret differently. Moreover, CBO examines other forecasts when constructing its own, and CBO's forecast in turn may affect others in a similar way.
Because forecasters have underestimated real growth and overestimated
inflation in recent years, CBO focused on the errors in its forecasts made
in early 1996 and early 1997. (See the appendix for
sources of the data used in the evaluation and details of CBO's track record.)
As it turns out, CBO's errors in those forecasts were, in most cases, quite
similar to those of the Administration and the Blue Chip forecasters.
Those conclusions echo the findings of studies of earlier periods by CBO
and by other government and academic reviewers.
Measuring the Quality of Forecasts
Following earlier studies of economic forecasts, the evaluation of CBO's forecasts focused on two aspects of their quality: statistical bias and accuracy. Other desirable characteristics--such as the efficiency of a forecast, which is discussed later--are harder to assess definitively and would require a larger sample than is available for CBO's forecasts.
Bias
The statistical bias of a forecast is the extent to which the forecast can be expected to differ from what actually occurs. CBO's evaluation used the mean error to measure statistical bias. That statistic--the arithmetic average of all the forecast errors--is the simplest and most widely used measure of forecast bias. Because the mean error is a simple average, however, underestimates and overestimates offset each other in calculating it. As a result, the mean error imperfectly measures the quality of a forecast--a small mean error would result either if all the errors were small or if all the errors were large but the overestimates and underestimates happened to balance each other out.
Accuracy
The accuracy of a series of forecasts is the degree to which their values are narrowly dispersed around actual outcomes. Measures of accuracy more clearly reflect the usual meaning of forecast quality than does the mean error. CBO's evaluation used two measures of accuracy. The mean absolute error--the average of the forecast's errors without regard to arithmetic sign--indicates the average distance between forecasts and actual values without regard to whether individual forecasts are overestimates or underestimates. The root mean square error--calculated by first squaring all the errors, then taking the square root of the arithmetic average of the squared errors--also shows the size of the error without regard to sign, but it gives greater weight to larger errors.
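As a rough illustration of the three statistics described above (the mean error, the mean absolute error, and the root mean square error), the following Python sketch computes them for a small set of forecasts. The function name and all of the numbers are hypothetical and are not drawn from CBO's data. The error is taken to be the predicted value minus the actual value, as in the notes to Figure 2.

```python
import math


def forecast_error_summary(forecasts, actuals):
    """Return (mean error, mean absolute error, root mean square error).

    Each error is the forecast minus the actual value, so a positive
    error is an overestimate and a negative error is an underestimate.
    """
    if len(forecasts) != len(actuals) or not forecasts:
        raise ValueError("need two non-empty series of equal length")
    errors = [f - a for f, a in zip(forecasts, actuals)]
    n = len(errors)
    mean_error = sum(errors) / n                       # bias: signs offset
    mean_abs_error = sum(abs(e) for e in errors) / n   # accuracy: sign ignored
    rmse = math.sqrt(sum(e * e for e in errors) / n)   # accuracy: large errors weigh more
    return mean_error, mean_abs_error, rmse


# Hypothetical growth rates (percent); the errors are +1, -1, +2, -2.
forecasts = [3.0, 2.0, 4.0, 1.0]
actuals = [2.0, 3.0, 2.0, 3.0]
me, mae, rmse = forecast_error_summary(forecasts, actuals)
print(f"mean error = {me:.2f}, MAE = {mae:.2f}, RMSE = {rmse:.2f}")
```

In this made-up sample the overestimates and underestimates cancel, so the mean error is zero even though every forecast missed by at least 1 percentage point. The mean absolute error (1.5) and the root mean square error (about 1.58) both expose those misses, and the root mean square error is the larger of the two because it gives more weight to the errors of 2 percentage points.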
Other Measures of Forecast Quality
In addition to the three statistical indicators noted above, there are many other measures of a forecast's quality. To test for statistical bias in CBO's forecasts, studies by analysts outside CBO have used measures that are slightly more elaborate than the mean error. Those studies have generally concluded, as does this evaluation, that CBO's short-term economic forecasts do not contain a statistically significant bias.(1)
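Footnote 1 identifies one of those more elaborate measures as a linear regression of actual values on forecast values. The sketch below shows, under stated assumptions, what such a regression estimates: for an unbiased forecast, the fitted intercept would be near zero and the fitted slope near one. It is not part of CBO's evaluation, which did not use that approach because of the small sample. The function name and the data are hypothetical, and the sketch stops at the coefficients without carrying out the significance test that a formal study would require.

```python
def ols_line(x, y):
    """Fit y = intercept + slope * x by ordinary least squares."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("need two series of equal length with at least 2 points")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    if sxx == 0:
        raise ValueError("forecast values must not all be identical")
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope


# Hypothetical forecasts that overshoot the actual value by 0.5 every time.
forecasts = [2.0, 3.0, 4.0, 5.0]
actuals = [1.5, 2.5, 3.5, 4.5]

# Regress actual values on forecast values.
intercept, slope = ols_line(forecasts, actuals)
print(f"intercept = {intercept:.2f}, slope = {slope:.2f}")
```

With these invented numbers the slope is 1 and the intercept is -0.5, which is how a constant overestimate of half a percentage point shows up in the regression. With real data, judging whether such a departure is statistically significant requires a joint test on both coefficients, and that test is unreliable in samples as small as CBO's.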
A number of other methods have been developed to evaluate a forecast's efficiency. Efficiency indicates the extent to which a particular forecast could have been improved by using additional information that was at the forecaster's disposal when the forecast was made.(2) The Blue Chip consensus forecasts represent a wide variety of economic forecasters and thus reflect a broader blend of sources and methods than can be expected from any single forecaster. In this evaluation, the Blue Chip predictions can therefore serve as a proxy for an efficient forecast. The fact that CBO's forecasts are about as accurate as the Blue Chip's is a rough indication of their efficiency.
Such elaborate measures and methods, however, are not necessarily reliable indicators of a forecast's quality when the sample of observations is small, such as the 21 observations that make up the sample of CBO's two-year forecasts. Small samples present three main problems for evaluating forecasts. First, small samples reduce the reliability of statistical tests that are based on the assumption that the underlying population of errors in the forecast follows a normal distribution. The more elaborate measures of forecast quality all make such an assumption about the hypothetical ideal forecast with which the actual forecasts are being compared. Second, in small samples, individual errors in the forecast can have an unduly large influence on the measures. The mean error, for example, can fluctuate in its arithmetic sign when a single observation is added to a small sample. Third, the small sample means that CBO's track record cannot be used in a statistically reliable way to indicate either the direction or the size of future forecasting errors.
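The second problem can be seen in a short Python sketch. The error values are hypothetical and chosen only to show how one additional observation can reverse the sign of the mean error in a small sample.

```python
def mean_error(errors):
    """Arithmetic average of forecast errors (forecast minus actual)."""
    return sum(errors) / len(errors)


# Five hypothetical forecast errors, in percentage points.
errors = [0.3, -0.2, 0.4, -0.1, 0.1]
before = mean_error(errors)

# Add one large underestimate, such as might occur around a turning point.
after = mean_error(errors + [-1.5])

print(f"mean error with 5 observations: {before:+.2f}")
print(f"mean error with 6 observations: {after:+.2f}")
```

The five-observation sample suggests a slight tendency to overestimate, whereas the six-observation sample suggests a tendency to underestimate, although only one forecast was added.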
Apart from the general caveat that should attend any statistical conclusions,
several other reasons argue for viewing any evaluation of CBO's forecasts
with particular caution. First, the procedures and purposes of CBO's and
the Administration's forecasts have changed over the past 20 years and
may change again in the future. For example, in the late 1970s, CBO characterized
its long-term projections as a goal for the economy; it now considers its
projections to be what will prevail, on average, if the economy continues
to reflect historical trends. Unlike CBO's projections, the Administration's
have always included the projected economic effects of its own policy
proposals. Second, an institution's track record in forecasting may not
be indicative of its future abilities because of changes in personnel or
methods. Finally, errors in a forecast increase when the economy is more
volatile. All three groups of forecasters--CBO, the Administration, and
the Blue Chip survey--made exceptionally large errors when forecasting
for periods that included turning points in the business cycle.
Over the years, the average differences between the forecasts of CBO, the Administration, and the Blue Chip consensus have tended to be small. Recently, all three groups of forecasters underestimated economic growth and overestimated price inflation. As a result, the net effects of those misestimates on the errors in the forecasts of nominal output have been smaller than would be implied by either of the two misestimates alone.
Summary Measures of Forecast Quality
In evaluating its forecasting record, CBO considered how well it had done over both two- and five-year periods. The two-year period is of special importance. Both the Administration's and CBO's winter budget publications focus on budget projections for the fiscal year that begins the following October. An economic forecast that is accurate for the budget year itself will provide the basis for more accurate budget projections. CBO also used a five-year period in its evaluation to examine the accuracy of longer-term projections of growth in nominal and real (adjusted for inflation) output.
Overall, forecasts by CBO, the Administration, and the Blue Chip
consensus are quite similar for the two-year horizon (see Table 1).(3)
Although the margin is slight, CBO's mean absolute errors are smaller than
the Administration's for growth in nominal and real output, inflation,
and long-term interest rates. Over the five-year horizon, CBO, the Administration,
and the Blue Chip consensus all tended toward optimism in their
projections for growth of nominal and real output. CBO's projections for
real growth over the long term appear comparable in accuracy with those
of the Blue Chip survey.
Table 1. Summary Measures of CBO, Administration, and Blue Chip Forecasting Performance (In percentage points)
Measure | CBO | Administration | Blue Chip
Average Error for Two-Year Forecasts
Growth of Nominal Output
Mean error | 0.4 | 0.6 | 0.4
Mean absolute error | 1.0 | 1.1 | 1.0
Root mean square error | 1.4 | 1.5 | 1.2
Growth of Real Output
Mean error | -0.2 | 0 | -0.3
Mean absolute error | 0.8 | 1.0 | 0.8
Root mean square error | 1.1 | 1.3 | 1.0
Inflation in the Consumer Price Index
Mean error | 0.6 | 0.5 | 0.6
Mean absolute error | 0.7 | 0.8 | 0.8
Root mean square error | 0.9 | 1.0 | 1.0
Interest Rates
Three-month Treasury bills--nominal
Mean error | 0.4 | -0.1 | 0.4
Mean absolute error | 1.0 | 1.0 | 0.9
Root mean square error | 1.3 | 1.2 | 1.1
Ten-year Treasury notes--nominal
Mean error | 0.2 | -0.3 | 0.3
Mean absolute error | 0.6 | 0.9 | 0.6
Root mean square error | 0.8 | 1.1 | 0.7
Three-month Treasury bills--real
Mean error | -0.2 | -0.6 | -0.3
Mean absolute error | 0.9 | 0.8 | 0.8
Root mean square error | 1.1 | 1.1 | 1.0
Change in Wage and Salary Disbursements Plus Corporate Book Profits as a Share of Output
Mean error | 0 | 0.2 | n.a.
Mean absolute error | 1.0 | 1.0 | n.a.
Root mean square error | 1.3 | 1.2 | n.a.
Average Error for Five-Year Projections
Growth of Nominal Output
Mean error | 1.3 | 1.3 | 0.9
Mean absolute error | 1.4 | 1.3 | 0.9
Root mean square error | 1.6 | 1.6 | 1.1
Growth of Real Output
Mean error | 0.3 | 0.7 | 0.2
Mean absolute error | 0.5 | 0.9 | 0.5
Root mean square error | 0.8 | 1.1 | 0.6
SOURCES: Calculations by the Congressional Budget Office using data from CBO; Office of Management and Budget; Aspen Publishers, Inc., Blue Chip Economic Indicators; Department of Commerce, Bureau of Economic Analysis; Department of Labor, Bureau of Labor Statistics; and the Federal Reserve Board.
NOTES: The calculations include two-year forecasts made between 1982 and 1997 except for the 10-year rate on Treasury notes (which covered the 1984-1997 period) and the change in wage and salary disbursements plus corporate book profits (which covered the 1980-1997 period). For the five-year projections, calculations for growth of nominal and real output covered 1982 to 1994 and 1979 to 1994, respectively. For additional details on those calculations, see the appendix.
n.a. = not available.
In no case, however, do the differences among the three forecasts appear to be large enough to be statistically significant. The small number of forecasts available for analysis makes it difficult to distinguish meaningful differences in their quality from those that might arise randomly. Indeed, other descriptive statistics that are less sensitive to the small size of the sample tend to support the conclusion that the differences between the CBO, Administration, and Blue Chip forecasts are purely random.(4) In any case, the statistics presented here should not be construed as reliable indicators of the future quality of any of the forecasters.
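Footnote 4 names the Mann-Whitney-Wilcoxon U test as the statistic behind that conclusion. The following Python sketch shows how the U statistic itself can be computed from two samples of forecast errors, using average ranks for tied values. The function name and the error values are hypothetical, and the report does not say exactly which error series were compared, so feeding raw errors into the function is an assumption. The sketch stops at the statistic; judging significance requires published critical values or a statistics library.

```python
def mann_whitney_u(sample_a, sample_b):
    """Return (U_a, U_b) for two samples, using midranks for ties."""
    if not sample_a or not sample_b:
        raise ValueError("both samples must be non-empty")
    # Pool the observations, remembering which sample each came from.
    pooled = sorted([(v, 0) for v in sample_a] + [(v, 1) for v in sample_b])
    rank_sum_a = 0.0
    i, n = 0, len(pooled)
    while i < n:
        # Find the run of tied values starting at position i.
        j = i
        while j < n and pooled[j][0] == pooled[i][0]:
            j += 1
        midrank = (i + 1 + j) / 2  # average of the 1-based ranks i+1 .. j
        rank_sum_a += midrank * sum(1 for k in range(i, j) if pooled[k][1] == 0)
        i = j
    n_a, n_b = len(sample_a), len(sample_b)
    u_a = rank_sum_a - n_a * (n_a + 1) / 2
    u_b = n_a * n_b - u_a
    return u_a, u_b


# Hypothetical forecast errors (percentage points) for two forecasters.
errors_first = [0.4, -0.2, 1.1, 0.7]
errors_second = [0.6, -0.3, 1.0, 0.9]
u_first, u_second = mann_whitney_u(errors_first, errors_second)
print(f"U = {u_first}, {u_second}")
```

For these invented samples both U values equal 8, exactly half of the 16 possible pairings, which is the pattern expected when neither forecaster's errors tend to be larger than the other's. Because the test relies only on ranks, it does not assume that the errors follow a normal distribution.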
Forecasts Made in Early 1996 and 1997
In recent years, the economy has grown at a rate in excess of CBO's estimate of its potential even as the rate of inflation has declined. That pattern, which in some respects is the converse of the stagflation that plagued the United States in the 1970s, surprised analysts. As a result, most forecasters underestimated the rate of economic growth and overestimated inflation for the 1996-1998 period.
The forecasts of CBO, the Administration, and the Blue Chip consensus for those years were no different in that regard. In the forecasts made in early 1996 and 1997, CBO, the Administration, and the Blue Chip survey all underestimated the two-year growth in real gross domestic product (GDP) by more than 1.5 percentage points, on average. At the same time, all the forecasters overestimated rates of inflation--in forecasts made in early 1997, they overestimated two-year inflation in the consumer price index by significant magnitudes ranging from 0.8 percentage points (the Administration) to 1 percentage point (CBO and Blue Chip). By contrast, errors in the 1996 and 1997 forecasts for nominal interest rates were generally small in comparison with other historical periods.
The net result of the pessimistic forecasts of growth and inflation
in the 1996-1998 period has been an underestimate of total nominal GDP
(see Figure 1). However, because the underestimate of real growth was partly
offset by the overestimate of inflation, forecasters underestimated the
rate of two-year growth in nominal GDP by about half as much as they underestimated
growth in real output. CBO's errors in forecasting nominal GDP growth have
not been extraordinary in recent years--they have been within the central
tendency of the forecasts that make up the Blue Chip consensus (see
Figure 2). That suggests that most private-sector forecasters interpreted
the economic data in 1996 and 1997 in the same way that CBO forecasters
did. Indeed, all the forecasters in the Blue Chip survey underestimated
two-year GDP growth in 1996, and all but one (the Conference Board) underestimated
growth in 1997.
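The offsetting arithmetic can be sketched in Python. The sketch assumes that nominal growth is the compound of real growth and growth in the price level, and it uses hypothetical growth rates chosen only to resemble the pattern described above; they are not CBO's figures. The relevant price measure for output is the GDP price index rather than the consumer price index, so the inflation errors cited earlier would not carry over exactly.

```python
def nominal_growth(real_growth_pct, price_growth_pct):
    """Compound real growth and price growth (both in percent) into nominal growth."""
    return ((1 + real_growth_pct / 100) * (1 + price_growth_pct / 100) - 1) * 100


# Hypothetical two-year average rates, in percent.
forecast_real, forecast_price = 2.0, 2.8
actual_real, actual_price = 3.6, 2.0

# Errors are defined as the predicted minus the actual rate.
real_error = forecast_real - actual_real       # underestimate of real growth
price_error = forecast_price - actual_price    # overestimate of inflation
nominal_error = (nominal_growth(forecast_real, forecast_price)
                 - nominal_growth(actual_real, actual_price))

print(f"real growth error:    {real_error:+.1f}")
print(f"price inflation error: {price_error:+.1f}")
print(f"nominal growth error: {nominal_error:+.1f}")
```

In this example the forecast misses real growth by 1.6 percentage points, but because it overstates inflation by 0.8 percentage points, the miss on nominal growth is only about 0.8 percentage points, roughly half as large.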
Figure 1. Actual and Forecast Two-Year Average Rates of Growth for Nominal Output
SOURCES: Congressional Budget Office; Department of Commerce, Bureau of Economic Analysis.
Figure 2. Distribution of Errors by Blue Chip Forecasters in Forecasting Two-Year Average Growth of GDP
SOURCES: Congressional Budget Office; Aspen Publishers, Inc., Blue Chip Economic Indicators; Department of Commerce, Bureau of Economic Analysis.
NOTES: The forecast error is defined as the predicted minus the actual rate of growth.
The survey included 50 forecasts for 1996 to 1997 and 40 forecasts for 1997 to 1998.
Although the forecast errors for nominal GDP in 1996 and 1997 were not
particularly large, CBO's forecasts in those years were much too pessimistic
about the share of GDP that represented wages and salaries and corporate
book profits. That share is particularly important for budget projections
because those two income components form the basis of forecasts of revenues.
In recent years, the share exhibited unusual movements--after generally
declining for four decades and then apparently stabilizing in the late
1980s, wages and profits began to rise rapidly as a share of GDP in 1996
and have risen every year since then (see Figure 3). CBO underestimated
the two-year change in that share by just over 1.5 percentage points, on
average, while the Administration's underestimates averaged just under
1 percentage point. Those underestimates contributed to an underestimate
of combined federal revenues for a given level of total GDP.(5)
Figure 3. Wage and Salary Disbursements Plus Corporate Book Profits
SOURCES: Congressional Budget Office; Department of Commerce, Bureau of Economic Analysis.
Part of CBO's underestimate resulted from misestimates of corporate
profits. Traditionally, corporate profits have been one of the least predictable
components of national income, and recent experience has been no exception.
CBO slightly overestimated the two-year growth in corporate profits in
its 1996 forecast and then underestimated that growth in its 1997 forecast.
In neither case, however, did CBO's forecasts depart significantly from
the central tendency of the Blue Chip forecasters (see Figure 4).
Figure 4. Distribution of Errors by Blue Chip Forecasters in Forecasting Two-Year Average Growth in Corporate Profits
SOURCES: Congressional Budget Office; Aspen Publishers, Inc., Blue Chip Economic Indicators; Department of Commerce, Bureau of Economic Analysis.
NOTES: Corporate profits equal the book value of corporate profits with adjustments for inventory and capital consumption.
The forecast error is defined as the predicted minus the actual rate of growth.
The survey included 47 forecasts for 1996 to 1997 and 37 forecasts for 1997 to 1998.
Perhaps more fundamental to understanding the recent errors in forecasting
taxable income is the degree to which total income has exceeded total product
in the national income and product accounts (NIPAs). In principle, those
two aggregate measures of economic activity should be equal, but in practice
they are not, largely because the Bureau of Economic Analysis, which publishes
the NIPAs, must use different primary sources to estimate total income,
on the one hand, and total product, on the other. The statistical discrepancy
in the NIPAs measures the difference between total product and total income;
in recent years, the excess of total income over total product has widened
and gives no indication of narrowing (see Figure 5).
Figure 5. Statistical Discrepancy
SOURCES: Congressional Budget Office; Department of Commerce, Bureau of Economic Analysis.
NOTE: The statistical discrepancy is defined in the national income and product accounts as the difference between total product and total income.
The widening of the discrepancy presents a problem for forecasters who must make assumptions about the future course of the discrepancy. If a forecaster has assumed, in line with historical experience, first, that the discrepancy will revert toward zero and second, that most of the discrepancy is due to mismeasurements on the income side, then that forecaster will have been more apt to understate income in recent years. At this point, it is impossible to tell exactly how much the discrepancy has caused forecasters to err in their income-side forecasts, but the sheer size of the imbalance in recent years compounds the importance of each forecaster's assumptions about how to forecast the discrepancy. Forecasters' use of alternative and mutually exclusive assumptions for resolving that imbalance--each assumption as reasonable as the next--could broaden the dispersion of forecasts of total income in coming years.
An additional source of difficulty in forecasting taxable income as
a share of GDP has been the reversal in another long-standing trend, that
of nonwage labor income (employer-paid insurance premiums, pension contributions,
and other fringe benefits). Although total labor compensation (nonwage
income plus wages and salaries) has remained relatively stable as a share
of nominal output in recent years, nonwage labor income has begun to decline
as a share of total compensation. The decline began in 1995; its effect
has been to increase the share of compensation that is taxed at a higher
rate, that is, wages and salaries (see Figure 6). Once again, that turnaround
was relatively unpredictable and as yet is imperfectly understood.
Figure 6. Wage and Salary Disbursements
SOURCES: Congressional Budget Office; Department of Commerce, Bureau of Economic Analysis.
NOTE: In the national income and product accounts, total labor compensation equals wage and salary disbursements plus nonwage labor income.
1. Another approach to testing a forecast for bias is based on linear regression analysis of actual and forecast values. For details of that method, see J. Mincer and V. Zarnowitz, "The Evaluation of Economic Forecasts," in J. Mincer, ed., Economic Forecasts and Expectations (New York: National Bureau of Economic Research, 1969). That approach is not used here because of the small size of the sample. However, previous studies that have used it to evaluate the short-term forecasts of CBO and the Administration have not been able to reject the hypothesis that those forecasts are unbiased. See, for example, M.T. Belongia, "Are Economic Forecasts by Government Agencies Biased? Accurate?" Review, Federal Reserve Bank of St. Louis, vol. 70, no. 6 (November/December 1988), pp. 15-23. For a more recent and more elaborate study of forecast bias that included CBO's forecasts among a sizable sample, see David Laster, Paul Bennett, and In Sun Geoum, Rational Bias in Macroeconomic Forecasts, Staff Report No. 21 (New York: Federal Reserve Bank of New York, March 1997).
2. For studies that have examined the relative efficiency of CBO's economic forecasts, see Belongia, "Are Economic Forecasts by Government Agencies Biased?"; and S.M. Miller, "Forecasting Federal Budget Deficits: How Reliable Are U.S. Congressional Budget Office Projections?" Applied Economics, vol. 23 (December 1991), pp. 1789-1799. Although both studies identify series that might have been used to make CBO's forecasts more accurate, they rely on statistics that assume a larger sample than is available. Moreover, although statistical tests can identify sources of inefficiency in a forecast after the fact, they generally do not indicate how such information can be used to improve forecasts when they are being made.
3. More detailed information is presented in Tables A-1 through A-9 in the appendix.
4. Pairwise comparisons of the forecast errors using the Mann-Whitney-Wilcoxon U test supported that conclusion in all instances.
5. The overall effect of errors in economic assumptions on recent estimates of federal revenues has been relatively small. See Congressional Budget Office, The Economic and Budget Outlook: Fiscal Years 2000-2009 (January 1999), pp. 50-51.