1. David Neumark and William Wascher published a study in 1992 [1] of the effect of minimum wages on teenage employment using a U.S. state panel. The paper used annual observations for 1977-1989 and included all 50 states plus the District of Columbia. The estimated equation is of the following type:

E_it = β0 + β1 (M_it/W_it) + γ2 D2_i + ... + γ51 D51_i + δ2 B2_t + ... + δ13 B13_t + u_it

where E is the employment-to-population ratio of teenagers, M is the nominal minimum wage, and W is the average wage in the state. In addition, other explanatory variables, such as the prime-age male unemployment rate and the teenage population share, were included. D2-D51 are the state dummies; B2-B13 are the time dummies.
(a) Briefly discuss the advantage of using panel data rather than pure cross-sections or time series in this situation.
(b) Estimating the model by OLS, but including only the time dummy variables, results in the following output:

E_it = β0 - 0.33 (M_it/W_it) + 0.35 SHY_it - 1.53 uram_it,   R² = 0.20
              (0.08)           (0.28)       (0.13)

where SHY is the proportion of teenagers in the population, and uram is the prime-age male unemployment rate. Coefficients for the time fixed effects are not reported. Numbers in parentheses are homoskedasticity-only standard errors.

Comment on the above results. Are the coefficients statistically significant? Since these are level regressions, how would you calculate elasticities?
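Because the regression is in levels, an elasticity must be evaluated at chosen values of the variables, typically the sample means. A minimal sketch of the calculation; the two sample means below are hypothetical placeholders, since the problem does not report them:

```python
# Elasticity from a level-level regression, evaluated at sample means:
# elasticity = beta * (mean of X / mean of Y).
# beta is the reported -0.33 on M/W; the means are made-up numbers,
# purely to illustrate the arithmetic.

beta = -0.33
mean_ratio = 0.45   # hypothetical sample mean of M/W
mean_emp = 0.55     # hypothetical mean employment-to-population ratio

elasticity = beta * mean_ratio / mean_emp
print(round(elasticity, 2))   # -0.27
```

The same formula applies to each regressor; only the means change.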
(c) Adding state fixed effects changes the above equation as follows:

E_it = β0 + 0.07 (M_it/W_it) - 0.19 SHY_it - 0.54 uram_it,   R² = 0.69
              (0.10)          (0.22)        (0.11)

Compare the two results. Why would including state dummy variables change the coefficients this way?
(d) The significance of each coefficient decreased, yet R² increased. How is that possible? What does this result tell you about testing the hypothesis that all state fixed effects can be restricted to the same coefficient? How would you test such a hypothesis?
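One way to carry out the test in part (d) is an F-test comparing the restricted R² (no state dummies, part (b)) with the unrestricted R² (part (c)). A hedged sketch: the degrees of freedom below assume 51 units over 13 years and 66 estimated parameters in the unrestricted model (intercept, three slopes, 50 state dummies, 12 time dummies), which the problem does not state explicitly:

```python
# F-test for the joint significance of the 50 state fixed effects,
# built only from the two reported R-squared values.
# Assumed (hypothetical) dimensions: n = 51 states x 13 years = 663,
# with 66 parameters in the unrestricted model.

def fe_f_stat(r2_r, r2_ur, q, df_ur):
    """F = [(R2_ur - R2_r)/q] / [(1 - R2_ur)/df_ur]."""
    return ((r2_ur - r2_r) / q) / ((1.0 - r2_ur) / df_ur)

n = 51 * 13          # 663 state-year observations
k_ur = 66            # parameters spent by the unrestricted model
F = fe_f_stat(r2_r=0.20, r2_ur=0.69, q=50, df_ur=n - k_ur)
print(round(F, 2))   # 18.87: far above any conventional critical value
```

An F statistic near 19 with (50, 597) degrees of freedom rejects the restriction that all state intercepts are equal.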
[1] Neumark, D., & Wascher, W. (1992). Employment effects of minimum and subminimum wages: Panel data on state minimum wage laws. ILR Review, 46(1), 55-81.
2. This problem uses the dataset panel_hw.dta, which examines voter turnout in 49 US states (Louisiana is omitted because of an unusual election in 1982) plus the District of Columbia over 11 elections (50 units over 11 time periods). You regressed turnout as a percentage of the voting-age population on the number of days before the general election by which an individual needs to register, state per capita income, a dummy variable for midterm elections, and dummy variables for the West North Central, Southern, and Border states. The output is given below.
. regress vaprate gsp midterm regdead WNCentral South Border

      Source |       SS       df       MS              Number of obs =     550
-------------+------------------------------           F(  6,   543) =  182.08
       Model |  38318.2811     6  6386.38019           Prob > F      =  0.0000
    Residual |  19045.9213   543  35.0753616           R-squared     =  0.6680
-------------+------------------------------           Adj R-squared =  0.6643
       Total |  57364.2025   549  104.488529           Root MSE      =  5.9224

------------------------------------------------------------------------------
     vaprate |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
         gsp |  -.0382637   .0178458    -2.14   0.032     -.073319    -.0032085
     midterm |  -13.37857   .5072368   -26.38   0.000    -14.37495    -12.38218
     regdead |  -.2171977   .0333501    -6.51   0.000     -.2827088   -.1516867
   WNCentral |   3.736791    .799348     4.67   0.000      2.166598    5.306984
       South |  -7.969966   .7031458   -11.33   0.000     -9.351186   -6.588747
      Border |  -7.035122   .8146468    -8.64   0.000     -8.635368   -5.434877
       _cons |   61.87341    1.02298    60.48   0.000      59.86393    63.88289
------------------------------------------------------------------------------

. test WNCentral South Border

 ( 1)  WNCentral = 0
 ( 2)  South = 0
 ( 3)  Border = 0

       F(  3,   543) =   73.05
            Prob > F =  0.0000
a. Which coefficients are significant? Are there any regional effects for these regions? Use an F-test to determine this.
b. Part (a) assumed that pooling the data was valid. Instead, estimate the model with a fixed-effects regression. Which variables are omitted from the estimation? Why? The output is given below.
. iis stcode
. tis year
. xtreg vaprate midterm gsp regdead WNCentral South Border, fe

Fixed-effects (within) regression               Number of obs      =       550
Group variable: stcode                          Number of groups   =        50

R-sq:  within  = 0.7363                         Obs per group: min =        11
       between = 0.0096                                        avg =      11.0
       overall = 0.4275                                        max =        11

                                                F(2,498)           =    695.15
corr(u_i, Xb)  = 0.0023                         Prob > F           =    0.0000

------------------------------------------------------------------------------
     vaprate |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
     midterm |  -13.37218   .3586782   -37.28   0.000    -14.07689   -12.66747
         gsp |  -.0246353   .0171186    -1.44   0.151     -.058269     .0089983
     regdead |  (dropped)
   WNCentral |  (dropped)
       South |  (dropped)
      Border |  (dropped)
       _cons |   54.67635   .5900227    92.67   0.000     53.51711    55.83559
-------------+----------------------------------------------------------------
     sigma_u |  6.6883066
     sigma_e |   4.187412
         rho |  .71840339   (fraction of variance due to u_i)
------------------------------------------------------------------------------
F test that all u_i=0:     F(49, 498) =    12.00            Prob > F = 0.0000

. estimates store fe
c. Now estimate a random-effects model. Why do those results differ from the fixed-effects results? Is there evidence of unobserved heterogeneity? State your testing hypothesis and decide on it. Are some variables omitted from the estimation? Why or why not?
. xtreg vaprate midterm gsp regdead WNCentral South Border, re

Random-effects GLS regression                   Number of obs      =       550
Group variable: stcode                          Number of groups   =        50

R-sq:  within  = 0.7363                         Obs per group: min =        11
       between = 0.5741                                        avg =      11.0
       overall = 0.6677                                        max =        11

Random effects u_i ~ Gaussian                   Wald chi2(6)       =   1452.03
corr(u_i, X)   = 0 (assumed)                    Prob > chi2        =    0.0000

------------------------------------------------------------------------------
     vaprate |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
     midterm |  -13.37301    .358402   -37.31   0.000    -14.07547   -12.67056
         gsp |  -.0264047   .0165908    -1.59   0.111     -.0589221    .0061126
     regdead |  -.2183449   .0859471    -2.54   0.011     -.3867981   -.0498918
   WNCentral |   3.772746   2.058301     1.83   0.067     -.2614497    7.806942
       South |  -7.904291   1.798542    -4.39   0.000     -11.42937   -4.379214
      Border |   -7.07554   2.096786    -3.37   0.001     -11.18516   -2.965916
       _cons |   61.51503   2.225863    27.64   0.000      57.15242    65.87764
-------------+----------------------------------------------------------------
     sigma_u |  4.4345407
     sigma_e |   4.187412
         rho |   .5286392   (fraction of variance due to u_i)
------------------------------------------------------------------------------

. estimates store re
. xttest0

Breusch and Pagan Lagrangian multiplier test for random effects

        vaprate[stcode,t] = Xb + u[stcode] + e[stcode,t]

        Estimated results:
                         |       Var     sd = sqrt(Var)
                ---------+-----------------------------
                 vaprate |   104.4885       10.22196
                       e |   17.53442       4.187412
                       u |   19.66515       4.434541

        Test:   Var(u) = 0
                             chi2(1) =   673.91
                         Prob > chi2 =   0.0000
d. Which model is more appropriate, FE or RE? What is your underlying testing hypothesis? What implication does the null hypothesis have? Discuss the trade-offs between using pooled OLS, fixed effects, and random effects for this model.
. hausman fe re

                  ---- Coefficients ----
             |      (b)          (B)          (b-B)     sqrt(diag(V_b-V_B))
             |       fe           re        Difference         S.E.
-------------+----------------------------------------------------------------
     midterm |   -13.37218    -13.37301       .000829       .0140729
         gsp |   -.0246353    -.0264047      .0017694       .0042182
------------------------------------------------------------------------------
                          b = consistent under Ho and Ha; obtained from xtreg
           B = inconsistent under Ha, efficient under Ho; obtained from xtreg

    Test:  Ho:  difference in coefficients not systematic

                 chi2(2) = (b-B)'[(V_b-V_B)^(-1)](b-B)
                         =     0.18
               Prob>chi2 =     0.9158
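The chi-squared statistic in the printout can be reproduced from the reported columns alone. This sketch uses the diagonal approximation chi2 = Σ (b-B)² / Var(b-B), ignoring any off-diagonal terms of V_b - V_B; for these two coefficients it happens to reproduce Stata's 0.18:

```python
# Hausman statistic recomputed from the reported hausman output,
# using the diagonal approximation: sum of squared (difference / S.E.).
# diffs and ses are the (b-B) and sqrt(diag(V_b-V_B)) columns above.

def hausman_diag(diffs, ses):
    return sum((d / s) ** 2 for d, s in zip(diffs, ses))

chi2 = hausman_diag(diffs=[0.000829, 0.0017694],
                    ses=[0.0140729, 0.0042182])
print(round(chi2, 2))   # 0.18, matching the printed chi2(2)
```

With a p-value of 0.92 the null of no systematic difference is not rejected, so the efficient random-effects estimator is defensible here.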
Objectives for Chapters 10 and 11
• There are potential pitfalls inherent in using regression with time series data that are not present in cross-sectional applications.
• Trends, seasonality, and high persistence are ubiquitous in time series data.
• Time series applications are different from cross-sectional ones.
• Static and finite distributed lag models have a shot at satisfying the Gauss-Markov assumptions, so we learn about these models.
• Chapter 11 shows that only contemporaneous exogeneity is needed for consistency.
• We learn how to compute goodness-of-fit measures with a trending or seasonal dependent variable.
• While detrending or deseasonalizing y is hardly perfect (and does not work with integrated processes), it is better than simply reporting the very high R-squareds that often come with time series regressions involving trending variables.
• In Chapter 11, weakly dependent and integrated processes are covered using the MA(1) and the stable AR(1). The MA(1) is often hard to describe intuitively, so it can be hard to fully master.
• We cover the random walk because it is a good example of a unit root (highly persistent) process.
• The issue comes down to whether or not to first-difference the data before specifying the linear model, which can be rather debatable.
Basic Regression Analysis with Time Series Data

Time series data are data collected on the same observational unit at multiple time periods:
• Aggregate consumption and GDP for a country (for example, 20 years of quarterly observations = 80 observations)
• Yen/$, pound/$, and euro/$ exchange rates (daily data for 1 year = 365 observations)
• Cigarette consumption per capita for a state
• The inflation rate of the United States (for example, 20 years of monthly observations = 240 observations)
Why use time series data?
• To develop forecasting models: what will the rate of inflation be next year?
• To estimate dynamic causal effects: if the Fed increases the federal funds rate now, what will be the effect on the rates of inflation and unemployment in 3 months? In 12 months? What is the effect over time on cigarette consumption of a hike in the cigarette tax?
• Plus, sometimes you do not have any choice: rates of inflation and unemployment in the US can be observed only over time.
What are the applications of time series? Everywhere that data are observed in a time-ordered fashion. For example:
• Economics: daily stock market quotations or monthly unemployment rates.
• Social sciences: population series, such as birth rates or school enrollments.
• Epidemiology: the number of COVID-19 cases observed over some time period.
• Medicine: blood pressure measurements traced over time for evaluating drugs.
• Global warming?
• Example: US inflation, 1960-1999. Here there are only two time series; there may be many more variables whose paths over time are observed simultaneously.
• Time series analysis focuses on modeling the dependency of a variable on its own past, and on the present and past values of other variables.
Time series data raise new technical issues:
• Time lags
• Correlation over time (serial correlation or autocorrelation)
• Forecasting models that have no causal interpretation (specialized tools for forecasting): autoregressive (AR) models and autoregressive distributed lag (ADL) models
• Conditions under which dynamic effects can be estimated, and how to estimate them
• Calculation of standard errors when the errors are serially correlated, a problem that does not arise in cross-sectional regression
Forecasting and estimation of causal effects are quite different objectives. For forecasting:
• Adjusted R-squared matters (a lot!)
• Omitted variable bias is not a problem!
• We will not worry about interpreting coefficients in time series forecasting models the way we do in cross-sectional models
• External validity is paramount: the model estimated using historical data must hold into the (near) future
We will transform time series variables using lags, first differences, logarithms, and growth rates.
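The four standard transformations can be sketched with plain Python lists. The first two values below are the 1999 CPI figures used in the example that follows; the last two are made-up continuations of the series:

```python
# Lags, first differences, logs, and growth rates of a short series.
import math

y = [164.87, 166.03, 167.20, 168.50]   # CPI-like series; last two invented

lag1 = [None] + y[:-1]                                   # y_{t-1}
diff = [None] + [y[t] - y[t-1] for t in range(1, len(y))]      # change
logs = [math.log(v) for v in y]                          # log(y_t)
growth = [None] + [100 * (y[t] - y[t-1]) / y[t-1]        # % growth
                   for t in range(1, len(y))]

print(round(growth[1], 3))   # 0.704: percent change, 1999:I to 1999:II
```

The `None` entries mark periods where the transformation is undefined, mirroring the missing values statistical packages produce.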
Example: quarterly rate of inflation at an annual rate
• CPI in the first quarter of 1999 (1999:I) = 164.87
• CPI in the second quarter of 1999 (1999:II) = 166.03
• Percentage change in CPI, 1999:I to 1999:II:
  100 x (166.03 - 164.87)/164.87 = 100 x (1.16/164.87) = 0.703%
• Examples of time series regression models
• Static models: in static time series models, the current value of one variable is modeled as the result of the current values of the explanatory variables, for example y_t = β0 + β1 z_t + u_t.
• Finite distributed lag models: the explanatory variables are allowed to influence the dependent variable with a time lag, for example y_t = α0 + δ0 z_t + δ1 z_{t-1} + δ2 z_{t-2} + u_t.
• Example of a finite distributed lag model: the fertility rate may depend on the tax value of a child, but for biological and behavioral reasons the effect may have a lag.
• Interpretation of the effects in finite distributed lag models: the coefficient on the h-period lag measures the effect of a shock h periods ago on the current value of the dependent variable.
• Graphical illustration of lagged effects: in the example, the effect is biggest after a lag of one period; after that, the effect vanishes (if the initial shock was transitory).
• The long-run effect of a permanent shock is the cumulated effect of all relevant lagged effects. It does not vanish (if the initial shock is a permanent one).
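The impact versus long-run arithmetic described above can be made concrete. The lag coefficients below are invented purely for illustration:

```python
# Lagged-effect arithmetic for a finite distributed lag model
# y_t = a0 + d0*z_t + d1*z_{t-1} + d2*z_{t-2}.
# The impact propensity is d0; the long-run propensity (the effect
# of a permanent unit shock) is d0 + d1 + d2.

d = [0.5, 1.2, 0.3]          # made-up d0, d1, d2: effect peaks at lag 1

impact_effect = d[0]         # immediate effect of a unit shock
long_run_effect = sum(d)     # cumulated effect of a permanent shock

print(impact_effect)     # 0.5
print(long_run_effect)   # 2.0
```

A transitory shock's effect at horizon h is just d[h], and is zero for h beyond the last lag, which is exactly the "effect vanishes" statement above.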
• Finite sample properties of OLS under classical assumptions
• Assumption TS.1 (Linear in parameters): the process follows the linear model y_t = β0 + β1 x_t1 + ... + βk x_tk + u_t.
• Assumption TS.2 (No perfect collinearity): "In the sample (and therefore in the underlying time series process), no independent variable is constant nor a perfect linear combination of the others."
• Assumption TS.3 (Zero conditional mean): the errors have zero mean conditional on the explanatory variables in all time periods, E(u_t | X) = 0.
• Discussion of assumption TS.3. Exogeneity in time series regression:
• Exogeneity (past and present): X is exogenous if E(u_t | X_t, X_{t-1}, X_{t-2}, ...) = 0.
• Strict exogeneity (past, present, and future): X is strictly exogenous if E(u_t | ..., X_{t+1}, X_t, X_{t-1}, ...) = 0. This is a hard assumption to make.
Notes:
• Strict exogeneity implies exogeneity.
• For now we suppose that X is exogenous.
• If X is exogenous, then OLS estimates the dynamic causal effect on Y of a change in X.
• Theorem 10.1 (Unbiasedness of OLS): under assumptions TS.1-TS.3, the OLS estimators are unbiased.
• Assumption TS.4 (Homoskedasticity): a sufficient condition is that the volatility of the error is independent of the explanatory variables and constant over time.
• In the time series context, homoskedasticity may easily be violated, e.g. if the volatility of the dependent variable depends on regime changes.
• Assumption TS.5 (No serial correlation): conditional on X, the errors in two different time periods are uncorrelated, Corr(u_t, u_s | X) = 0 for t ≠ s.
• Discussion of assumption TS.5:
• Why was such an assumption not made in the cross-sectional case? Under random sampling, errors are automatically uncorrelated across observations.
• The assumption may easily be violated if, conditional on knowing the values of the independent variables, omitted factors are correlated over time.
• The assumption may also serve as a substitute for the random sampling assumption if sampling a cross-section is not done completely randomly. In this case, given the values of the explanatory variables, errors have to be uncorrelated across cross-sectional units (e.g. states).
• Theorem 10.2 (OLS sampling variances): under assumptions TS.1-TS.5, the usual cross-sectional formulas for the variances of the OLS estimators remain valid.
• Theorem 10.3 (Unbiased estimation of the error variance): under the same assumptions, the usual estimator SSR/(n - k - 1) is unbiased for the error variance.
• Theorem 10.4 (Gauss-Markov theorem): under assumptions TS.1-TS.5, the OLS estimators have the minimal variance of all linear unbiased estimators of the regression coefficients. This holds conditional as well as unconditional on the regressors.
• Assumption TS.6 (Normality): the errors are independent of X and i.i.d. normal.
• Theorem 10.5 (Normal sampling distributions): under assumptions TS.1-TS.6, the OLS estimators have the usual normal distribution (conditional on X). The usual t and F tests are valid.
• Example: static Phillips curve, inf_t = β0 + β1 unem_t + u_t.
• Discussion of CLM assumptions:
• TS.1: The error term contains factors such as monetary shocks, income/demand shocks, oil price shocks, supply shocks, or exchange rate shocks.
• TS.2: A linear relationship might be restrictive, but it should be a good approximation. Perfect collinearity is not a problem as long as unemployment varies over time.
• The "Phillips curve" says that if unemployment is above its equilibrium, or "natural," rate, then the rate of inflation will increase.
• The coefficient on unemployment should therefore be negative.
• The rate of unemployment at which inflation neither increases nor decreases is often called the "non-accelerating inflation rate of unemployment": the NAIRU.
• Is this relation found in US economic data? (This makes a nice homework exercise.)
• Can this relation be exploited for forecasting inflation?
• Discussion of CLM assumptions (cont.)
• Example: effects of inflation and deficits on interest rates.
• Discussion of CLM assumptions:
• TS.1: The error term represents other factors that determine interest rates in general, e.g. business cycle effects.
• TS.2: A linear relationship might be restrictive, but it should be a good approximation. Perfect collinearity will seldom be a problem in practice.
• Discussion of CLM assumptions (cont.)
• Using dummy explanatory variables in time series.
• Interpretation: during World War II, the fertility rate was temporarily lower. It has been permanently lower since the introduction of the pill in 1963.
• Time series with trends: it is best to draw a DDD when you have time series data.
• Example of a time series with a linear upward trend.
• Modelling a linear time trend: y_t = α0 + α1 t + e_t.
• Modelling an exponential time trend: log(y_t) = β0 + β1 t + e_t.
• Example of a time series with an exponential trend, shown as a DDD: abstracting from random deviations, the time series has a constant growth rate.
• Using trending variables in regression analysis:
• If trending variables are regressed on each other, a spurious relationship may arise when the variables are driven by a common trend. In this case, it is important to include a trend in the regression.
• Example: housing investment and prices.
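The spurious-regression danger is easy to demonstrate by simulation. A sketch on made-up data: two series generated independently but sharing an upward trend look strongly related until the trend is removed:

```python
# Spurious regression sketch: x and y are INDEPENDENT apart from a
# shared deterministic trend.  Bivariate OLS in pure Python.
import random

random.seed(0)
T = 200
t = list(range(T))
y = [0.5 * s + random.gauss(0, 1) for s in t]   # trend + noise
x = [0.3 * s + random.gauss(0, 1) for s in t]   # unrelated to y's noise

def slope(u, v):                 # OLS slope of v on u
    mu, mv = sum(u) / len(u), sum(v) / len(v)
    return (sum((a - mu) * (b - mv) for a, b in zip(u, v)) /
            sum((a - mu) ** 2 for a in u))

def detrend(z):                  # residuals from regressing z on t
    b = slope(t, z)
    mt, mz = sum(t) / len(t), sum(z) / len(z)
    return [v - (mz + b * (s - mt)) for v, s in zip(z, t)]

b_trend = slope(x, y)
b_clean = slope(detrend(x), detrend(y))
print(round(b_trend, 2))   # large, despite x and y being unrelated
print(round(b_clean, 2))   # near zero once the common trend is removed
```

The coefficient with the trend left in is close to 0.5/0.3, the ratio of the two trend slopes, i.e. pure trend arithmetic rather than any causal link.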
• Example: housing investment and prices (cont.)
• When should a trend be included?
• If the dependent variable displays obvious trending behavior
• If both the dependent and some independent variables have trends
• If only some of the independent variables have trends; their effect on the dependent variable may only be visible after the trend has been subtracted
• A detrending interpretation of regressions with a time trend: it turns out that the OLS coefficients in a regression including a trend are the same as the coefficients in a regression without a trend but where all the variables have been detrended before the regression. This follows from the general interpretation of multiple regressions (partialling out).
• Computing R-squared when the dependent variable is trending: due to the trend, the variance of the dependent variable will be overstated. It is better to first detrend the dependent variable and then run the regression on all the independent variables (plus a trend if they are trending as well). The R-squared of this regression is a more adequate measure of fit.
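The detrending claim above (a Frisch-Waugh result) can be verified numerically. A sketch on made-up data: the coefficient on x in a regression of y on (x, t) equals the slope from regressing detrended y on detrended x:

```python
# Frisch-Waugh check of the detrending interpretation, pure Python.
import random

random.seed(1)
T = 100
t = list(range(T))
x = [0.2 * s + random.gauss(0, 1) for s in t]
y = [1.5 * x[s] + 0.4 * s + random.gauss(0, 1) for s in range(T)]

def demean(v):
    m = sum(v) / len(v)
    return [a - m for a in v]

def slope(u, v):                      # bivariate OLS slope of v on u
    ud, vd = demean(u), demean(v)
    return sum(a * b for a, b in zip(ud, vd)) / sum(a * a for a in ud)

def resid(v, u):                      # v minus its linear fit on u
    b = slope(u, v)
    mu, mv = sum(u) / len(u), sum(v) / len(v)
    return [bv - (mv + b * (bu - mu)) for bu, bv in zip(u, v)]

# coefficient on x in the multiple regression of y on (x, t),
# solved via the 2x2 normal equations in demeaned variables
xd, td, yd = demean(x), demean(t), demean(y)
Sxx = sum(a * a for a in xd); Stt = sum(a * a for a in td)
Sxt = sum(a * b for a, b in zip(xd, td))
Sxy = sum(a * b for a, b in zip(xd, yd))
Sty = sum(a * b for a, b in zip(td, yd))
b_multiple = (Sxy * Stt - Sty * Sxt) / (Sxx * Stt - Sxt ** 2)

# same coefficient via detrending both variables first
b_detrended = slope(resid(x, t), resid(y, t))

print(abs(b_multiple - b_detrended) < 1e-6)   # True: the two agree
```

Both routes recover (approximately) the true coefficient of 1.5 used to generate the data.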
Further Issues in Using OLS with Time Series Data

• The main point of Chapter 11: examine the circumstances that permit us to use OLS, that is, the situations in which the Gauss-Markov assumptions for our time series data would be reasonable.
• The assumptions used so far seem too restrictive: strict exogeneity, homoskedasticity, and no serial correlation are very demanding requirements, especially for time series, and statistical inference rests on the validity of the normality assumption.
• Much weaker assumptions are needed if the sample size is large. A key requirement for large-sample analysis of time series is that the series in question are stationary and weakly dependent.
• Stationary time series: loosely speaking, a time series is stationary if its stochastic properties and its temporal dependence structure do not change over time. Stationarity says that the past is like the present and the future, at least in a probabilistic sense.
• In general it is difficult to tell whether a process is stationary given the definition above, but we do know that seasonal and trending data are not stationary.
Stationary stochastic processes: a stochastic process {x_t : t = 1, 2, ...} is stationary if for every collection of indices 1 ≤ t1 < t2 < ... < tm, the joint distribution of (x_{t1}, x_{t2}, ..., x_{tm}) is the same as that of (x_{t1+h}, x_{t2+h}, ..., x_{tm+h}) for all integers h ≥ 1.

The point of a stationary stochastic process: to analyze a statistical time series, it must be assumed that the structure of the stochastic process which generates the observations is essentially invariant through time.
• Weakly dependent time series: weak dependence relates to how strongly x_t and x_{t+h} are related to each other. The assumption of weak dependence constrains the degree of dependence: roughly, x_t and x_{t+h} must become "almost independent" as h grows.
• Discussion of the weak dependence property:
• An implication of weak dependence is that the correlation between x_t and x_{t+h} must converge to zero as h grows to infinity.
• For the law of large numbers and the central limit theorem to hold, the individual observations must not be too strongly related to each other; in particular, their relation must become weaker (and fast enough) the farther apart they are.
• Note that a series may be nonstationary but weakly dependent.
Nonetheless, sometimes we can rely on a weaker form of stationarity called covariance stationarity. A stochastic process is covariance stationary if (i) E(x_t) is constant, (ii) Var(x_t) is constant, and (iii) Cov(x_t, x_{t+h}) depends only on h, not on t.
Examples of weakly dependent time series:
• Moving average process of order one [MA(1)]: x_t = e_t + α1 e_{t-1} for t = 1, 2, ..., where e_t is an i.i.d. sequence with zero mean and variance σ_e².
Note that we can in general describe a moving average process for any variable, be it the dependent or the independent variable, or even the errors. We will use the above form for the moment, i.e. for the independent variable. When a random variable follows the above process, we say that "x_t follows a moving average process of order one." The process is a weighted average of e_t and e_{t-1}. It is weaklyly dependent because x_t and x_{t+1} are correlated (both contain e_t), but x_t and x_{t+h} for h ≥ 2 share no common shocks and are uncorrelated.
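A short simulation confirms the weak dependence of the MA(1) just described: the autocorrelation is α1/(1 + α1²) at lag 1 and essentially zero from lag 2 on. The parameter value α1 = 0.5 is made up:

```python
# Simulating the MA(1) x_t = e_t + a*e_{t-1} and checking its
# autocorrelations: theory gives a/(1+a^2) at lag 1 and 0 at lag 2.
import random

random.seed(42)
a, n = 0.5, 50_000
e = [random.gauss(0, 1) for _ in range(n + 1)]
x = [e[t] + a * e[t - 1] for t in range(1, n + 1)]

def autocorr(z, h):
    m = sum(z) / len(z)
    num = sum((z[t] - m) * (z[t + h] - m) for t in range(len(z) - h))
    den = sum((v - m) ** 2 for v in z)
    return num / den

r1 = autocorr(x, 1)   # should be close to 0.4 = 0.5/(1 + 0.25)
r2 = autocorr(x, 2)   # should be close to 0
print(r1, r2)
```

The lag-2 correlation dying out is exactly the "correlation converges to zero" requirement, and here it happens after a single lag.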
Because the MA(1) is weakly dependent, the law of large numbers and the central limit theorem still apply.
• Autoregressive process of order one [AR(1)]: y_t = ρ y_{t-1} + e_t, where e_t is an i.i.d. zero-mean sequence. The process is weakly dependent when it is stable, i.e. |ρ| < 1, in which case Corr(y_t, y_{t+h}) = ρ^h, which converges to zero as h grows.
As long as the large-sample assumptions (TS.1' through TS.5') hold, the OLS estimators are asymptotically normally distributed, and the usual OLS standard errors, t statistics, F statistics, and LM statistics are asymptotically valid.
• Theorem 11.1 (Consistency of OLS): under linearity, stationarity and weak dependence, and contemporaneous exogeneity, OLS is consistent.
• Why is it important to relax the strict exogeneity assumption?
• Strict exogeneity is a serious restriction because it rules out all kinds of dynamic relationships between explanatory variables and the error term.
• In particular, it rules out feedback from the dependent variable to future values of the explanatory variables (which is very common in economic contexts).
• Strict exogeneity precludes the use of lagged dependent variables as regressors.
• Why do lagged dependent variables violate strict exogeneity? In a model such as y_t = β0 + β1 y_{t-1} + u_t, the regressor for period t+1 is y_t, which by construction depends on u_t. This leads to a contradiction: strict exogeneity requires the error to be unrelated to the regressors in all periods, yet Cov(y_t, u_t) ≠ 0.
• OLS estimation in the presence of lagged dependent variables: under contemporaneous exogeneity, OLS is consistent but biased.
• Using trend-stationary series in regression analysis:
• Time series with deterministic time trends are nonstationary, but if they are stationary around the trend and in addition weakly dependent, they are called trend-stationary processes. Trend-stationary processes also satisfy assumption TS.1'.
• Using highly persistent time series in regression analysis:
• Unfortunately, many economic time series violate weak dependence because they are highly persistent (strongly dependent). In this case OLS methods are generally invalid (unless the CLM assumptions hold). In some cases, transformations to weak dependence are possible.
• Random walks: y_t = y_{t-1} + e_t, where e_t is an i.i.d. zero-mean sequence; today's value is yesterday's value plus an unpredictable shock.
• Examples of random walk realizations.
• A DDD of the three-month T-bill rate is a possible example of a random walk.
• A random walk is a special case of a unit root process. Unit root processes are defined like a random walk, but e_t may be an arbitrary weakly dependent process.
• From an economic point of view it is important to know whether a time series is highly persistent. In highly persistent time series, shocks or policy changes have lasting or permanent effects; in weakly dependent processes their effects are transitory.
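The persistence contrast can be shown with a deterministic sketch: the same one-off shock fed through a stable AR(1) and through a random walk. The parameter values are purely illustrative:

```python
# Persistence of a shock: stable AR(1) (rho = 0.5) vs random walk (rho = 1).
# In the stable case the shock's influence dies out geometrically;
# in the random walk it is permanent.

def ar1_path(rho, shocks):
    x = [0.0]
    for e in shocks:
        x.append(rho * x[-1] + e)
    return x

shocks = [1.0] + [0.0] * 20          # a single unit shock at t = 1

stable = ar1_path(0.5, shocks)
walk = ar1_path(1.0, shocks)

print(stable[5])   # 0.0625 = 0.5**4: effect nearly gone after 4 periods
print(walk[5])     # 1.0: the shock never fades
```

This is the "lasting/permanent effects" point above in miniature: with ρ = 1, every shock is fully incorporated into all future values.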
• Random walks with drift: y_t = α0 + y_{t-1} + e_t, so on top of the random walk behavior the series tends to change by α0 each period.
• Sample path of a random walk with drift.
• Transformations on highly persistent time series: order of integration.
• Weakly dependent time series are integrated of order zero, I(0).
• If a time series has to be differenced once in order to obtain a weakly dependent series, it is integrated of order one, I(1). Random walks (with or without drift) are leading examples of I(1) processes.
• Differencing is often a way to achieve weak dependence.
• Deciding whether a time series is an I(0) or I(1) process:
• The tests used for examining a unit root are called Dickey-Fuller tests (and relatives), based on an AR(1) regression. If there is a unit root, the estimate of ρ should be very close to 1.
• However, we can use the law of large numbers only when ρ < 1, and even when that is true, the estimator of ρ is only consistent, not unbiased.
• If ρ really were one (a unit root process), the sampling distribution would be quite different, rendering estimates of ρ very imprecise. There are no hard and fast rules, but some economists would perform first differencing if the estimated ρ is greater than 0.9, while others would do so when it is greater than 0.8.
• Point: some economists think that every macroeconomic time series has a unit root; this is an ongoing debate.
• Also, if you know the time series has a trend, it is sensible to detrend the series before using the data, since a trend affects the bias of the estimate of ρ; more accurately, it increases the likelihood of finding ρ near one, since trends create a positive bias.
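The informal rule of thumb above can be sketched directly: estimate ρ by regressing y_t on y_{t-1} and first-difference if the estimate is near one. The 0.9 cutoff is one of the informal choices mentioned in the text, not a formal Dickey-Fuller critical value:

```python
# Estimate rho by OLS of y_t on y_{t-1}; for a simulated random walk
# the estimate should land very close to 1, triggering differencing.
import random

random.seed(3)

def rho_hat(y):
    y0, y1 = y[:-1], y[1:]
    m0, m1 = sum(y0) / len(y0), sum(y1) / len(y1)
    num = sum((a - m0) * (b - m1) for a, b in zip(y0, y1))
    den = sum((a - m0) ** 2 for a in y0)
    return num / den

y = [0.0]                       # simulate a pure random walk
for _ in range(2000):
    y.append(y[-1] + random.gauss(0, 1))

r = rho_hat(y)
difference_first = r > 0.9      # informal cutoff from the text
print(round(r, 2), difference_first)
```

As the notes warn, this estimate is biased and imprecise near a unit root, so in practice a proper Dickey-Fuller test with its nonstandard critical values is preferred over a fixed cutoff.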
• Example: wages and productivity. It turns out that even after detrending, both series display sample autocorrelations close to one, so estimating the equation in first differences seems more adequate.
Summary of Chapters 10 and 11
• There are potential pitfalls inherent in using regression with time series data that are not present in cross-sectional applications; we saw several examples of this.
• The gist of these two chapters is to be able to estimate time series models using OLS, which is different from estimating cross-sectional models.
• When developing time series models, make sure the trend and similar features are removed; leaving them in the regression could cause spurious regressions.
• Unlike in cross-sectional regressions, we do not interpret the coefficients of forecasting-oriented time series models.
• Keep in mind that the first step when doing time series is to draw a DDD.
• We need to understand the qualitative properties, e.g. trends, seasonality, cycles, etc. Once we understand the qualitative properties, the appropriate model can be constructed.
Chapter 12
Serial Correlation and Heteroskedasticity in Time Series Regressions
Objectives of Chapter 12
• Most of this chapter deals with serial correlation, but it also explicitly considers heteroskedasticity in time series regressions.
• The first section reviews what assumptions are needed to obtain both finite-sample and asymptotic results.
• Just as with heteroskedasticity, serial correlation itself does not invalidate R-squared: if the data are stationary and weakly dependent, R-squared and adjusted R-squared consistently estimate the population R-squared (which is well defined under stationarity).
• The Durbin-Watson test is presented (a very old test for serial correlation), but the DW test is less useful than the t test.
• We are introduced to robust standard errors, as many econometrics packages routinely compute fully robust standard errors for time series.
• One important point is that ARCH is heteroskedasticity, not serial correlation, something that is confusing in many texts. If a model contains no serial correlation, the usual heteroskedasticity-robust statistics are valid. ARCH is used in finance quite a lot.
Autocorrelation
• In the classical regression model, it is assumed that E(u_t u_s) = 0 if t is not equal to s.
• What happens when this assumption is violated? That is what we have been looking at in the past few weeks.

First-order autocorrelation:
Y_t = β0 + β1 X_1t + β2 X_2t + ... + βk X_kt + u_t
where
u_t = ρ u_{t-1} + ε_t,   E(ε_t) = 0,   E(ε_t ε_s) = 0 for t ≠ s,   E(ε_t²) = σ_ε².

Positive first-order autocorrelation: ρ > 0. Negative first-order autocorrelation: ρ < 0. Incorrect model specification can also produce apparent autocorrelation.

This violates the classical regression assumption because
E(u_t u_{t-1}) = E[(ρ u_{t-1} + ε_t) u_{t-1}] = E(ρ u_{t-1}² + u_{t-1} ε_t) = ρ σ_u² ≠ 0,
so corr(u_t, u_{t-1}) = ρ.
• Properties of OLS with serially correlated errors:
• OLS is still unbiased and consistent if the errors are serially correlated.
• The consistency of R-squared also does not depend on serial correlation.
• OLS standard errors and tests will be invalid if there is serial correlation.
• OLS will no longer be efficient if there is serial correlation.
• Serial correlation and the presence of lagged dependent variables:
• Is OLS inconsistent if there is serial correlation and there are lagged dependent variables?
• No: including enough lags so that TS.3' holds guarantees consistency.
• Including too few lags will cause an omitted variable bias problem and serial correlation, because some lagged dependent variables end up in the error term.
• Correcting for serial correlation with strictly exogenous regressors:
• Under the assumption of AR(1) errors, one can transform the model so that it satisfies all Gauss-Markov assumptions. For the transformed model, OLS is BLUE.
• Problem: the AR(1) coefficient ρ is not known and has to be estimated.
AR(1) correction: ρ known

Start from the model with an AR(1) error term:
Y_t = β0 + β1 X_1t + β2 X_2t + ... + βk X_kt + u_t,   where u_t = ρ u_{t-1} + ε_t.

Lagging this relationship one period:
Y_{t-1} = β0 + β1 X_1,t-1 + β2 X_2,t-1 + ... + βk X_k,t-1 + u_{t-1}

Multiplying this by ρ and subtracting, with a little bit of algebra:
Y_t - ρ Y_{t-1} = (1 - ρ) β0 + β1 (X_1t - ρ X_1,t-1) + β2 (X_2t - ρ X_2,t-1) + ... + βk (X_kt - ρ X_k,t-1) + (u_t - ρ u_{t-1})
Y_t - ρ Y_{t-1} = (1 - ρ) β0 + β1 (X_1t - ρ X_1,t-1) + β2 (X_2t - ρ X_2,t-1) + ... + βk (X_kt - ρ X_k,t-1) + ε_t
AR(1) correction: ρ known

• Solution: quasi-difference each variable:

Y*_t = Y_t − ρ Y_{t−1}
X*_{1t} = X_{1t} − ρ X_{1,t−1}
X*_{2t} = X_{2t} − ρ X_{2,t−1}
…
X*_{kt} = X_{kt} − ρ X_{k,t−1}

Then estimate the following regression:

Y*_t = β*_0 + β_1 X*_{1t} + β_2 X*_{2t} + … + β_k X*_{kt} + ε_t,  where β*_0 = (1 − ρ) β_0
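The quasi-differencing step can be sketched in numpy as follows (illustrative code of my own, with simulated data and ρ treated as known; the variable names are not from the slides):

```python
import numpy as np

rng = np.random.default_rng(1)
n, rho, beta0, beta1 = 500, 0.7, 2.0, 1.5

# Generate a regressor and AR(1) errors u_t = rho*u_{t-1} + eps_t
x = rng.normal(size=n)
eps = rng.normal(size=n)
u = np.empty(n)
u[0] = eps[0] / np.sqrt(1 - rho**2)
for t in range(1, n):
    u[t] = rho * u[t - 1] + eps[t]
y = beta0 + beta1 * x + u

# Quasi-difference each variable (this drops the first observation)
y_star = y[1:] - rho * y[:-1]
x_star = x[1:] - rho * x[:-1]

# OLS of Y* on X*; the fitted intercept estimates (1 - rho) * beta0
Z = np.column_stack([np.ones(n - 1), x_star])
coef, *_ = np.linalg.lstsq(Z, y_star, rcond=None)
b0_hat = coef[0] / (1 - rho)  # recover beta0 from the transformed intercept
b1_hat = coef[1]
```

Because the transformed errors are the iid ε_t, OLS on the starred variables satisfies the Gauss-Markov assumptions.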
AR(1) correction: ρ known

• With ρ known, this procedure provides unbiased and consistent estimates of all model parameters and valid standard errors.
• If ρ = 1, a unit root is said to exist. In this case, quasi-differencing is equivalent to first-differencing, and the intercept (1 − ρ)β_0 drops out:

ΔY_t = β_1 ΔX_{1t} + β_2 ΔX_{2t} + … + β_k ΔX_{kt} + ε_t
Serial Correlation and Heteroscedasticity in Time Series Regressions
• Correcting for serial correlation (cont'd)
• Replacing the unknown AR(1) coefficient ρ with an estimated value leads to a feasible GLS (FGLS) estimator.
• There are two variants:
  • Cochrane-Orcutt estimation omits the first observation.
  • Prais-Winsten estimation adds a transformed first observation.
• In smaller samples, Prais-Winsten estimation should be more efficient.
• Comparing OLS and FGLS with autocorrelation:
  • For consistency of FGLS, more than Assumption TS.3′ is needed (e.g. TS.3, strict exogeneity), because the transformed regressors include variables from different periods.
  • If OLS and FGLS differ dramatically, this might indicate a violation of TS.3.
AR(1) correction: ρ unknown

Cochrane-Orcutt procedure:
1. Estimate the original model using OLS. Save the residuals.
2. Regress the saved residuals on the lagged residuals (without a constant) to estimate ρ.
3. Estimate a quasi-differenced version of the original model using the estimated ρ. Use the estimated parameters to generate a new series of residuals.
4. Go to step 2. Repeat this process until the change in the parameter estimates becomes less than a selected threshold value.

This results in consistent estimates of all model parameters and asymptotically valid standard errors.
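The four steps above can be sketched in numpy as follows (an illustrative implementation of my own; `cochrane_orcutt` is not a library function). The intercept column of ones becomes a constant (1 − ρ) after quasi-differencing, so the fitted coefficients stay on the original scale:

```python
import numpy as np

def cochrane_orcutt(y, X, tol=1e-6, max_iter=100):
    """Iterative Cochrane-Orcutt: alternate between re-estimating rho from
    the residuals and running OLS on quasi-differenced data.
    X must include a column of ones for the intercept."""
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)  # step 1: OLS on original model
    rho = 0.0
    for _ in range(max_iter):
        resid = y - X @ beta
        # step 2: regress residuals on lagged residuals (no constant)
        rho_new = (resid[1:] @ resid[:-1]) / (resid[:-1] @ resid[:-1])
        # step 3: OLS on the quasi-differenced model (loses the first obs)
        y_star = y[1:] - rho_new * y[:-1]
        X_star = X[1:] - rho_new * X[:-1]
        beta_new, *_ = np.linalg.lstsq(X_star, y_star, rcond=None)
        converged = abs(rho_new - rho) < tol
        rho, beta = rho_new, beta_new   # step 4: update and repeat
        if converged:
            break
    return beta, rho

# Usage on simulated data with rho = 0.7, beta = (1.0, 2.0)
rng = np.random.default_rng(2)
n = 600
x = rng.normal(size=n)
eps = rng.normal(size=n)
u = np.empty(n)
u[0] = eps[0]
for t in range(1, n):
    u[t] = 0.7 * u[t - 1] + eps[t]
y = 1.0 + 2.0 * x + u
X = np.column_stack([np.ones(n), x])
beta_hat, rho_hat = cochrane_orcutt(y, X)
```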
Prais-Winsten estimator

• The Cochrane-Orcutt method involves the loss of one observation.
• The Prais-Winsten estimator is similar to the Cochrane-Orcutt method, but applies a different transformation that retains the first observation.
• Monte Carlo studies indicate a substantial efficiency gain from the use of the Prais-Winsten estimator as opposed to the Cochrane-Orcutt method.
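A sketch of the Prais-Winsten transformation (my own illustrative code; the only difference from Cochrane-Orcutt is the rescaled first row):

```python
import numpy as np

def prais_winsten_transform(y, X, rho):
    """Quasi-difference observations 2..n as in Cochrane-Orcutt, but keep the
    first observation, rescaled by sqrt(1 - rho^2) so its error has the same
    variance as the quasi-differenced shocks."""
    w = np.sqrt(1 - rho**2)
    y_star = np.concatenate([[w * y[0]], y[1:] - rho * y[:-1]])
    X_star = np.vstack([w * X[:1], X[1:] - rho * X[:-1]])
    return y_star, X_star

# Tiny worked example with rho = 0.5
y = np.array([1.0, 2.0, 3.0])
X = np.ones((3, 1))
y_star, X_star = prais_winsten_transform(y, X, rho=0.5)
```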
Hildreth-Lu estimator

• The Hildreth-Lu procedure is a more direct method for estimating ρ, once the errors have been established to follow an AR(1) structure.
• It is a grid search algorithm, which helps ensure that the estimator reaches a global minimum of the sum of squared errors rather than a local minimum.

Follow these steps to do Hildreth-Lu:
• Select a series of candidate values for ρ (presumably values that would make sense after you have assessed the pattern of the errors).
• For each candidate value, regress Y* on the transformed predictors using the transformations established in the Cochrane-Orcutt procedure. Retain the SSE for each of these regressions.
• Select the value that minimizes the SSE as the estimate of ρ.
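These steps can be sketched as a short grid search in numpy (illustrative code of my own; `hildreth_lu` is not a standard library routine):

```python
import numpy as np

def hildreth_lu(y, X, grid=None):
    """Grid search over candidate rho values; for each, run OLS on the
    quasi-differenced data and keep the rho that minimizes the SSE."""
    if grid is None:
        grid = np.arange(-0.95, 0.96, 0.01)
    best_rho, best_sse = None, np.inf
    for rho in grid:
        y_star = y[1:] - rho * y[:-1]
        X_star = X[1:] - rho * X[:-1]
        beta, *_ = np.linalg.lstsq(X_star, y_star, rcond=None)
        sse = np.sum((y_star - X_star @ beta) ** 2)
        if sse < best_sse:
            best_rho, best_sse = rho, sse
    return best_rho

# Usage on simulated data with rho = 0.6
rng = np.random.default_rng(4)
n = 500
x = rng.normal(size=n)
eps = rng.normal(size=n)
u = np.empty(n)
u[0] = eps[0]
for t in range(1, n):
    u[t] = 0.6 * u[t - 1] + eps[t]
y = 0.5 + 1.0 * x + u
X = np.column_stack([np.ones(n), x])
rho_hat = hildreth_lu(y, X)
```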
Serial Correlation and Heteroscedasticity in Time Series Regressions
• FGLS Estimation of the AR(1) Model
• Under the assumption of strict exogeneity, we can correct for serial
correlation using FGLS.
With an estimate of the AR(1) parameter ρ, we can transform the original model by quasi-differencing, exactly as in the known-ρ case above.
Serial Correlation and Heteroscedasticity in Time Series Regressions
• Serial correlation-robust inference after OLS
• In the presence of positive serial correlation, the usual OLS standard errors overstate statistical significance, because there is less independent variation than the formulas assume.
• One can compute serial correlation-robust standard errors after OLS.
• This is useful because FGLS requires strict exogeneity and assumes a very specific form of serial correlation (AR(1) or, more generally, AR(q)).
• Serial correlation-robust F and t tests are also available.
Serial Correlation and Heteroscedasticity in Time Series Regressions
• How to compute NeweyWest Standard Errors
• The choice of the integer g (the truncation lag, or bandwidth) is open to debate.
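One way to make the computation concrete is a direct numpy implementation of the Newey-West sandwich with Bartlett weights w_l = 1 − l/(g+1) (an illustrative sketch of my own; in practice one would use a library routine):

```python
import numpy as np

def newey_west_se(y, X, g):
    """Newey-West (HAC) standard errors for OLS with Bartlett-weighted
    autocovariances up to lag g."""
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    e = y - X @ beta
    XtX_inv = np.linalg.inv(X.T @ X)
    # Lag-0 term: sum_t e_t^2 x_t x_t'
    S = (X * e[:, None]).T @ (X * e[:, None])
    for l in range(1, g + 1):
        w = 1 - l / (g + 1)  # Bartlett weight
        # Gamma_l = sum_t e_t e_{t-l} x_t x_{t-l}'
        Gamma = (X[l:] * (e[l:] * e[:-l])[:, None]).T @ X[:-l]
        S += w * (Gamma + Gamma.T)
    V = XtX_inv @ S @ XtX_inv  # sandwich variance estimator
    return np.sqrt(np.diag(V))

# Usage on simulated data
rng = np.random.default_rng(3)
n = 300
X = np.column_stack([np.ones(n), rng.normal(size=n)])
y = X @ np.array([1.0, 2.0]) + rng.normal(size=n)
se_hac = newey_west_se(y, X, g=4)
se_g0 = newey_west_se(y, X, g=0)
```

With g = 0 the formula reduces to White's heteroskedasticity-robust standard errors, which is why these estimators are called HAC rather than merely autocorrelation-robust.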
Serial Correlation and Heteroscedasticity in Time Series Regressions
• Discussion of serial correlation-robust standard errors
• The formulas are also robust to heteroscedasticity; they are therefore called "heteroscedasticity and autocorrelation consistent" (HAC) standard errors.
• HAC SEs lagged behind heteroskedasticity-robust SEs in use for several reasons:
  • We generally have more observations in cross-sections than in time series.
  • Newey-West SEs can be poorly behaved if there is substantial serial correlation and the sample size is small.
  • The bandwidth g must be chosen by the researcher, and the SEs can be sensitive to the choice of g.
• That said, HAC SEs are now widespread in use.
Serial Correlation and Heteroscedasticity in Time Series Regressions
• Testing for serial correlation
• Testing for AR(1) serial correlation with strictly exogenous regressors
• Example: Static Phillips curve (see above)
The test statistic is very large (greater than 3.06), so we reject the null hypothesis of no serial correlation; the population ρ appears to be positive.
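The test described here, regressing OLS residuals on their own lag and examining the t statistic, can be sketched as follows (illustrative code of my own, with simulated data):

```python
import numpy as np

def ar1_test(y, X):
    """Test for AR(1) serial correlation with strictly exogenous regressors:
    regress OLS residuals on lagged residuals (no constant) and return the
    estimated AR(1) coefficient and its t statistic."""
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    e = y - X @ beta
    e_now, e_lag = e[1:], e[:-1]
    rho_hat = (e_lag @ e_now) / (e_lag @ e_lag)
    resid = e_now - rho_hat * e_lag
    s2 = resid @ resid / (len(e_now) - 1)
    se = np.sqrt(s2 / (e_lag @ e_lag))
    return rho_hat, rho_hat / se

# Usage on simulated data with strong serial correlation (rho = 0.8)
rng = np.random.default_rng(5)
n = 400
x = rng.normal(size=n)
eps = rng.normal(size=n)
u = np.empty(n)
u[0] = eps[0]
for t in range(1, n):
    u[t] = 0.8 * u[t - 1] + eps[t]
y = 1.0 + 0.5 * x + u
X = np.column_stack([np.ones(n), x])
rho_hat, t_stat = ar1_test(y, X)
```

A large t statistic leads to rejection of the null of no serial correlation, matching the Phillips-curve example above.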