Panel Data and Diff-in-Diff
Chris Hansman
Empirical Finance: Methods and Applications
January 25-26, 2020
1/76
Some Details
First assignment released this week
Posted on January 26th
Due on February 9th.
2/76
Overview
Last class: an introduction to causality
This class: estimating causal effects with panel data
1. An introduction to panel data
Multiple observations of the same unit over time
2. First difference and fixed effects estimators
Estimating causal effects with fixed omitted variables
3. Difference-in-difference estimators
A more robust method for estimating causal effects
3/76
Part 1: Introducing Panel Data
Three common types of data
1. Cross-sectional
2. Time-series
3. Panel
Estimating unit and time specific averages
4/76
Three common types of data:
(1) Cross-Sectional
A single observation for each unit i in {1,2,··· ,N}
e.g. test scores and study times for each individual in the class
(2) Time Series
Repeated observations from time t = 1, · · · , T for a single unit
e.g. yearly GDP and unemployment in the UK
(3) Panel
Repeated observations over time for multiple units
e.g. monthly market cap and leverage for every firm in the S&P
5/76
Cross-sectional data: One observation per unit
6/76
Time series data: One unit over time
7/76
Panel data: Multiple units followed over time
8/76
Panel data: Notation
Panel data consists of observations of the same N units in T different periods
If the data contains variables x and y, we write them $(x_{it}, y_{it})$
for $i = 1, \dots, N$
i denotes the unit, e.g. Microsoft or Apple
and $t = 1, \dots, T$
t denotes the time period, e.g. September or October
9/76
Panel data: Allows Averaging Within Units
Because we see every unit multiple times, we can take unit-specific averages:
$\overline{price}_i = \frac{1}{T} \sum_{t=1}^{T} price_{it}$
Because we see many units in the same time period, we can take time-specific averages:
$\overline{price}_t = \frac{1}{N} \sum_{i=1}^{N} price_{it}$
The overall average is (of course):
$\overline{price} = \frac{1}{N \times T} \sum_{t=1}^{T} \sum_{i=1}^{N} price_{it}$
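A minimal sketch of the three averages, assuming a balanced panel stored as an N × T array. The prices are made-up numbers for illustration, not course data.

```python
import numpy as np

# Hypothetical prices: rows are units i = 1..N, columns are periods t = 1..T.
price = np.array([
    [10.0, 11.0, 12.0],   # unit 1
    [20.0, 19.0, 21.0],   # unit 2
])
N, T = price.shape

unit_avg = price.mean(axis=1)        # one average per unit i (averaging over t)
time_avg = price.mean(axis=0)        # one average per period t (averaging over i)
overall_avg = price.sum() / (N * T)  # average over all N x T observations

print(unit_avg, time_avg, overall_avg)
```

With real data in long format (one row per unit and period), the same quantities are group means by unit and by period.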
11/76
Panel data: Unit Specific Averages
12/76
Panel data: Time Specific Averages
13/76
Calculating Unit Specific Averages With Regression
Recall that dummy variables let you calculate these means
Create dummy variables for each i (e.g. company), omitting one
Let's call them $D_1, D_2, \dots, D_{N-1}$
And consider the following regression:
$y_{it} = \beta_0 + \sum_{i=1}^{N-1} \delta_i D_i + v_{it}$
Recall that we can then estimate
Average for the omitted unit: $\hat{\beta}_0$
Average for any other i: $\hat{\beta}_0 + \hat{\delta}_i$
14/76
Residualizing to Remove Differences in Means
$y_{it} = \beta_0 + \sum_{i=1}^{N-1} \delta_i D_i + v_{it}$
After estimating this regression, we can also compute the residuals:
$\hat{v}_{it} = y_{it} - \hat{\beta}_0 - \sum_{i=1}^{N-1} \hat{\delta}_i D_i$
For any given i, this translates to:
$\hat{v}_{it} = y_{it} - \hat{\beta}_0 - \hat{\delta}_i$
This is just $y_{it} - \bar{y}_i$
The price minus the unit-specific average
Lets us compare changes over time, putting aside level differences
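The dummy-variable regression and its residuals can be checked numerically. The sketch below uses a small made-up panel (3 units, 4 periods) and NumPy's least-squares solver; the variable names are illustrative only.

```python
import numpy as np

# Hypothetical long-format panel: 3 units observed in 4 periods each.
unit = np.repeat([0, 1, 2], 4)
y = np.array([1.0, 2.0, 3.0, 2.0,
              5.0, 6.0, 7.0, 6.0,
              9.0, 8.0, 7.0, 8.0])

# Intercept plus dummies for units 1 and 2; unit 0 is the omitted category.
X = np.column_stack([np.ones_like(y), unit == 1, unit == 2]).astype(float)
coef, *_ = np.linalg.lstsq(X, y, rcond=None)
beta0, delta1, delta2 = coef

# Unit-specific averages computed directly, for comparison.
unit_means = np.array([y[unit == i].mean() for i in range(3)])

residuals = y - X @ coef          # v-hat from the dummy regression
demeaned = y - unit_means[unit]   # y_it minus the unit-specific average

print(beta0, beta0 + delta1, beta0 + delta2)
print(np.allclose(residuals, demeaned))
```

The intercept matches the omitted unit's mean, intercept plus dummy coefficient matches each other unit's mean, and the residuals coincide with the demeaned outcome.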
15/76
Residualizing Removes Group Specific Means
16/76
Calculating Time Specific Averages With Regression
Can similarly calculate the average for each time period with regression
Create dummy variables for each t (e.g. Dec. 15), omitting one
Let's call them $D_1, D_2, \dots, D_{T-1}$
And consider the following regression:
$y_{it} = \beta_0 + \sum_{t=1}^{T-1} \tau_t D_t + v_{it}$
Recall that we can then estimate
Average for the omitted period: $\hat{\beta}_0$
Average for any other t: $\hat{\beta}_0 + \hat{\tau}_t$
17/76
Part 2: Advantages of Panel Data for Causal Effects
A simple approach using a panel: event study
Two approaches to dealing with fixed omitted variables:
First differences
Fixed effects
18/76
A simple panel approach: Before vs. after
Suppose we are interested in the causal effect of a particular event or policy
$y_{it} = \beta_0 + \beta_1 AfterEvent_{it} + v_{it}$
Example: Impact of Brexit on UK firms
Can we simply compare?
$E[y_{it} | AfterEvent_{it} = 1] - E[y_{it} | AfterEvent_{it} = 0]$
19/76
A simple panel approach: Before vs. after
[Figure: Y plotted against month (t), 2016m1 to 2017m1]
20/76
A simple panel approach: Before vs. after
[Figure: Y plotted against month (t), 2016m1 to 2017m1, with E[Y|Before] and E[Y|After] marked]
20/76
Before vs. after an event used frequently
This tactic underlies an approach called event study
Lots of different techniques/bells and whistles
See Chapter 4 of The Econometrics of Financial Markets (Campbell, Lo and MacKinlay) if you want more detail
21/76
Entrance into the S&P (Shleifer,1986; Harris and Gurel, 1986)
Source: Gompers, Greenwood, and Lerner’s Lecture Notes
21/76
When is an event study ineffective?
[Figure: Y plotted against month (t), 2016m1 to 2017m1, with E[Y|Before] and E[Y|After] marked]
21/76
Panel Data and Omitted Variables
We will come back to this before vs. after strategy in a bit
Let's reconsider our omitted variables problem:
$y_{it} = \beta_0 + \beta_1 x_{it} + \gamma a_i + e_{it}$
Suppose we see $x_{it}$ and $y_{it}$ but not $a_i$
Suppose $Corr(x_{it}, e_{it}) = 0$ but $Corr(a_i, x_{it}) \neq 0$
Note that we are assuming ai doesn’t depend on t
22/76
Panel Data and Omitted Variables
An example:
$Leverage_{it} = \beta_0 + \beta_1 Profit_{it} + \gamma a_i + e_{it}$
Some potential (fixed) omitted variables:
Manager skill or risk aversion
Cost of capital
23/76
Panel Data and Omitted Variables
Suppose we are unable to observe $a_i$:
$y_{it} = \beta_0 + \beta_1 x_{it} + \underbrace{\gamma a_i + e_{it}}_{v_{it}}$
If we estimate this regression, will we recover $\beta_1^{OLS} = \beta_1$?
No! Because
$corr(x_{it}, a_i) \neq 0 \Rightarrow corr(x_{it}, v_{it}) \neq 0$
Aside: Regressions of this form are often called “pooled”
Because they “pool” data across individuals and time periods
24/76
Panel Data and Omitted Variables
[Figure: Y against X, showing the lines $\beta_0^{OLS} + \beta_1^{OLS} X$ and $\beta_0 + \beta_1 X$]
25/76
Our first Mentis…
Load the data panel example.csv
What is the coefficient $\hat{\beta}_1^{OLS}$ if we treat $a_i$ as unobserved?
$y_{it} = \beta_0 + \beta_1 x_{it} + v_{it}$
What is the coefficient $\hat{\beta}_1^{OLS}$ if we observe and include $a_i$ in the regression?
$y_{it} = \beta_0 + \beta_1 x_{it} + \gamma a_i + e_{it}$
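The course dataset is not reproduced here, so the sketch below runs the same comparison on simulated data. The parameter values, the 0.8 loading of $x_{it}$ on $a_i$, and the helper `ols` are all assumptions made for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)
N, T = 500, 5
beta0, beta1, gamma = 1.0, 2.0, 3.0   # hypothetical true parameters

a = rng.normal(size=N)                     # fixed unit characteristic a_i
a_it = np.repeat(a, T)                     # repeated for each period of unit i
x = 0.8 * a_it + rng.normal(size=N * T)    # x_it is correlated with a_i
e = rng.normal(size=N * T)                 # e_it is unrelated to x_it
y = beta0 + beta1 * x + gamma * a_it + e

def ols(y, *regressors):
    """OLS coefficients with an intercept in position 0."""
    X = np.column_stack([np.ones_like(y), *regressors])
    return np.linalg.lstsq(X, y, rcond=None)[0]

b_pooled = ols(y, x)[1]        # a_i omitted: pooled OLS
b_full = ols(y, x, a_it)[1]    # a_i observed and included

print(b_pooled, b_full)
```

In this setup the pooled slope lands well above the true value of 2 because $x_{it}$ picks up the effect of $a_i$, while including $a_i$ brings the estimate back close to 2.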
26/76
First Difference Regression
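$y_{it} = \beta_0 + \beta_1 x_{it} + \underbrace{\gamma a_i + e_{it}}_{v_{it}}$
Suppose we see exactly two time periods $t \in \{1, 2\}$ for each i
We can write our two time periods as:
$y_{i,1} = \beta_0 + \beta_1 x_{i,1} + \gamma a_i + e_{i,1}$
$y_{i,2} = \beta_0 + \beta_1 x_{i,2} + \gamma a_i + e_{i,2}$
Then take the difference:
$y_{i,2} - y_{i,1} = \beta_1 (x_{i,2} - x_{i,1}) + (e_{i,2} - e_{i,1})$
Or
$\Delta y_{i,2-1} = \beta_1 \Delta x_{i,2-1} + \Delta e_{i,2-1}$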
27/76
First Difference Regression
Instead of regressing yit on xit , regress the change in yit on the change in xit
Taking changes (differences) gets rid of fixed omitted variables:
$\Delta y_{i,2-1} = \beta_1 \Delta x_{i,2-1} + \Delta e_{i,2-1}$
As long as $\Delta e_{i,2-1}$ is mean independent of $\Delta x_{i,2-1}$:
$E[\Delta e_{i,2-1} | \Delta x_{i,2-1}] = E[\Delta e_{i,2-1}]$
Note that this is not the same as:
$E[e_{it} | x_{it}] = E[e_{it}]$
Menti: What is the coefficient $\hat{\beta}_1^{FD}$ from a first difference regression?
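A minimal sketch of the first difference regression on simulated two-period data (not the course dataset). The parameter values are hypothetical, and the regression has no intercept, matching the differenced equation.

```python
import numpy as np

rng = np.random.default_rng(1)
N = 2000
beta0, beta1, gamma = 1.0, 2.0, 3.0   # hypothetical true parameters

a = rng.normal(size=N)                # fixed omitted variable a_i
x1 = 0.8 * a + rng.normal(size=N)     # x in period 1, correlated with a_i
x2 = 0.8 * a + rng.normal(size=N)     # x in period 2, correlated with a_i
y1 = beta0 + beta1 * x1 + gamma * a + rng.normal(size=N)
y2 = beta0 + beta1 * x2 + gamma * a + rng.normal(size=N)

# Differencing removes beta0 and gamma * a_i from the equation.
dy, dx = y2 - y1, x2 - x1

# Slope of a regression of dy on dx with no intercept.
b_fd = (dx @ dy) / (dx @ dx)
print(b_fd)
```

Even though $a_i$ is never used in the estimation, the differenced slope is close to the true value of 2.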
28/76
Fixed Effects Regression
$y_{it} = \beta_0 + \beta_1 x_{it} + \gamma a_i + e_{it}$
An alternative approach:
Let's define $\delta_i = \gamma a_i$ and rewrite:
$y_{it} = \beta_0 + \beta_1 x_{it} + \delta_i + e_{it}$
So $y_{it}$ is determined by
(i) The baseline intercept $\beta_0$
(ii) The effect of $x_{it}$
(iii) An individual-specific change in the intercept: $\delta_i$
Intuition behind fixed effects: Let's just estimate $\delta_i$
29/76
What is δi
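$y_{it} = \beta_0 + \beta_1 x_{it} + \delta_i + e_{it}$
$\delta_i$ is often referred to as i's “fixed effect”
$E[y_{it} | x_{it} = 0] = \beta_0 + E[\beta_1 \cdot 0] + \delta_i + E[e_{it} | x_{it} = 0]$
So, when $E[e_{it} | x_{it} = 0] = 0$, $\delta_i$ is just the change in individual i's intercept: $\delta_i = E[y_{it} | x_{it} = 0] - \beta_0$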
30/76
Fixed Effects Regression: Estimating δi
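$y_{1t} = \beta_0 + \beta_1 x_{1t} + \delta_1 + e_{1t}$
$y_{2t} = \beta_0 + \beta_1 x_{2t} + \delta_2 + e_{2t}$
$\vdots$
$y_{Nt} = \beta_0 + \beta_1 x_{Nt} + \delta_N + e_{Nt}$
How do we estimate $\delta_1, \delta_2, \dots, \delta_N$?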
31/76
Fixed Effects Regression: Estimating δi
$y_{it} = \beta_0 + \beta_1 x_{it} + \delta_i + e_{it}$
Simplest approach (to me): Dummy variables
Construct N−1 dummy variables $D_1, D_2, \dots, D_{N-1}$
$D_1 = 1$ when i = 1 and 0 otherwise
$D_2 = 1$ when i = 2 and 0 otherwise
$D_3 = 1$ when i = 3 and 0 otherwise
And so on…
$D_{N-1} = 1$ when i = N−1 and 0 otherwise
32/76
Fixed Effects Regression: Implementation
$y_{it} = \beta_0 + \beta_1 x_{it} + \sum_{i=1}^{N-1} \delta_i D_i + e_{it}$
Note that we've left out $D_N$
$\beta_0^{OLS}$ is interpreted as the intercept for individual N:
$\beta_0^{OLS} = E[y_{it} | x_{it} = 0, i = N]$
and for all other i (e.g. i = 2):
$\delta_2 = E[y_{it} | x_{it} = 0, i = 2] - \beta_0$
Menti: What is the coefficient $\hat{\beta}_1^{FE}$ from a fixed effects regression?
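A sketch of the dummy-variable implementation on simulated data, with hypothetical parameter values. It also computes the slope after subtracting unit-specific averages from x and y (the "within" transformation), which is a standard equivalent way to obtain the same coefficient and is included here only as a cross-check.

```python
import numpy as np

rng = np.random.default_rng(2)
N, T = 50, 6
beta1, gamma = 2.0, 3.0   # hypothetical true parameters

unit = np.repeat(np.arange(N), T)      # unit index for each observation
a = rng.normal(size=N)[unit]           # fixed characteristic a_i
x = 0.8 * a + rng.normal(size=N * T)   # x_it correlated with a_i
y = 1.0 + beta1 * x + gamma * a + rng.normal(size=N * T)

# (a) Intercept, x, and N-1 unit dummies (the last unit is left out).
dummies = (unit[:, None] == np.arange(N - 1)[None, :]).astype(float)
X = np.column_stack([np.ones(N * T), x, dummies])
b_dummy = np.linalg.lstsq(X, y, rcond=None)[0][1]

# (b) Subtract unit-specific averages, then regress demeaned y on demeaned x.
def demean(v):
    means = np.bincount(unit, weights=v) / np.bincount(unit)
    return v - means[unit]

xd, yd = demean(x), demean(y)
b_within = (xd @ yd) / (xd @ xd)

print(b_dummy, b_within)
```

The two slopes agree up to floating-point error, which is the regression anatomy result in action: the dummies partial out each unit's average.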
33/76
Fixed Effects Regression: Intuition
Any fixed characteristic of i is captured by the average of yit (for i)
By using dummy variables for i, we can just estimate (and hence
account for) those averages.
No longer have to worry about xit being correlated with a fixed component of eit
34/76
Why is This? Recall Regression Anatomy
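$\beta_1^{OLS} = \frac{Cov(y_{it}, \tilde{x}_{it})}{Var(\tilde{x}_{it})}$
Where $\tilde{x}_{it}$ is the residual from a regression of $x_{it}$ on the dummies $D_j$:
$x_{it} = \alpha_0 + \sum_{j=1}^{N-1} \alpha_j D_j + \varepsilon_{it}$
$\tilde{x}_{it} = x_{it} - (\alpha_0^{OLS} + \alpha_i^{OLS})$
This subtracts (partials out) the average $x_{it}$ for each i
So $\tilde{x}_{it}$ is no longer correlated with any fixed component of the error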
35/76
Fixed Effects Regression: Assumptions
There is one important difference in the assumptions necessary for OLS to capture the causal effect:
Before, we needed:
$E[e_{it} | x_{it}] = E[e_{it}]$
Now, we need:
$E[e_{it} | x_{i1}, x_{i2}, \dots, x_{iT}] = E[e_{it}]$
36/76
When Will Fixed Effects Not Be Enough?
We need
E[eit|xi1,xi2,··· ,xiT ] = E[eit]
But what if eit is growing over time?
E.g. interest rates rising each quarter, influencing profits and leverage
37/76
Time Fixed Effects
We so far have focused on controlling for entity i fixed effects
What if xit is correlated with something that changes over time but
is fixed across individual units?
Leverageit = β0 + β1 Profitsit + τt + vit
For example, many time-varying macro variables (e.g. monetary policy) might affect profits and leverage
If these are constant across all firms, then they will be captured by $\tau_t$
38/76
Time Fixed Effects
$y_{it} = \beta_0 + \beta_1 x_{it} + \tau_t + e_{it}$
Exact same approach as with entity fixed effects
Construct T−1 dummy variables $D_1, D_2, \dots, D_{T-1}$
$D_1 = 1$ when t = 1 and 0 otherwise
$D_2 = 1$ when t = 2 and 0 otherwise
And so on…
And then, omitting one time period, we can estimate
$y_{it} = \beta_0 + \beta_1 x_{it} + \sum_{t=1}^{T-1} \tau_t D_t + e_{it}$
What is $\beta_0$? $\tau_t$?
39/76
Time Fixed Effects
Time fixed effects do not deal with fixed individual characteristics
What about combining both approaches?
40/76
Part 3: Difference-in-Difference
An example: Bankruptcy Costs and Leverage
The difference-in-difference framework
Key assumption: Parallel Trends
41/76
Example: Bankruptcy Costs and Leverage
What is the effect of a decline in bankruptcy costs on leverage?
Theory: Lower expected bankruptcy costs should increase leverage
Ideal (impossible to conduct) experiment:
Randomly select a subset of firms
Reduce bankruptcy costs for these firms (e.g. streamline bankruptcy procedures)
Compare leverage between this subset and the remaining firms
42/76
Example: Bankruptcy Costs and Leverage
At the end of 1991 the state of Delaware passed a new law (“the reform”)
Significantly streamlined bankruptcy proceedings
Reduced costs and time of litigation
Can we use this to learn something about our question?
Suppose we call the causal effect of the reform: $\beta_1$
How do we recover this parameter?
43/76
Approach 1: Before vs. After
Compare the average leverage of Delaware firms in 1991 vs. 1992
Let $After_t$ be a dummy equal to 1 after the reform
We would like to describe the relationship between the reform and leverage as:
Leverageit = β0 + β1 Aftert + vit
Where vit contains all other time and firm specific factors that influence leverage
44/76
Approach 1: Before vs. After
Suppose we regress $Leverage_{it}$ on our $After_t$ dummy: What is $\beta_1^{OLS}$?
$\beta_1^{OLS} = E[Leverage_{it} | After_t = 1] - E[Leverage_{it} | After_t = 0]$
$= \beta_1 + E[v_{it} | After_t = 1] - E[v_{it} | After_t = 0]$
So $\beta_1^{OLS} = \beta_1$ (the causal effect of treatment) if
$E[v_{it} | After_t] = E[v_{it}]$
Why might that fail?
45/76
Before vs. After
[Figure: leverage against month (t), 1991m7 to 1992m7, with E[Y|After=0] and E[Y|After=1] marked]
46/76
When is Before vs. After Ineffective?
[Figure: leverage against month (t), 1991m7 to 1992m7, with E[Y|After=0] and E[Y|After=1] marked]
46/76
Approach 1: Before vs. After
$\beta_1^{OLS}$ is just the difference in leverage for 1992 Delaware firms (“treatment”) relative to 1991 Delaware firms (“control”)
We require E [vit |Aftert = 1] = E [vit |Aftert = 0] for this to identify the causal effect of the reform
Any time trend/other events in 1992 will cause vit for later observations to be different from vit for earlier observations
e.g. tight credit in 1992 may have reduced debt (and hence leverage)
47/76
Approach 2: Cross Sectional
Compare Delaware Firms (“Treatment”) vs. Non-Delaware firms (Control) in 1992
Don’t need to worry about time trends
Requires data from firms in surrounding states
Let Di be a dummy equal to 1 if firm i is registered in Delaware
We would like to describe the relationship between the reform and leverage as:
$Leverage_i = \beta_0 + \beta_1 D_i + v_i$
Where vi contains all other time and firm specific factors that influence leverage
48/76
Approach 2: Cross Sectional
Suppose we regress $Leverage_i$ on our $D_i$ dummy:
$\beta_1^{OLS} = E[Leverage_i | D_i = 1] - E[Leverage_i | D_i = 0]$
$= \beta_1 + E[v_i | D_i = 1] - E[v_i | D_i = 0]$
So $\beta_1^{OLS} = \beta_1$ (the causal effect of treatment) if
$E[v_i | D_i] = E[v_i]$
Do we expect everything else that impacts leverage to be the same in Delaware and other states?
49/76
When is Cross Sectional Approach Ineffective?
Do we expect everything else that impacts leverage to be the same in Delaware and other states?
What if firms in Delaware are more capital-intensive?
Typically capital intensity ⇒ more leverage
This is just an omitted variable:
$Leverage_i = \beta_0 + \beta_1 D_i + \beta_2 CI_i + e_i$
So if we omit $CI_i$ and estimate
$Leverage_i = \beta_0 + \beta_1 D_i + v_i$
Will $\beta_1^{OLS}$ be larger or smaller than $\beta_1$?
50/76
When is Cross Sectional Approach Ineffective?
Of course, we could measure and control for capital intensity:
$Leverage_i = \beta_0 + \beta_1 D_i + \beta_2 CI_i + e_i$
Then our assumption for $\beta_1^{OLS} = \beta_1$ becomes:
$E[e_i | D_i, CI_i] = E[e_i | CI_i]$
Beyond capital intensity, do we expect everything else that impacts leverage to be the same in Delaware and other states?
Hard to control for everything
51/76
Difference-in-Difference Approach
Let’s combine the positive features of the cross-sectional and before/after approaches
Cross sectional avoided omitted trends
Before/after avoided omitted (fixed) characteristics
The difference-in-difference estimator does exactly this:
$Leverage_{it} = \beta_0 + \beta_1 D_i \times After_t + \beta_2 D_i + \beta_3 After_t + v_{it}$
Here β1 is the causal effect of the reform in Delaware
Requires data on firms in/out of Delaware before/after the reform
52/76
What Does Data Look Like for Difference-in-Difference
State      Year   Leverage_it (D/E)   D_i   After_t   D_i × After_t
Delaware   1991   1.2                 1     0         0
Maryland   1991   3.1                 0     0         0
Virginia   1991   1.9                 0     0         0
Delaware   1991   0.9                 1     0         0
Virginia   1991   1.5                 0     0         0
Virginia   1991   1.1                 0     0         0
Delaware   1991   1.2                 1     0         0
Maryland   1991   1.6                 0     0         0
Virginia   1991   0.5                 0     0         0
...        ...    ...                 ...   ...       ...
Maryland   1992   0.8                 0     1         0
Delaware   1992   0.9                 1     1         1
Virginia   1992   1.6                 0     1         0
Delaware   1992   2.2                 1     1         1
Maryland   1992   1.4                 0     1         0
Delaware   1992   1.9                 1     1         1
53/76
What Do the Difference-in-Difference Estimates Capture?
Recall that when right-hand-side variables take discrete values, OLS perfectly captures the conditional expectation function:
$E[Leverage_{it} | D_i, After_t] = E[\beta_0^{OLS} + \beta_1^{OLS} D_i \times After_t + \beta_2^{OLS} D_i + \beta_3^{OLS} After_t | D_i, After_t]$
There are four groups:
1. Non-Delaware Before: {$D_i = 0$, $After_t = 0$}
2. Delaware Before: {$D_i = 1$, $After_t = 0$}
3. Non-Delaware After: {$D_i = 0$, $After_t = 1$}
4. Delaware After: {$D_i = 1$, $After_t = 1$}
54/76
What Do the Difference-in-Difference estimates Capture?
Let's calculate conditional expectations for these four groups:
1. $E[Leverage_{it} | D_i = 0, After_t = 0] = \beta_0^{OLS}$
2. $E[Leverage_{it} | D_i = 1, After_t = 0] = \beta_0^{OLS} + \beta_2^{OLS}$
3. $E[Leverage_{it} | D_i = 0, After_t = 1] = \beta_0^{OLS} + \beta_3^{OLS}$
4. $E[Leverage_{it} | D_i = 1, After_t = 1] = \beta_0^{OLS} + \beta_1^{OLS} + \beta_2^{OLS} + \beta_3^{OLS}$
55/76
Diff-in-Diff Solves Issues with Cross-Sectional Approach
Cross Sectional: Compare averages in Delaware vs. outside, after the reform
$\underbrace{E[Leverage_{it} | D_i = 1, After_t = 1]}_{\beta_0^{OLS} + \beta_1^{OLS} + \beta_2^{OLS} + \beta_3^{OLS}} - \underbrace{E[Leverage_{it} | D_i = 0, After_t = 1]}_{\beta_0^{OLS} + \beta_3^{OLS}} = \beta_1^{OLS} + \beta_2^{OLS}$
(Cross-sectional Difference After)
We worried about the possibility of some omitted difference between Delaware and other states ($\beta_2^{OLS} \neq 0$)
Solution: Use the pre-reform difference to account for any fixed differences
$\underbrace{E[Leverage_{it} | D_i = 1, After_t = 0]}_{\beta_0^{OLS} + \beta_2^{OLS}} - \underbrace{E[Leverage_{it} | D_i = 0, After_t = 0]}_{\beta_0^{OLS}} = \beta_2^{OLS}$
(Cross-sectional Difference Before)
56/76
Diff-in-Diff Solves Issues with Cross Sectional Approach
Difference in Difference = Difference After − Difference Before
$= (\beta_1^{OLS} + \beta_2^{OLS}) - \beta_2^{OLS} = \beta_1^{OLS}$
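The algebra can be checked on the fifteen rows listed in the data example earlier in this part. That listing elides further rows, so the numbers below are illustrative only and are not the answer to any exercise on the full dataset.

```python
import numpy as np

# (Leverage, D_i, After_t) for the fifteen rows shown in the data example.
rows = [
    (1.2, 1, 0), (3.1, 0, 0), (1.9, 0, 0), (0.9, 1, 0), (1.5, 0, 0),
    (1.1, 0, 0), (1.2, 1, 0), (1.6, 0, 0), (0.5, 0, 0),
    (0.8, 0, 1), (0.9, 1, 1), (1.6, 0, 1), (2.2, 1, 1), (1.4, 0, 1),
    (1.9, 1, 1),
]
lev, D, after = (np.array(col, dtype=float) for col in zip(*rows))

def group_mean(d, a):
    """Average leverage in the group with D_i = d and After_t = a."""
    return lev[(D == d) & (after == a)].mean()

diff_after = group_mean(1, 1) - group_mean(0, 1)    # cross-sectional, after
diff_before = group_mean(1, 0) - group_mean(0, 0)   # cross-sectional, before
did_from_means = diff_after - diff_before

# Same estimator from the regression with the interaction term.
X = np.column_stack([np.ones_like(lev), D * after, D, after])
b0, b1, b2, b3 = np.linalg.lstsq(X, lev, rcond=None)[0]

print(did_from_means, b1)
```

The coefficient on the interaction equals the difference of the two cross-sectional differences, and the other coefficients reproduce the group means as in the four conditional expectations.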
57/76
Diff-in-Diff Solves Issues with Before vs. After
Before vs. After: Compare averages before vs. after within Delaware:
$\underbrace{E[Leverage_{it} | D_i = 1, After_t = 1]}_{\beta_0^{OLS} + \beta_1^{OLS} + \beta_2^{OLS} + \beta_3^{OLS}} - \underbrace{E[Leverage_{it} | D_i = 1, After_t = 0]}_{\beta_0^{OLS} + \beta_2^{OLS}} = \beta_1^{OLS} + \beta_3^{OLS}$
(Difference In Delaware)
We worried about the possibility of some time trend
Solution: Use other states to account for time trends
$\underbrace{E[Leverage_{it} | D_i = 0, After_t = 1]}_{\beta_0^{OLS} + \beta_3^{OLS}} - \underbrace{E[Leverage_{it} | D_i = 0, After_t = 0]}_{\beta_0^{OLS}} = \beta_3^{OLS}$
(Difference Out of Delaware)
58/76
Diff-in-Diff Solves Issues with Before vs. After
Difference in Difference = Difference In Delaware − Difference Out of Delaware
$= (\beta_1^{OLS} + \beta_3^{OLS}) - \beta_3^{OLS} = \beta_1^{OLS}$
59/76
Difference in Difference Matrix
Two ways to interpret the same estimator $\beta_1^{OLS}$ (every entry below is an OLS coefficient):

                         Before        After                 Difference
Delaware (Treatment)     β0 + β2       β0 + β1 + β2 + β3     β1 + β3
Other States (Control)   β0            β0 + β3               β3
Difference               β2            β1 + β2               β1
60/76
Diff-in-Diff Graphically
[Figure: leverage against month (t), before and after, for Treatment (Delaware) and Control (Non-Delaware)]
61/76
Diff-in-Diff Graphically
[Figure: leverage against month (t), before and after, for Treatment (Delaware) and Control (Non-Delaware), built up over several slides to mark $\beta_0^{OLS}$, $\beta_2^{OLS}$, $\beta_3^{OLS}$ and $\beta_1^{OLS}$]
66/76
When Does Diff-in-Diff Identify A Causal Effect
As usual, we need
E[vit|Di,Aftert] = E[vit]
What does this mean intuitively?
Parallel trends assumption: In the absence of any reform the
average change in leverage would have been the same in the treatment and control groups
In other words: trends in both groups are similar
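One way to connect the condition on $v_{it}$ to the parallel trends statement is to difference the four group means of the regression above. The derivation below is a sketch that reuses the slides' symbols; the shorthand $E[v_{it} \mid d, a]$ for $E[v_{it} \mid D_i = d, After_t = a]$ is introduced here for brevity.

```latex
\begin{align*}
\beta_1^{OLS}
 &= \big(E[Leverage_{it} \mid D_i = 1, After_t = 1] - E[Leverage_{it} \mid D_i = 1, After_t = 0]\big) \\
 &\quad - \big(E[Leverage_{it} \mid D_i = 0, After_t = 1] - E[Leverage_{it} \mid D_i = 0, After_t = 0]\big) \\
 &= \beta_1 + \big(E[v_{it} \mid 1, 1] - E[v_{it} \mid 1, 0]\big) - \big(E[v_{it} \mid 0, 1] - E[v_{it} \mid 0, 0]\big)
\end{align*}
```

So $\beta_1^{OLS} = \beta_1$ whenever the average change in $v_{it}$ from before to after is the same in the treatment and control groups, which is what parallel trends asks for.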
67/76
Parallel Trends
[Figure: leverage against month (t), before and after, for Treatment (Delaware) and Control (Non-Delaware), with $\beta_0^{OLS}$, $\beta_1^{OLS}$, $\beta_2^{OLS}$ and $\beta_3^{OLS}$ marked]
68/76
Parallel Trends
Parallel trends does not require that there is no trend in leverage
Just that it is the same between groups
Does not require that the levels be the same in the two groups
What does it look like when the parallel trends assumption fails?
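A small numerical illustration, using made-up group means with no noise. The true effect, the common trend, and the extra trend in the treated group are all hypothetical values chosen for the example.

```python
def did(control_before, control_after, treat_before, treat_after):
    """Difference-in-difference of four group means."""
    return (treat_after - treat_before) - (control_after - control_before)

true_effect = 0.5    # hypothetical causal effect of the reform
common_trend = 0.3   # change both groups would experience anyway

# Case 1: parallel trends hold. Levels differ (1.0 vs 2.0), trends do not.
est_parallel = did(1.0, 1.0 + common_trend,
                   2.0, 2.0 + common_trend + true_effect)

# Case 2: parallel trends fail. The treated group would have grown by an
# extra 0.2 even without the reform.
extra_trend = 0.2
est_nonparallel = did(1.0, 1.0 + common_trend,
                      2.0, 2.0 + common_trend + extra_trend + true_effect)

print(est_parallel, est_nonparallel)
```

In the first case the estimator returns the true effect despite the level difference and the common trend. In the second it returns the true effect plus the extra trend, and nothing in the two-period data distinguishes the two.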
69/76
When Parallel Trends Fails
[Figure: leverage against month (t), before and after, for Treatment (Delaware) and Control (Non-Delaware)]
70/76
When Parallel Trends Fails
[Figure: leverage against month (t), before and after, for Treatment (Delaware) and Control (Non-Delaware), with $\beta_0^{OLS}$, $\beta_1^{OLS}$, $\beta_2^{OLS}$ and $\beta_3^{OLS}$ marked]
72/76
Testing the Parallel Trends Assumption?
It is impossible to truly test
Assumption about what patterns would have been without treatment
However, with data for several periods before the reform, we can provide convincing evidence
Intuition: show that the two groups have been parallel for a long time
Typically, plot the difference in means between treated and control groups
If the difference in means is flat ⇒ parallel trends more likely to hold
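A sketch of the pre-period check on made-up group means. The event-time index, the levels, and the jump of 0.4 at the reform are hypothetical; with real data the two series would be the period-by-period averages of the treated and control groups.

```python
import numpy as np

# Hypothetical event time: negative values are before the reform.
periods = np.arange(-4, 3)

# Hypothetical group means: both groups share a trend of 0.1 per period,
# and the treated group jumps by 0.4 once the reform is in place.
control = 1.0 + 0.1 * periods
treated = 1.5 + 0.1 * periods + 0.4 * (periods >= 0)

gap = treated - control   # difference in means, period by period
pre = periods < 0

# Slope of the gap over the pre-reform periods; near zero means flat.
pre_slope = np.polyfit(periods[pre], gap[pre], 1)[0]

print(gap)
print(pre_slope)
```

A flat pre-reform gap is supporting evidence rather than a test, since the assumption concerns what would have happened after the reform without treatment.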
73/76
General Form of Diff-in-Diff
We are interested in the impact of some treatment on outcome $Y_{it}$
Suppose we have a treated group and a control group
Let $D_i$ be a dummy equal to 1 if i belongs to the treatment group
And suppose we see both groups before and after the treatment occurs
Let $After_t$ be a dummy equal to 1 if time t is after the treatment date
$Y_{it} = \beta_0 + \beta_1 D_i \times After_t + \beta_2 D_i + \beta_3 After_t + v_{it}$
For more precision:
$Y_{it} = \beta_0 + \beta_1 D_i \times After_t + \delta_i + \tau_t + v_{it}$
Where τt and δi are fixed effects for each time period and individual
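A sketch of the version with unit and time fixed effects, estimated with dummy variables on simulated data. The panel dimensions, the true effect of 0.5, and the noise scale are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(3)
N, T = 40, 6
unit = np.repeat(np.arange(N), T)   # unit index i for each observation
time = np.tile(np.arange(T), N)     # period index t for each observation

D = (unit < N // 2).astype(float)        # first half of units are treated
after = (time >= T // 2).astype(float)   # second half of periods are "after"

beta1 = 0.5                          # hypothetical true treatment effect
delta = rng.normal(size=N)[unit]     # unit fixed effects delta_i
tau = rng.normal(size=T)[time]       # time fixed effects tau_t
y = 1.0 + beta1 * D * after + delta + tau + 0.1 * rng.normal(size=N * T)

# Dummies for all but one unit and all but one period.
unit_d = (unit[:, None] == np.arange(1, N)).astype(float)
time_d = (time[:, None] == np.arange(1, T)).astype(float)

X = np.column_stack([np.ones(N * T), D * after, unit_d, time_d])
b1_hat = np.linalg.lstsq(X, y, rcond=None)[0][1]

print(b1_hat)
```

$D_i$ and $After_t$ do not appear separately because they are absorbed by the unit and time dummies. Only the interaction remains identified.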
74/76
Data Exercise
Load the d in d dataset
Perform the following regression
$Leverage_{it} = \beta_0 + \beta_1 D_i \times After_t + \beta_2 D_i + \beta_3 After_t + v_{it}$
Where $D_i = 1$ in Delaware and 0 otherwise
and $After_t = 1$ in 1992
Menti: what is $\hat{\beta}_1^{OLS}$?
If you complete this, estimate:
$Y_{it} = \beta_0 + \beta_1 D_i \times After_t + \delta_i + \tau_t + v_{it}$
75/76
Overview
This class: estimating causal effects with panel data
1. An introduction to panel data
Multiple observations of the same unit over time
2. First difference and fixed effects estimators
Estimating causal effects with fixed omitted variables
3. Difference-in-difference estimators
A more robust method for estimating causal effects
76/76