Goals:
Introducing the data
The Application Notebook builds on the dataset that is kindly
provided in the hdm package. The data was used in Chernozhukov
and Hansen (2004). Their paper investigates the effect of
participation in the employer-sponsored 401(k) retirement savings plan
(p401) on net assets (net_tfa). Since then the
data was used to showcase many new methods. It is not the most
comprehensive dataset with basically ten
covariates/regressors/predictors:
age: age
 
db: defined benefit pension
 
educ: education (in years)
 
fsize: family size
 
hown: home owner
 
inc: income (in US $)
 
male: male
 
marr: married
 
pira: participation in individual retirement account
(IRA)
 
twoearn: two earners
 
However, it is publicly available and the relatively few covariates
ensure that the programs do not run too long.
# Load the packages required for later
library(hdm)
library(tidyverse)
set.seed(1234) # for replicability
options(scipen = 10) # Switch off scientific notation
# Get data
data(pension)
 
 
Handcoding OLS
Run the OLS regression \(net\_tfa = \beta_0
+ \beta_1 age + \beta_2 db + \beta_3 educ + \beta_4 fsize + \beta_5 hown
+ \beta_6 inc + \beta_7 male + \beta_8 marr + \beta_9 pira + \beta_{10}
twoearn + \epsilon\) using the base command:
ols = lm(data = pension,
          net_tfa ~ age + db + educ + fsize + hown + inc + male + marr + pira + twoearn )
summary(ols)
Call:
lm(formula = net_tfa ~ age + db + educ + fsize + hown + inc + 
    male + marr + pira + twoearn, data = pension)
Residuals:
    Min      1Q  Median      3Q     Max 
-509020  -17047   -2203   10232 1436184 
Coefficients:
                Estimate   Std. Error t value         Pr(>|t|)    
(Intercept) -31557.69509   4422.53582  -7.136 0.00000000000103 ***
age            610.20621     60.05623  10.161          < 2e-16 ***
db           -3517.95396   1331.34350  -2.642          0.00824 ** 
educ          -621.90195    228.96898  -2.716          0.00662 ** 
fsize        -1092.95457    456.73577  -2.393          0.01673 *  
hown          1619.33449   1322.02599   1.225          0.22065    
inc              0.96194      0.02989  32.186          < 2e-16 ***
male         -1141.88008   1517.82867  -0.752          0.45188    
marr           611.68490   1820.21834   0.336          0.73684    
pira         29662.86017   1467.19707  20.217          < 2e-16 ***
twoearn     -19112.13608   1579.31262 -12.102          < 2e-16 ***
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
Residual standard error: 55790 on 9904 degrees of freedom
Multiple R-squared:  0.2295,    Adjusted R-squared:  0.2287 
F-statistic:   295 on 10 and 9904 DF,  p-value: < 2.2e-16
Now implement the point estimator manually using the matrix
representation of the parameters \(\boldsymbol{\hat{\beta}} =
\boldsymbol{(X'X)^{-1}X'Y}\):
X = model.matrix(data = pension,
          ~ age + db + educ + fsize + hown + inc + male + marr + pira + twoearn )
Y = pension[,"net_tfa"]
hand_ols = solve(t(X) %*% X) %*% t(X) %*% Y
hand_ols
                      [,1]
(Intercept) -31557.6950949
age            610.2062081
db           -3517.9539554
educ          -621.9019512
fsize        -1092.9545659
hown          1619.3344926
inc              0.9619398
male         -1141.8800757
marr           611.6848981
pira         29662.8601743
twoearn     -19112.1360850
The point estimates look very similar, but let`s check explicitly
whether they are equal:
all.equal(as.numeric(hand_ols),as.numeric(ols$coefficients))
[1] TRUE
 
 
Frisch-Waugh
The Frisch-Waugh
theorem tells us that we can get, e.g., \(\hat{\beta_1}\) also in the following
way
- Run regression \(age = \alpha_0 + \alpha_1
db + \alpha_2 educ + \alpha_3 fsize + \alpha_4 hown + \alpha_5 inc +
\alpha_6 male + \alpha_7 marr + \alpha_8 pira + \alpha_{9} twoearn +
V\)
 
ols_age = lm(data = pension,
          age ~ db + educ + fsize + hown + inc + male + marr + pira + twoearn )
Vhat = ols_age$residuals
- Run regression \(net\_tfa = \gamma_0 +
\gamma_1 db + \gamma_2 educ + \gamma_3 fsize + \gamma_4 hown + \gamma_5
inc + \gamma_6 male + \gamma_7 marr + \gamma_8 pira + \gamma_{9} twoearn
+ U\)
 
ols_net_tfa = lm(data = pension,
          net_tfa ~ db + educ + fsize + hown + inc + male + marr + pira + twoearn )
Uhat = ols_net_tfa$residuals
- Run residual-on-residual regression \(\hat{U} = \tau \hat{V} + \epsilon\)
 
summary(lm(Uhat ~ 0 + Vhat))
Call:
lm(formula = Uhat ~ 0 + Vhat)
Residuals:
    Min      1Q  Median      3Q     Max 
-509020  -17047   -2203   10232 1436184 
Coefficients:
     Estimate Std. Error t value Pr(>|t|)    
Vhat   610.21      60.03   10.17   <2e-16 ***
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
Residual standard error: 55760 on 9914 degrees of freedom
Multiple R-squared:  0.01032,   Adjusted R-squared:  0.01022 
F-statistic: 103.3 on 1 and 9914 DF,  p-value: < 2.2e-16
This looks indeed very similar to the 610.2062081 estimated by the
lm() function, but let’s check it formally:
all.equal(as.numeric(lm(Uhat ~ 0 + Vhat)$coefficients),
          as.numeric(ols$coefficients[2]))
[1] TRUE
\(\Rightarrow\) The standard and the
manual Frisch-Waugh \(\hat{\beta}_1\)
are identical.
(optional): Check also the other parameters \(\hat{\beta}_2\) to \(\hat{\beta}_{10}\).
 
LS0tDQp0aXRsZTogIkJhc2ljczogT0xTIGFuZCBGcmlzY2gtV2F1Z2giDQpzdWJ0aXRsZTogIkFwcGxpY2F0aW9uIG5vdGVib29rIg0KYXV0aG9yOiAiTWljaGFlbCBLbmF1cyINCmRhdGU6ICJgciBmb3JtYXQoU3lzLnRpbWUoKSwgJyVtLyV5JylgIg0Kb3V0cHV0OiANCiAgaHRtbF9ub3RlYm9vazoNCiAgICB0b2M6IHRydWUNCiAgICB0b2NfZmxvYXQ6IHRydWUNCiAgICBjb2RlX2ZvbGRpbmc6IHNob3cNCi0tLQ0KDQoNCkdvYWxzOg0KDQotIEhhbmRjb2RlIE9MUw0KDQotIEhhbmRjb2RlIEZyaXNjaC1XYXVnaCBUaGVvcmVtDQoNCjxicj4NCg0KIyBJbnRyb2R1Y2luZyB0aGUgZGF0YQ0KDQpUaGUgQXBwbGljYXRpb24gTm90ZWJvb2sgYnVpbGRzIG9uIHRoZSBkYXRhc2V0IHRoYXQgaXMga2luZGx5IHByb3ZpZGVkIGluIHRoZSBgaGRtYCBwYWNrYWdlLiBUaGUgZGF0YSB3YXMgdXNlZCBpbiBbQ2hlcm5vemh1a292IGFuZCBIYW5zZW4gKDIwMDQpXShodHRwczovL2RpcmVjdC5taXQuZWR1L3Jlc3QvYXJ0aWNsZS84Ni8zLzczNS81NzU4Ni9UaGUtRWZmZWN0cy1vZi00MDEtSy1QYXJ0aWNpcGF0aW9uLW9uLXRoZS1XZWFsdGgpLiBUaGVpciBwYXBlciBpbnZlc3RpZ2F0ZXMgdGhlIGVmZmVjdCBvZiBwYXJ0aWNpcGF0aW9uIGluIHRoZSBlbXBsb3llci1zcG9uc29yZWQgNDAxKGspIHJldGlyZW1lbnQgc2F2aW5ncyBwbGFuIChgcDQwMWApIG9uIG5ldCBhc3NldHMgKGBuZXRfdGZhYCkuIFNpbmNlIHRoZW4gdGhlIGRhdGEgd2FzIHVzZWQgdG8gc2hvd2Nhc2UgbWFueSBuZXcgbWV0aG9kcy4gSXQgaXMgbm90IHRoZSBtb3N0IGNvbXByZWhlbnNpdmUgZGF0YXNldCB3aXRoIGJhc2ljYWxseSB0ZW4gY292YXJpYXRlcy9yZWdyZXNzb3JzL3ByZWRpY3RvcnM6DQoNCi0gKmFnZSo6IGFnZQ0KDQotICpkYio6IGRlZmluZWQgYmVuZWZpdCBwZW5zaW9uDQoNCi0gKmVkdWMqOiBlZHVjYXRpb24gKGluIHllYXJzKQ0KDQotICpmc2l6ZSo6IGZhbWlseSBzaXplDQoNCi0gKmhvd24qOiBob21lIG93bmVyDQoNCi0gKmluYyo6IGluY29tZSAoaW4gVVMgJCkNCg0KLSAqbWFsZSo6IG1hbGUNCg0KLSAqbWFycio6IG1hcnJpZWQNCg0KLSAqcGlyYSo6IHBhcnRpY2lwYXRpb24gaW4gaW5kaXZpZHVhbCByZXRpcmVtZW50IGFjY291bnQgKElSQSkNCg0KLSAqdHdvZWFybio6IHR3byBlYXJuZXJzDQoNCkhvd2V2ZXIsIGl0IGlzIHB1YmxpY2x5IGF2YWlsYWJsZSBhbmQgdGhlIHJlbGF0aXZlbHkgZmV3IGNvdmFyaWF0ZXMgZW5zdXJlIHRoYXQgdGhlIHByb2dyYW1zIGRvIG5vdCBydW4gdG9vIGxvbmcuDQoNCmBgYHtyLCB3YXJuaW5nID0gRiwgbWVzc2FnZSA9IEZ9DQojIExvYWQgdGhlIHBhY2thZ2VzIHJlcXVpcmVkIGZvciBsYXRlcg0KbGlicmFyeShoZG0pDQpsaWJyYXJ5KHRpZHl2ZXJzZSkNCg0Kc2V0LnNlZWQoMTIzNCkgIyBmb3IgcmVwbGljYWJpbGl0eQ0Kb3B0aW9ucyhzY2lwZW4gPSAxMCkgIyBTd2l0Y2ggb2ZmIHNjaWVudGlmaWMgbm90YXRpb24NCg0KIyBHZXQgZGF0YQ0KZGF0YShwZW5zaW9uKQ0KYGBgDQoNCjxicj4NCjxicj4NCg0KDQojIEhhbmRjb2RpbmcgT0xTDQoNClJ1biB0aGUgT0xTIHJlZ3Jlc3Npb24gJG5ldFxfdGZhID0gXGJldGFfMCArIFxiZXRhXzEgYWdlICsgXGJldGFfMiBkYiArIFxiZXRhXzMgZWR1YyArIFxiZXRhXzQgZnNpemUgKyBcYmV0YV81IGhvd24gKyBcYmV0YV82IGluYyArIFxiZXRhXzcgbWFsZSArIFxiZXRhXzggbWFyciArIFxiZXRhXzkgcGlyYSArIFxiZXRhX3sxMH0gdHdvZWFybiArIFxlcHNpbG9uJCB1c2luZyB0aGUgYmFzZSBjb21tYW5kOg0KDQpgYGB7cn0NCm9scyA9IGxtKGRhdGEgPSBwZW5zaW9uLA0KICAgICAgICAgIG5ldF90ZmEgfiBhZ2UgKyBkYiArIGVkdWMgKyBmc2l6ZSArIGhvd24gKyBpbmMgKyBtYWxlICsgbWFyciArIHBpcmEgKyB0d29lYXJuICkNCnN1bW1hcnkob2xzKQ0KYGBgDQoNCk5vdyBpbXBsZW1lbnQgdGhlIHBvaW50IGVzdGltYXRvciBtYW51YWxseSB1c2luZyB0aGUgbWF0cml4IHJlcHJlc2VudGF0aW9uIG9mIHRoZSBwYXJhbWV0ZXJzICRcYm9sZHN5bWJvbHtcaGF0e1xiZXRhfX0gPSBcYm9sZHN5bWJvbHsoWCdYKV57LTF9WCdZfSQ6DQoNCmBgYHtyfQ0KWCA9IG1vZGVsLm1hdHJpeChkYXRhID0gcGVuc2lvbiwNCiAgICAgICAgICB+IGFnZSArIGRiICsgZWR1YyArIGZzaXplICsgaG93biArIGluYyArIG1hbGUgKyBtYXJyICsgcGlyYSArIHR3b2Vhcm4gKQ0KWSA9IHBlbnNpb25bLCJuZXRfdGZhIl0NCg0KaGFuZF9vbHMgPSBzb2x2ZSh0KFgpICUqJSBYKSAlKiUgdChYKSAlKiUgWQ0KDQpoYW5kX29scw0KYGBgDQoNClRoZSBwb2ludCBlc3RpbWF0ZXMgbG9vayB2ZXJ5IHNpbWlsYXIsIGJ1dCBsZXRgcyBjaGVjayBleHBsaWNpdGx5IHdoZXRoZXIgdGhleSBhcmUgZXF1YWw6DQoNCmBgYHtyfQ0KYWxsLmVxdWFsKGFzLm51bWVyaWMoaGFuZF9vbHMpLGFzLm51bWVyaWMob2xzJGNvZWZmaWNpZW50cykpDQoNCmBgYA0KDQo8YnI+DQo8YnI+DQoNCiMgRnJpc2NoLVdhdWdoDQoNClRoZSBbRnJpc2NoLVdhdWdoIHRoZW9yZW1dKGh0dHBzOi8vZW4ud2lraXBlZGlhLm9yZy93aWtpL0ZyaXNjaCVFMiU4MCU5M1dhdWdoJUUyJTgwJTkzTG92ZWxsX3RoZW9yZW0pIHRlbGxzIHVzIHRoYXQgd2UgY2FuIGdldCwgZS5nLiwgJFxoYXR7XGJldGFfMX0kIGFsc28gaW4gdGhlIGZvbGxvd2luZyB3YXkNCg0KMS4gUnVuIHJlZ3Jlc3Npb24gJGFnZSA9IFxhbHBoYV8wICsgXGFscGhhXzEgZGIgKyBcYWxwaGFfMiBlZHVjICsgXGFscGhhXzMgZnNpemUgKyBcYWxwaGFfNCBob3duICsgXGFscGhhXzUgaW5jICsgXGFscGhhXzYgbWFsZSArIFxhbHBoYV83IG1hcnIgKyBcYWxwaGFfOCBwaXJhICsgXGFscGhhX3s5fSB0d29lYXJuICsgViQNCg0KYGBge3J9DQpvbHNfYWdlID0gbG0oZGF0YSA9IHBlbnNpb24sDQogICAgICAgICAgYWdlIH4gZGIgKyBlZHVjICsgZnNpemUgKyBob3duICsgaW5jICsgbWFsZSArIG1hcnIgKyBwaXJhICsgdHdvZWFybiApDQpWaGF0ID0gb2xzX2FnZSRyZXNpZHVhbHMNCmBgYA0KDQoyLiBSdW4gcmVncmVzc2lvbiAkbmV0XF90ZmEgPSBcZ2FtbWFfMCArIFxnYW1tYV8xIGRiICsgXGdhbW1hXzIgZWR1YyArIFxnYW1tYV8zIGZzaXplICsgXGdhbW1hXzQgaG93biArIFxnYW1tYV81IGluYyArIFxnYW1tYV82IG1hbGUgKyBcZ2FtbWFfNyBtYXJyICsgXGdhbW1hXzggcGlyYSArIFxnYW1tYV97OX0gdHdvZWFybiArIFUkDQoNCmBgYHtyfQ0Kb2xzX25ldF90ZmEgPSBsbShkYXRhID0gcGVuc2lvbiwNCiAgICAgICAgICBuZXRfdGZhIH4gZGIgKyBlZHVjICsgZnNpemUgKyBob3duICsgaW5jICsgbWFsZSArIG1hcnIgKyBwaXJhICsgdHdvZWFybiApDQpVaGF0ID0gb2xzX25ldF90ZmEkcmVzaWR1YWxzDQpgYGANCg0KMy4gUnVuIHJlc2lkdWFsLW9uLXJlc2lkdWFsIHJlZ3Jlc3Npb24gJFxoYXR7VX0gPSBcdGF1IFxoYXR7Vn0gKyBcZXBzaWxvbiQNCg0KYGBge3J9DQpzdW1tYXJ5KGxtKFVoYXQgfiAwICsgVmhhdCkpDQpgYGANCg0KVGhpcyBsb29rcyBpbmRlZWQgdmVyeSBzaW1pbGFyIHRvIHRoZSBgciBvbHMkY29lZmZpY2llbnRzWzJdYCBlc3RpbWF0ZWQgYnkgdGhlIGBsbSgpYCBmdW5jdGlvbiwgYnV0IGxldCdzIGNoZWNrIGl0IGZvcm1hbGx5Og0KDQpgYGB7cn0NCmFsbC5lcXVhbChhcy5udW1lcmljKGxtKFVoYXQgfiAwICsgVmhhdCkkY29lZmZpY2llbnRzKSwNCiAgICAgICAgICBhcy5udW1lcmljKG9scyRjb2VmZmljaWVudHNbMl0pKQ0KYGBgDQoNCiRcUmlnaHRhcnJvdyQgVGhlIHN0YW5kYXJkIGFuZCB0aGUgbWFudWFsIEZyaXNjaC1XYXVnaCAkXGhhdHtcYmV0YX1fMSQgYXJlIGlkZW50aWNhbC4NCg0KPGJyPg0KDQoqKG9wdGlvbmFsKSo6IENoZWNrIGFsc28gdGhlIG90aGVyIHBhcmFtZXRlcnMgJFxoYXR7XGJldGF9XzIkIHRvICRcaGF0e1xiZXRhfV97MTB9JC4gDQoNCjxicj4NCg0KDQoNCg0KDQoNCg==