This notebook replicates the main results of the paper. It documents the underlying code for researchers interested in replicating the analysis with the same data or adapting it to new datasets.
The explanations throughout the notebook are kept short; please see the paper for details regarding the parameters of interest and their Double Machine Learning based estimation.
Chunks that reproduce a table or figure of the paper are indicated in bold.
Click on the “Code” button on the top right of this notebook to “Hide All Code” and see only the results or “Download Rmd” to extract the underlying code.
Running the analyses in this notebook took roughly one week on a SWITCHengines instance with eight cores and 32 GB RAM (the bottlenecks are the kernel regressions and the depth-three policy trees).
The results were obtained under Windows Server 2012 R2 Standard and are known to not necessarily replicate under different operating systems.
The analysis is mostly based on the causalDML R package that was developed in parallel to the paper. The preparation of the “Database_SALMP_MCK2022.Rdata” file that is loaded below can be replicated by requesting the data via FORSbase and running the data preparation file.
## Load relevant packages; in case you have not yet installed causalDML, uncomment the next two lines
# library(devtools)
# install_github(repo="MCKnaus/causalDML")
library(causalDML) # version 0.1.0
library(DiagrammeRsvg) # version 0.1
library(estimatr) # version 0.30.4
library(grf) # version 2.0.2
library(matrixStats) # version 0.61.0
library(plyr) # version 1.8.6
library(policytree) # version 1.1.1
library(tidyverse) # version 1.3.1
## Define and set seed
seed = 1234
set.seed(seed)
## Set the working directory to where the data are stored for replication
setwd("C:/Users/Administrator/switchdrive/Papers/Multi Treat Labor/Paper/Data")
load("Database_SALMP_MCK2022.Rdata")
The DML_aipw command first estimates the cross-fitted nuisance parameters and uses them to calculate the doubly robust scores for APO and ATE, which are then used to estimate the APOs and ATE(T)s.
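For reference, the doubly robust score underlying the APO of program \(w\) takes the standard AIPW form
\[\hat{\Gamma}_{w,i} = \hat{\mu}_w(X_i) + \frac{\mathbf{1}(W_i = w)\,(Y_i - \hat{\mu}_w(X_i))}{\hat{e}_w(X_i)},\]
where \(\hat{\mu}_w\) and \(\hat{e}_w\) denote the cross-fitted outcome and propensity score predictions (see the paper for the exact notation). Averaging these scores yields the APO estimates, and averaging differences of two scores yields the ATE estimates.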
# Create the ML method to be used for nuisance parameter prediction
forest = create_method("forest_grf",args=list(tune.parameters = "all",seed=seed))
# Run the main function that outputs nuisance parameters, APO and ATE
cDML = DML_aipw(y,w,x,ml_w=list(forest),ml_y=list(forest))
The following table and plot show the DML-estimated APOs.
summary(cDML$APO)
APO SE
no program 14.78249 0.05605243
job search 13.76352 0.11530332
vocational 18.04317 0.42324454
computer 18.21504 0.42531991
language 17.31753 0.37224347
plot(cDML$APO,label_w)
The following table shows all pairwise ATEs of the five different treatments:
summary(cDML$ATE)
ATE SE t p
job search - no program -1.01897 0.12568 -8.1079 5.243e-16 ***
vocational - no program 3.26068 0.42627 7.6493 2.049e-14 ***
computer - no program 3.43255 0.42834 8.0137 1.132e-15 ***
language - no program 2.53504 0.37519 6.7566 1.425e-11 ***
vocational - job search 4.27965 0.43819 9.7666 < 2.2e-16 ***
computer - job search 4.45152 0.44018 10.1129 < 2.2e-16 ***
language - job search 3.55401 0.38876 9.1419 < 2.2e-16 ***
computer - vocational 0.17187 0.59980 0.2865 0.7745
language - vocational -0.72564 0.56313 -1.2886 0.1976
language - computer -0.89751 0.56463 -1.5896 0.1119
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
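Note that each ATE point estimate is simply the difference of the corresponding APOs above, e.g. vocational vs. no program: 18.04317 − 14.78249 = 3.26068.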
The following tables show the results for all possible ATETs given the five different treatments. For example, the first line of the first table shows an effect of roughly -1.04 of job search vs. no program for those who were actually observed in no program, or, formally adapting the notation of the paper, \(E[Y(job~search) - Y(no~program) \mid W = no~program]\). As we can condition on all five programs, we get five tables:
# Estimate the ATETs conditional on each of the five programs
atets = matrix(NA,4,2)  # stores the "program vs. no program" ATETs for Figure 2
for (i in 1:5) {
  print(label_w[i])
  APO_atet = APO_dml_atet(y,cDML$APO$m_mat,cDML$APO$w_mat,cDML$APO$e_mat,cDML$APO$cf_mat,treated=i)
  ATET = ATE_dml(APO_atet)
  summary(ATET)
  # For i = 1 (no program) the zero row index makes this a no-op; for i > 1 it
  # stores the effect of program i vs. no program among those observed in program i
  atets[i-1,] = ATET$results[i-1,1:2]
}
[1] "No program"
ATET SE t p
job search - no program -1.037448 0.131322 -7.9000 2.834e-15 ***
vocational - no program 3.312152 0.426211 7.7711 7.896e-15 ***
computer - no program 3.398175 0.430738 7.8892 3.090e-15 ***
language - no program 2.511750 0.380653 6.5985 4.186e-11 ***
vocational - job search 4.349601 0.439621 9.8940 < 2.2e-16 ***
computer - job search 4.435623 0.444012 9.9899 < 2.2e-16 ***
language - job search 3.549198 0.395741 8.9685 < 2.2e-16 ***
computer - vocational 0.086023 0.600992 0.1431 0.8862
language - vocational -0.800403 0.566550 -1.4128 0.1577
language - computer -0.886425 0.569881 -1.5555 0.1198
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
[1] "Job search"
ATET SE t p
job search - no program -0.97789 0.12138 -8.0564 7.993e-16 ***
vocational - no program 3.00435 0.47968 6.2632 3.795e-10 ***
computer - no program 3.57931 0.48300 7.4106 1.273e-13 ***
language - no program 2.74899 0.42117 6.5270 6.759e-11 ***
vocational - job search 3.98224 0.48678 8.1808 2.872e-16 ***
computer - job search 4.55720 0.49005 9.2995 < 2.2e-16 ***
language - job search 3.72688 0.42914 8.6846 < 2.2e-16 ***
computer - vocational 0.57497 0.67270 0.8547 0.3927
language - vocational -0.25536 0.63092 -0.4047 0.6857
language - computer -0.83032 0.63316 -1.3114 0.1897
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
[1] "Vocational"
ATET SE t p
job search - no program -1.42027 0.15621 -9.0918 < 2.2e-16 ***
vocational - no program 2.79143 0.39170 7.1264 1.041e-12 ***
computer - no program 2.64829 0.46650 5.6769 1.377e-08 ***
language - no program 2.95231 0.40138 7.3554 1.927e-13 ***
vocational - job search 4.21170 0.42335 9.9485 < 2.2e-16 ***
computer - job search 4.06856 0.48615 8.3689 < 2.2e-16 ***
language - job search 4.37257 0.42634 10.2562 < 2.2e-16 ***
computer - vocational -0.14314 0.58648 -0.2441 0.8072
language - vocational 0.16088 0.54246 0.2966 0.7668
language - computer 0.30402 0.59065 0.5147 0.6068
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
[1] "Computer"
ATET SE t p
job search - no program -1.23291 0.15704 -7.8509 4.195e-15 ***
vocational - no program 3.08937 0.47603 6.4899 8.655e-11 ***
computer - no program 3.47345 0.39980 8.6879 < 2.2e-16 ***
language - no program 3.32714 0.48171 6.9070 4.997e-12 ***
vocational - job search 4.32228 0.49537 8.7253 < 2.2e-16 ***
computer - job search 4.70636 0.42929 10.9632 < 2.2e-16 ***
language - job search 4.56005 0.50249 9.0749 < 2.2e-16 ***
computer - vocational 0.38408 0.59403 0.6466 0.5179
language - vocational 0.23778 0.65534 0.3628 0.7167
language - computer -0.14631 0.59892 -0.2443 0.8070
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
[1] "Language"
ATET SE t p
job search - no program -0.552369 0.193220 -2.8588 0.0042544 **
vocational - no program 3.966546 0.838263 4.7319 2.230e-06 ***
computer - no program 3.952364 0.712957 5.5436 2.975e-08 ***
language - no program 1.083502 0.292444 3.7050 0.0002116 ***
vocational - job search 4.518916 0.849571 5.3191 1.047e-07 ***
computer - job search 4.504733 0.725997 6.2049 5.508e-10 ***
language - job search 1.635872 0.337388 4.8486 1.246e-06 ***
computer - vocational -0.014182 1.076168 -0.0132 0.9894853
language - vocational -2.883044 0.881037 -3.2723 0.0010672 **
language - computer -2.868862 0.761917 -3.7653 0.0001665 ***
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
The numbers above are condensed into **Figure 2** of the paper:
# Plot ATE and ATET together
ates = cDML$ATE$results[1:4,1:2]
df = data.frame(Estimand = c(rep("ATE",4),rep("ATET",4)),
                program = factor(rep(label_w[-1],2),label_w[-1]),
                effect = c(ates[,1],atets[,1]))
df$low = df$effect - 1.96 * c(ates[,2],atets[,2])
df$up = df$effect + 1.96 * c(ates[,2],atets[,2])
ggplot(data = df,mapping = aes(x = program, y = effect, ymin = low,
                               ymax = up, group = Estimand, linetype = Estimand)) +
  geom_point(size=2,position = position_dodge(width = 0.4)) +
  geom_errorbar(width=0.2,position = position_dodge(width = 0.4)) +
  theme_bw() + theme(axis.text.x = element_text(angle = 45, hjust = 1),axis.title.x=element_blank()) +
  ylab("Average treatment effects") + geom_hline(yintercept = 0) +
  theme(text=element_text(family="serif",size = 14, colour="black"))
The output of the DML_aipw command contains an APO_dml object and an ATE_dml object. They carry a gamma and a delta matrix storing the doubly robust APO and ATE scores, respectively. The delta matrix of the ATE_dml object can be used to conveniently estimate heterogeneous effects.
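As a quick illustration (a sketch using the objects created above), the mean of a column of the delta matrix recovers the corresponding ATE point estimate, and the standard error of the mean recovers its standard error:
# Mean of the doubly robust score recovers the ATE point estimate ...
mean(cDML$ATE$delta[,1]) # ~ -1.019 (job search - no program)
# ... and the standard error of the mean recovers its standard error
sd(cDML$ATE$delta[,1]) / sqrt(length(y)) # ~ 0.126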
Group average treatment effects (GATEs) represent the average effects in different subgroups. Such subgroup analyses are an integral part of most program evaluations and usually require rerunning the preferred method in subsamples. Here, instead, the doubly robust ATE score is used as pseudo-outcome in a standard OLS regression with a group indicator as explanatory variable, which estimates the GATEs.
The results shown in the following are very similar to those presented in Table 4 of the paper.
**Table 4 Panel A:**
# GATEs female
for (i in 1:4) {
  cat(paste("\n\n",label_w[i+1]),"\n")
  temp_ols = lm_robust(cDML$ATE$delta[,i] ~ z_blp$female)
  print(summary(temp_ols))
}
Job search
Call:
lm_robust(formula = cDML$ATE$delta[, i] ~ z_blp$female)
Standard error type: HC2
Coefficients:
Estimate Std. Error t value Pr(>|t|) CI Lower CI Upper DF
(Intercept) -1.2852 0.166 -7.743 9.863e-15 -1.6106 -0.9599 62495
z_blp$female 0.6038 0.254 2.378 1.742e-02 0.1061 1.1016 62495
Multiple R-squared: 9.106e-05 , Adjusted R-squared: 7.506e-05
F-statistic: 5.654 on 1 and 62495 DF, p-value: 0.01742
Vocational
Call:
lm_robust(formula = cDML$ATE$delta[, i] ~ z_blp$female)
Standard error type: HC2
Coefficients:
Estimate Std. Error t value Pr(>|t|) CI Lower CI Upper DF
(Intercept) 3.824 0.5512 6.937 4.043e-12 2.743 4.9039 62495
z_blp$female -1.277 0.8660 -1.474 1.404e-01 -2.974 0.4208 62495
Multiple R-squared: 3.538e-05 , Adjusted R-squared: 1.938e-05
F-statistic: 2.173 on 1 and 62495 DF, p-value: 0.1404
Computer
Call:
lm_robust(formula = cDML$ATE$delta[, i] ~ z_blp$female)
Standard error type: HC2
Coefficients:
Estimate Std. Error t value Pr(>|t|) CI Lower CI Upper DF
(Intercept) 2.334 0.6004 3.888 0.0001012 1.1575 3.511 62495
z_blp$female 2.491 0.8512 2.926 0.0034308 0.8226 4.159 62495
Multiple R-squared: 0.0001334 , Adjusted R-squared: 0.0001174
F-statistic: 8.564 on 1 and 62495 DF, p-value: 0.003431
Language
Call:
lm_robust(formula = cDML$ATE$delta[, i] ~ z_blp$female)
Standard error type: HC2
Coefficients:
Estimate Std. Error t value Pr(>|t|) CI Lower CI Upper DF
(Intercept) 3.402 0.4564 7.455 9.098e-14 2.508 4.297 62495
z_blp$female -1.967 0.7729 -2.545 1.094e-02 -3.482 -0.452 62495
Multiple R-squared: 0.0001084 , Adjusted R-squared: 9.241e-05
F-statistic: 6.476 on 1 and 62495 DF, p-value: 0.01094
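The coefficients have a simple group-mean interpretation: the intercept is the GATE of the reference group, and the sum of intercept and dummy coefficient is the GATE of the indicated group. A minimal cross-check (a sketch assuming z_blp$female is a 0/1 indicator):
# GATE of job search vs. no program for women equals the group mean of the score
mean(cDML$ATE$delta[z_blp$female == 1, 1]) # ~ -1.2852 + 0.6038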
**Table 4 Panel B:**
# GATEs foreigner
for (i in 1:4) {
  cat(paste("\n\n",label_w[i+1]),":\n")
  temp_ols = lm_robust(cDML$ATE$delta[,i] ~ z_blp$foreigner)
  print(summary(temp_ols))
}
Job search :
Call:
lm_robust(formula = cDML$ATE$delta[, i] ~ z_blp$foreigner)
Standard error type: HC2
Coefficients:
Estimate Std. Error t value Pr(>|t|) CI Lower CI Upper DF
(Intercept) -1.2720 0.1598 -7.960 1.751e-15 -1.5852 -0.9588 62495
z_blp$foreigner 0.6989 0.2583 2.705 6.823e-03 0.1926 1.2052 62495
Multiple R-squared: 0.0001143 , Adjusted R-squared: 9.83e-05
F-statistic: 7.32 on 1 and 62495 DF, p-value: 0.006823
Vocational :
Call:
lm_robust(formula = cDML$ATE$delta[, i] ~ z_blp$foreigner)
Standard error type: HC2
Coefficients:
Estimate Std. Error t value Pr(>|t|) CI Lower CI Upper DF
(Intercept) 2.476 0.5257 4.709 2.491e-06 1.4454 3.506 62495
z_blp$foreigner 2.168 0.8969 2.417 1.565e-02 0.4098 3.926 62495
Multiple R-squared: 9.557e-05 , Adjusted R-squared: 7.957e-05
F-statistic: 5.842 on 1 and 62495 DF, p-value: 0.01565
Computer :
Call:
lm_robust(formula = cDML$ATE$delta[, i] ~ z_blp$foreigner)
Standard error type: HC2
Coefficients:
Estimate Std. Error t value Pr(>|t|) CI Lower CI Upper DF
(Intercept) 3.7508 0.5042 7.4392 1.026e-13 2.763 4.7390 62495
z_blp$foreigner -0.8789 0.9299 -0.9452 3.446e-01 -2.701 0.9436 62495
Multiple R-squared: 1.556e-05 , Adjusted R-squared: -4.403e-07
F-statistic: 0.8934 on 1 and 62495 DF, p-value: 0.3446
Language :
Call:
lm_robust(formula = cDML$ATE$delta[, i] ~ z_blp$foreigner)
Standard error type: HC2
Coefficients:
Estimate Std. Error t value Pr(>|t|) CI Lower CI Upper DF
(Intercept) 3.564 0.5157 6.910 4.881e-12 2.553 4.574 62495
z_blp$foreigner -2.841 0.7169 -3.963 7.424e-05 -4.246 -1.436 62495
Multiple R-squared: 0.0002119 , Adjusted R-squared: 0.0001959
F-statistic: 15.7 on 1 and 62495 DF, p-value: 7.424e-05
**Table 4 Panel C:**
# GATEs employability
for (i in 1:4) {
  cat(paste("\n\n",label_w[i+1]),":\n")
  temp_ols = lm_robust(cDML$ATE$delta[,i] ~ z_blp$employability)
  print(summary(temp_ols))
}
Job search :
Call:
lm_robust(formula = cDML$ATE$delta[, i] ~ z_blp$employability)
Standard error type: HC2
Coefficients:
Estimate Std. Error t value Pr(>|t|) CI Lower CI Upper DF
(Intercept) -0.1791 0.3303 -0.5422 0.587686 -0.8264 0.4683 62494
z_blp$employability2 -0.9344 0.3609 -2.5889 0.009631 -1.6419 -0.2270 62494
z_blp$employability3 -1.5043 0.4960 -3.0327 0.002425 -2.4765 -0.5321 62494
Multiple R-squared: 0.0001624 , Adjusted R-squared: 0.0001304
F-statistic: 5.035 on 2 and 62494 DF, p-value: 0.006506
Vocational :
Call:
lm_robust(formula = cDML$ATE$delta[, i] ~ z_blp$employability)
Standard error type: HC2
Coefficients:
Estimate Std. Error t value Pr(>|t|) CI Lower CI Upper DF
(Intercept) 5.479 1.040 5.268 1.380e-07 3.441 7.5176 62494
z_blp$employability2 -2.412 1.157 -2.084 3.714e-02 -4.680 -0.1437 62494
z_blp$employability3 -4.420 1.517 -2.914 3.575e-03 -7.394 -1.4467 62494
Multiple R-squared: 0.0001116 , Adjusted R-squared: 7.957e-05
F-statistic: 4.312 on 2 and 62494 DF, p-value: 0.01341
Computer :
Call:
lm_robust(formula = cDML$ATE$delta[, i] ~ z_blp$employability)
Standard error type: HC2
Coefficients:
Estimate Std. Error t value Pr(>|t|) CI Lower CI Upper DF
(Intercept) 5.763 1.097 5.254 1.497e-07 3.613 7.9127 62494
z_blp$employability2 -2.666 1.205 -2.213 2.689e-02 -5.027 -0.3050 62494
z_blp$employability3 -3.590 1.692 -2.122 3.385e-02 -6.906 -0.2739 62494
Multiple R-squared: 9.56e-05 , Adjusted R-squared: 6.36e-05
F-statistic: 2.959 on 2 and 62494 DF, p-value: 0.05186
Language :
Call:
lm_robust(formula = cDML$ATE$delta[, i] ~ z_blp$employability)
Standard error type: HC2
Coefficients:
Estimate Std. Error t value Pr(>|t|) CI Lower CI Upper DF
(Intercept) 2.6120 0.8533 3.0612 0.002205 0.9396 4.284 62494
z_blp$employability2 -0.1781 0.9612 -0.1853 0.853023 -2.0621 1.706 62494
z_blp$employability3 0.5936 1.4845 0.3999 0.689250 -2.3160 3.503 62494
Multiple R-squared: 5.801e-06 , Adjusted R-squared: -2.62e-05
F-statistic: 0.1824 on 2 and 62494 DF, p-value: 0.8333
We can also model effect heterogeneity in a multivariate OLS. Again, the ATE score is reused as pseudo-outcome, now in a plain-vanilla OLS regression with heteroscedasticity-robust standard errors. This replicates the results of **Table 5**:
## BLP CATEs
for (i in 1:4) {
  cat(paste("\n\n",label_w[i+1]),":\n")
  temp_df = data.frame(y = cDML$ATE$delta[,i])
  temp_df = cbind(temp_df,z_blp)
  temp_ols = lm_robust(as.formula("y ~ ."),data = temp_df)
  print(summary(temp_ols))
}
Job search :
Call:
lm_robust(formula = as.formula("y ~ ."), data = temp_df)
Standard error type: HC2
Coefficients:
Estimate Std. Error t value Pr(>|t|) CI Lower CI Upper DF
(Intercept) -0.47643 0.70417 -0.6766 4.987e-01 -1.856597 0.90375 62490
female 0.25428 0.26765 0.9501 3.421e-01 -0.270306 0.77887 62490
age 0.02325 0.01491 1.5595 1.189e-01 -0.005972 0.05248 62490
foreigner 0.49739 0.26543 1.8739 6.095e-02 -0.022865 1.01764 62490
employability2 -0.65188 0.37035 -1.7602 7.838e-02 -1.377768 0.07400 62490
employability3 -1.03371 0.51072 -2.0240 4.297e-02 -2.034709 -0.03270 62490
past_incomein10000 -0.25548 0.06438 -3.9683 7.247e-05 -0.381670 -0.12930 62490
Multiple R-squared: 0.0006031 , Adjusted R-squared: 0.0005072
F-statistic: 6.953 on 6 and 62490 DF, p-value: 2.102e-07
Vocational :
Call:
lm_robust(formula = as.formula("y ~ ."), data = temp_df)
Standard error type: HC2
Coefficients:
Estimate Std. Error t value Pr(>|t|) CI Lower CI Upper DF
(Intercept) 4.46034 2.39936 1.859 0.063036 -0.242417 9.16310 62490
female -2.11727 0.92022 -2.301 0.021404 -3.920898 -0.31364 62490
age 0.09153 0.04926 1.858 0.063145 -0.005014 0.18808 62490
foreigner 1.60120 0.90536 1.769 0.076968 -0.173300 3.37570 62490
employability2 -1.63636 1.17868 -1.388 0.165050 -3.946577 0.67385 62490
employability3 -3.12839 1.54849 -2.020 0.043358 -6.163437 -0.09334 62490
past_incomein10000 -0.62267 0.22785 -2.733 0.006281 -1.069249 -0.17609 62490
Multiple R-squared: 0.0003749 , Adjusted R-squared: 0.0002789
F-statistic: 4.122 on 6 and 62490 DF, p-value: 0.0003831
Computer :
Call:
lm_robust(formula = as.formula("y ~ ."), data = temp_df)
Standard error type: HC2
Coefficients:
Estimate Std. Error t value Pr(>|t|) CI Lower CI Upper DF
(Intercept) 4.93121 2.36503 2.085 0.03707 0.29576 9.56667 62490
female 1.89452 0.89987 2.105 0.03527 0.13077 3.65827 62490
age 0.05034 0.04903 1.027 0.30456 -0.04576 0.14644 62490
foreigner -1.20122 0.94917 -1.266 0.20568 -3.06158 0.65915 62490
employability2 -2.36445 1.21962 -1.939 0.05255 -4.75491 0.02601 62490
employability3 -3.15091 1.71121 -1.841 0.06558 -6.50488 0.20305 62490
past_incomein10000 -0.38986 0.18630 -2.093 0.03638 -0.75501 -0.02471 62490
Multiple R-squared: 0.0003031 , Adjusted R-squared: 0.0002071
F-statistic: 3.347 on 6 and 62490 DF, p-value: 0.002678
Language :
Call:
lm_robust(formula = as.formula("y ~ ."), data = temp_df)
Standard error type: HC2
Coefficients:
Estimate Std. Error t value Pr(>|t|) CI Lower CI Upper DF
(Intercept) 6.06589 2.09297 2.8982 0.0037542 1.9637 10.16812 62490
female -1.60872 0.80036 -2.0100 0.0444346 -3.1774 -0.04002 62490
age -0.06923 0.04522 -1.5311 0.1257595 -0.1579 0.01940 62490
foreigner -2.77057 0.73286 -3.7805 0.0001567 -4.2070 -1.33417 62490
employability2 -0.77969 0.98917 -0.7882 0.4305667 -2.7185 1.15909 62490
employability3 -0.46895 1.52730 -0.3070 0.7588092 -3.4625 2.52457 62490
past_incomein10000 0.31291 0.17909 1.7473 0.0805960 -0.0381 0.66392 62490
Multiple R-squared: 0.0004017 , Adjusted R-squared: 0.0003058
F-statistic: 5.219 on 6 and 62490 DF, p-value: 2.215e-05
rm(temp_df,temp_ols)
Next, we reuse the ATE score as pseudo-outcome in a one-dimensional kernel regression. The causalDML package implements this based on the np package.
## Kernel regression CATEs
list_kr_cates_inc = vector("list",4)
for (i in 1:4) {
  list_kr_cates_inc[[i]] = kr_cate(cDML$ATE$delta[,i],x[,"past_income"])
}
The figures show the results in the following order: job search (**Figure 3 (a)**), vocational (**Figure 3 (c)**), computer (**Figure 3 (e)**) and language (**Figure 3 (g)**):
# Plot kernel regression CATEs
for (i in 1:4) {
  print(plot(list_kr_cates_inc[[i]],z_label="Past income",yrange=c(-5,10)))
}
Next, we reuse the ATE score as pseudo-outcome in a one-dimensional series regression. The causalDML package implements this based on the crs package.
## Series regression CATEs
list_sr_cates_inc = vector("list",4)
for (i in 1:4) {
  list_sr_cates_inc[[i]] = spline_cate(cDML$ATE$delta[,i],x[,"past_income"])
}
The figures show the results in the following order: job search (**Figure 3 (b)**), vocational (**Figure 3 (d)**), computer (**Figure 3 (f)**) and language (**Figure 3 (h)**):
# Plot series regression CATEs
for (i in 1:4) {
  print(plot(list_sr_cates_inc[[i]],z_label="Past income",yrange=c(-5,10)))
}
We further predict the individualized average treatment effects (IATEs) using the DR- and the NDR-learner. As in the paper, we focus on the out-of-sample variants described in Algorithms 1 and 2 of the paper, which are implemented in the ndr_learner function of the causalDML package.
## (N)DR-learner
ndr = ndr_learner(y,w,x,ml_w = list(forest),ml_y = list(forest),ml_tau = list(forest),compare_all = FALSE)
This produces **Figure 5**:
# Plot the results
df_box = NULL
for (i in 1:4) {
  df = data.frame("DRL" = ndr$cates[i,,1], "NDRL" = ndr$cates[i,,2])
  df = gather(df)
  df = cbind(label_w[i+1],df)
  colnames(df)[1] = "label"
  df_box = rbind(df_box,df)
}
ggplot(data=df_box) + geom_boxplot(aes(x=factor(label,label_w[-1]),y=value,fill=key)) +
  theme_bw() + theme(axis.title.x=element_blank(),legend.title = element_blank()) +
  ylab("Individualized average treatment effect") + geom_hline(yintercept = 0) +
  geom_hline(yintercept = -31,linetype="dashed") + geom_hline(yintercept = 31,linetype="dashed") +
  theme(text=element_text(family="serif",size = 16, colour="black"),axis.text.x = element_text(angle = 45, hjust = 1)) +
  scale_fill_grey(start = 0.9,end=0.4)
This replicates **Table 6**:
# Classification analysis
diff = matrix(NA,ncol(x),4)
rownames(diff) = label_x
colnames(diff) = label_w[-1]
for (i in 1:4) {
  # Difference in standardized covariate means between the highest and the
  # lowest group of predicted NDR-learner IATEs
  clan_i = clan(ndr$cates[i,,2],scale(x))
  diff[,i] = clan_i[,5][-1] - clan_i[,1][-1]
}
# Show the seven covariates with the largest absolute differences
round(diff[order(rowMaxs(abs(diff)),decreasing=T),][1:7,],digits=2)
Job search Vocational Computer Language
Past income -1.32 -0.84 -1.17 1.01
Previous job: unskilled worker 1.02 0.68 0.34 -1.24
Mother tongue other than German, French, Italian 0.69 0.68 0.00 -1.17
Qualification: some degree -0.88 -0.65 -0.41 1.15
Swiss citizen -0.66 -0.60 0.12 1.11
Fraction of months employed last 2 years -1.06 -0.37 -0.47 0.30
Qualification: unskilled 0.81 0.41 0.32 -1.02
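The clan function is part of the causalDML package; a rough manual analogue looks as follows (an illustrative sketch assuming grouping by quintiles of the predicted effects, which may differ from clan's exact grouping):
# Manual analogue for job search: difference in standardized covariate means
# between the top and the bottom quintile of predicted NDR-learner IATEs
q = cut(ndr$cates[1,,2], quantile(ndr$cates[1,,2], seq(0, 1, 0.2)),
        include.lowest = TRUE, labels = FALSE)
colMeans(scale(x)[q == 5, , drop = FALSE]) - colMeans(scale(x)[q == 1, , drop = FALSE])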
Next, the gamma matrix of the APO_dml object can be used as input for the policytree R package.
## Estimate optimal assignment rules
# Depth one
pt_low1 = policy_tree(x_pt_low,cDML$APO$gamma,depth = 1)
pt_high1 = policy_tree(x_pt_high,cDML$APO$gamma,depth = 1)
# Depth two
pt_low2 = policy_tree(x_pt_low,cDML$APO$gamma,depth = 2)
pt_high2 = policy_tree(x_pt_high,cDML$APO$gamma,depth = 2)
For the depth three tree, we round past income to CHF 1000 to speed up computation:
# Depth three
# Round the past income variable to the nearest 1000 such that the exhaustive tree search runs in reasonable time
x_pt_low_round = x_pt_low
x_pt_low_round[,5] = round_any(x_pt_low[,5],1000)
pt_low3 = policy_tree(x_pt_low_round,cDML$APO$gamma,depth = 3)
x_pt_high_round = x_pt_high
x_pt_high_round[,3] = round_any(x_pt_high[,3],1000)
pt_high3 = policy_tree(x_pt_high_round,cDML$APO$gamma,depth = 3)
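Coarsening the splitting variable reduces the number of candidate split points that the exhaustive tree search has to evaluate. For reference, round_any from plyr with its default rounding function is equivalent to the base R expression below:
# Base R equivalent of plyr::round_any with the default rounding function:
# round_any(v, 1000) == round(v / 1000) * 1000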
This replicates the four subgraphs of **Figure 5**:
## Plot the estimated optimal assignment rules
plot(pt_low1,label_w)
plot(pt_high1,label_w)
plot(pt_low2,label_w)
plot(pt_high2,label_w)
This replicates Panel A of **Table 7**. The column headers are the treatment indices (3 = vocational, 4 = computer, 5 = language) and the entries are the shares of individuals in percent that the respective tree assigns to each program:
# Panel A
table(predict(pt_low1,newdata=x_pt_low)) / length(y) * 100
3 4
55.90668 44.09332
table(predict(pt_low2,newdata=x_pt_low)) / length(y) * 100
3 4 5
31.88793 42.99726 25.11481
table(predict(pt_low3,newdata=x_pt_low)) / length(y) * 100
3 4 5
46.61184 36.59696 16.79121
table(predict(pt_high1,newdata=x_pt_high)) / length(y) * 100
4 5
53.50817 46.49183
table(predict(pt_high2,newdata=x_pt_high)) / length(y) * 100
3 4 5
23.00270 36.58896 40.40834
table(predict(pt_high3,newdata=x_pt_high)) / length(y) * 100
3 4 5
42.20363 30.35346 27.44292
Finally, we apply a short program to cross-validate the policy trees.
## Cross-validate trees
# Short program implementing the cross-validation of trees
cv_policy_tree = function(APO_dml,x,...) {
  nfolds = ncol(APO_dml$cf_mat)
  folds = APO_dml$cf_mat
  list_trees = vector("list",nfolds)
  cv_policy = matrix(NA,nrow(APO_dml$cf_mat),nfolds)
  cv_gamma = rep(0,nrow(APO_dml$cf_mat))
  for (i in 1:nfolds) {
    # Grow the tree on the observations outside of fold i
    list_trees[[i]] = policy_tree(x[folds[,i]==0,],APO_dml$gamma[folds[,i]==0,],...)
    cv_policy[,i] = predict(list_trees[[i]],x)
    # Evaluate the tree with the doubly robust scores of the held-out fold i
    for (j in 1:ncol(APO_dml$gamma)) {
      cv_gamma[folds[,i]==1] = cv_gamma[folds[,i]==1] +
        (cv_policy[folds[,i]==1,i] == j) * APO_dml$gamma[folds[,i]==1,j]
    }
  }
  list("trees"=list_trees,"policy_mat"=cv_policy,"gamma"=cv_gamma)
}
cv_pt_low1 = cv_policy_tree(cDML$APO,x_pt_low,depth = 1)
cv_pt_high1 = cv_policy_tree(cDML$APO,x_pt_high,depth = 1)
cv_pt_low2 = cv_policy_tree(cDML$APO,x_pt_low,depth = 2)
cv_pt_high2 = cv_policy_tree(cDML$APO,x_pt_high,depth = 2)
cv_pt_low3 = cv_policy_tree(cDML$APO,x_pt_low_round,depth = 3)
cv_pt_high3 = cv_policy_tree(cDML$APO,x_pt_high_round,depth = 3)
This replicates the results shown in Panel B of **Table 7**. For each tree, the five reported intercepts compare the value of the cross-validated assignment rule to the APOs of assigning everyone to a single program, in the order no program, job search, vocational, computer and language:
# Panel B
cat("\n Depth 1 & 5 variables \n")
Depth 1 & 5 variables
cv_resultsl1 = summary(lm(cv_pt_low1$gamma - cDML$APO$gamma ~ 1))
for (i in 1:5) print(cv_resultsl1[[i]]$coefficients)
Estimate Std. Error t value Pr(>|t|)
(Intercept) 3.748397 0.4048685 9.258308 2.140316e-20
Estimate Std. Error t value Pr(>|t|)
(Intercept) 4.767369 0.4174018 11.42153 3.497432e-30
Estimate Std. Error t value Pr(>|t|)
(Intercept) 0.4877196 0.4369752 1.116127 0.2643722
Estimate Std. Error t value Pr(>|t|)
(Intercept) 0.315847 0.4753149 0.6645005 0.5063725
Estimate Std. Error t value Pr(>|t|)
(Intercept) 1.21336 0.4920543 2.465906 0.01366937
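Since the intercept of a constant-only regression is just a sample mean, the first entry above can be cross-checked directly (a quick sketch using the objects created above):
# Cross-validated policy value minus the "no program" APO, as a mean difference
mean(cv_pt_low1$gamma - cDML$APO$gamma[,1]) # ~ 3.75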
cat("\n Depth 2 & 5 variables \n")
Depth 2 & 5 variables
cv_resultsl2 = summary(lm(cv_pt_low2$gamma - cDML$APO$gamma ~ 1))
for (i in 1:5) print(cv_resultsl2[[i]]$coefficients)
Estimate Std. Error t value Pr(>|t|)
(Intercept) 3.988737 0.4075943 9.786047 1.341782e-22
Estimate Std. Error t value Pr(>|t|)
(Intercept) 5.007709 0.4200489 11.92173 9.897219e-33
Estimate Std. Error t value Pr(>|t|)
(Intercept) 0.728059 0.4764307 1.528153 0.1264796
Estimate Std. Error t value Pr(>|t|)
(Intercept) 0.5561864 0.4639521 1.198801 0.2306098
Estimate Std. Error t value Pr(>|t|)
(Intercept) 1.453699 0.4680231 3.106042 0.001896946
cat("\n Depth 3 & 5 variables \n")
Depth 3 & 5 variables
cv_resultsl3 = summary(lm(cv_pt_low3$gamma - cDML$APO$gamma ~ 1))
for (i in 1:5) print(cv_resultsl3[[i]]$coefficients)
Estimate Std. Error t value Pr(>|t|)
(Intercept) 3.499932 0.4139583 8.454792 2.855177e-17
Estimate Std. Error t value Pr(>|t|)
(Intercept) 4.518903 0.4262032 10.6027 3.047711e-26
Estimate Std. Error t value Pr(>|t|)
(Intercept) 0.2392538 0.4526634 0.5285467 0.5971218
Estimate Std. Error t value Pr(>|t|)
(Intercept) 0.0673811 0.4899702 0.1375208 0.8906196
Estimate Std. Error t value Pr(>|t|)
(Intercept) 0.9648941 0.4706919 2.049948 0.04037364
cat("\n Depth 1 & 16 variables \n")
Depth 1 & 16 variables
cv_resultsh1 = summary(lm(cv_pt_high1$gamma - cDML$APO$gamma ~ 1))
for (i in 1:5) print(cv_resultsh1[[i]]$coefficients)
Estimate Std. Error t value Pr(>|t|)
(Intercept) 3.720499 0.4104394 9.064673 1.284803e-19
Estimate Std. Error t value Pr(>|t|)
(Intercept) 4.739471 0.422858 11.20819 3.963171e-29
Estimate Std. Error t value Pr(>|t|)
(Intercept) 0.4598215 0.5189743 0.8860198 0.3756103
Estimate Std. Error t value Pr(>|t|)
(Intercept) 0.2879488 0.4803275 0.5994844 0.5488521
Estimate Std. Error t value Pr(>|t|)
(Intercept) 1.185462 0.4043294 2.931921 0.003369945
cat("\n Depth 2 & 16 variables \n")
Depth 2 & 16 variables
cv_resultsh2 = summary(lm(cv_pt_high2$gamma - cDML$APO$gamma ~ 1))
for (i in 1:5) print(cv_resultsh2[[i]]$coefficients)
Estimate Std. Error t value Pr(>|t|)
(Intercept) 3.632815 0.426687 8.514005 1.716914e-17
Estimate Std. Error t value Pr(>|t|)
(Intercept) 4.651787 0.4386062 10.60584 2.947185e-26
Estimate Std. Error t value Pr(>|t|)
(Intercept) 0.3721376 0.4706731 0.7906499 0.4291513
Estimate Std. Error t value Pr(>|t|)
(Intercept) 0.2002649 0.5256943 0.3809532 0.7032393
Estimate Std. Error t value Pr(>|t|)
(Intercept) 1.097778 0.4234822 2.592265 0.009536858
cat("\n Depth 3 & 16 variables \n")
Depth 3 & 16 variables
cv_resultsh3 = summary(lm(cv_pt_high3$gamma - cDML$APO$gamma ~ 1))
for (i in 1:5) print(cv_resultsh3[[i]]$coefficients)
Estimate Std. Error t value Pr(>|t|)
(Intercept) 3.942688 0.4226603 9.328266 1.110005e-20
Estimate Std. Error t value Pr(>|t|)
(Intercept) 4.961659 0.4347275 11.41326 3.845809e-30
Estimate Std. Error t value Pr(>|t|)
(Intercept) 0.6820098 0.4674601 1.458969 0.1445787
Estimate Std. Error t value Pr(>|t|)
(Intercept) 0.5101371 0.4932374 1.034263 0.3010173
Estimate Std. Error t value Pr(>|t|)
(Intercept) 1.40765 0.4603621 3.057702 0.002231354