Goal:
- See regression trees and random forests in action
Introducing the data
The Application Notebook builds on the dataset that is kindly
provided in the hdm
package. The data was used in Chernozhukov
and Hansen (2004). Their paper investigates the effect of
participation in the employer-sponsored 401(k) retirement savings plan
(p401
) on net assets (net_tfa
). Since then the
data was used to showcase many new methods. It is not the most
comprehensive dataset with basically ten
covariates/regressors/predictors:
age: age
db: defined benefit pension
educ: education (in years)
fsize: family size
hown: home owner
inc: income (in US $)
male: male
marr: married
pira: participation in individual retirement account
(IRA)
twoearn: two earners
However, it is publicly available and the relatively few covariates
ensure that the programs do not run too long.
We consider wealth in $1000 for better representation of the
results.
# Load the packages required for later
if (!require("hdm")) install.packages("hdm", dependencies = TRUE); library(hdm)
if (!require("tidyverse")) install.packages("tidyverse", dependencies = TRUE); library(tidyverse)
if (!require("rpart")) install.packages("rpart", dependencies = TRUE); library(rpart)
if (!require("partykit")) install.packages("partykit", dependencies = TRUE); library(partykit)
if (!require("grf")) install.packages("grf", dependencies = TRUE); library(grf)
if (!require("rpart.plot")) install.packages("rpart.plot", dependencies = TRUE); library(rpart.plot)
set.seed(1234) # for replicability
data(pension)
Y = pension$net_tfa / 1000
# Create covariate matrix
X = model.matrix(~ 0 + age + db + educ + fsize + hown + inc + male + marr + pira + twoearn, data = pension)
Note that we only need the main effects in this notebook as
tree-based methods are supposed to find relevant interactions
themselves.
Regression tree
Let’s build our first tree using the rpart
package and
rpart.plot
for improved visualization. The tree is pruned
using the default cross-validation:
tree = rpart(Y ~ X)
rpart.plot(tree)
We observe that the cross-validated tree creates five leafs. Most of
the individuals have no individual retirement account and are all
subsumed into the left leaf where the mean (and thus the predicted
wealth) is $5.6k. Those with individual accounts are then further split
according to their income.
When plotting the fitted values against the observed ones, we clearly
see the discrete nature of a single tree:
yhat_tree = predict(tree)
plot(yhat_tree,Y)
Random Forest regression
Random Forest take the average over many trees that are build on
random subsamples using only random subsets of covariates to split.
There are many different implementations of Random Forests. Here we
use the regression_forest
function of the grf
package because the package is also important for the causal ML
part.
The random forest predictions are much smoother and overcome the
discrete nature of a single tree:
rf = regression_forest(X,Y,tune.parameters = "all")
yhat_rf = predict(rf)$predictions
plot(yhat_rf,Y)
Random Forests provide different ways to calculate variable
importance measures. The details go beyond the scope of this
notebook.
However, let’s have a look at the variable importance measure
provided by the grf
package. It sums up to one and roughly
describes which variables are most often used to split in the thousands
of trees resulting in the forest:
vi = variable_importance(rf)
rownames(vi) = colnames(X)
round(vi,3)
[,1]
age 0.078
db 0.004
educ 0.021
fsize 0.007
hown 0.041
inc 0.371
male 0.002
marr 0.006
pira 0.459
twoearn 0.009
Clearly the most predictive variables are income and the
participation in an IRA. The variable importance measure is a
descriptive tool to open the black box of a random forest a little bit.
Still, we can not do inference on these kind of measures. But we are not
really interested in how Random Forests do their job in forming
good predictions anyways, the important thing is that they do
their job.
Compare performance
Like in the Supervised
ML: Lasso Application Notebook, we check how different methods
perform out-of-sample by splitting the data set in 100 random training
and test sets to check the out-of-sample \(R^2\).
We compare three different methods:
# Define training (2/3) and test (1/3) sample split
test_fraction = 1/3
test_size = floor(test_fraction * length(Y))
# Here we define some useful function to keep the code clean
# They run the method in the training sample and calcualte the test set R2
tree_oos_r2 = function(x_tr,y_tr,x_te,y_te) {
df_tr = data.frame(x_tr,y_tr)
df_te = data.frame(x_te,y_te)
tree = rpart(y_tr ~ ., df_tr)
y_hat = predict(tree, newdata = df_te)
mse = mean( (y_te - y_hat)^2 )
return(1 - mse / var(y_te))
}
rf_oos_r2 = function(x_tr,y_tr,x_te,y_te) {
rf = regression_forest(x_tr,y_tr,tune.parameters = "all", honesty = FALSE)
y_hat = predict(rf, newdata = x_te)$predictions
mse = mean( (y_te - y_hat)^2 )
return(1 - mse / var(y_te))
}
rfh_oos_r2 = function(x_tr,y_tr,x_te,y_te) {
rf = regression_forest(x_tr,y_tr,tune.parameters = "all", honesty = TRUE)
y_hat = predict(rf, newdata = x_te)$predictions
mse = mean( (y_te - y_hat)^2 )
return(1 - mse / var(y_te))
}
rep = 100 # number of replications
# Container of the results
results_r2 = matrix(NA,rep,3)
colnames(results_r2) = c("Tree","Forest","Honest Forest")
# Loop considering different splits
for (i in 1:rep) {
# Draw index for this round
temp_ind = sample(1:length(Y), size = test_size)
# Split into training and test samples
X_tr = X[-temp_ind,]
X_te = X[temp_ind,]
Y_tr = Y[-temp_ind]
Y_te = Y[temp_ind]
results_r2[i,1] = tree_oos_r2(X_tr,Y_tr,X_te,Y_te)
results_r2[i,2] = rf_oos_r2(X_tr,Y_tr,X_te,Y_te)
results_r2[i,3] = rfh_oos_r2(X_tr,Y_tr,X_te,Y_te)
}
A look at the mean \(R^2\) reveals
that - as expected - the single tree performs much worse than the two
random forests that are very similar:
round(colMeans(results_r2),3)
Tree Forest Honest Forest
0.179 0.250 0.260
Looking at the boxplot shows that Honest Random Forests seem to be
preferable as they are more stable than “dishonest” forests:
as.data.frame(results_r2) %>% pivot_longer(cols=everything(),names_to = "Method",values_to = "R2") %>%
ggplot(aes(x = R2, y = Method)) + geom_boxplot()
However, Lasso-based methods achieved on average up to 29% \(R^2\) compared to 26% of the tree-based
methods. Thus, it seems that global methods are slightly better suited
to predict wealth in this application.
LS0tDQp0aXRsZTogIlN1cGVydmlzZWQgTUw6IFRyZWUtYmFzZWQgbWV0aG9kcyINCnN1YnRpdGxlOiAiQXBwbGljYXRpb24gbm90ZWJvb2siDQphdXRob3I6ICJNaWNoYWVsIEtuYXVzIg0KZGF0ZTogImByIGZvcm1hdChTeXMudGltZSgpLCAnJW0vJXknKWAiDQpvdXRwdXQ6IA0KICBodG1sX25vdGVib29rOg0KICAgIHRvYzogdHJ1ZQ0KICAgIHRvY19mbG9hdDogdHJ1ZQ0KICAgIGNvZGVfZm9sZGluZzogc2hvdw0KLS0tDQoNCkdvYWw6DQoNCi0gU2VlIHJlZ3Jlc3Npb24gdHJlZXMgYW5kIHJhbmRvbSBmb3Jlc3RzIGluIGFjdGlvbg0KDQo8YnI+DQoNCiMjIEludHJvZHVjaW5nIHRoZSBkYXRhDQoNClRoZSBBcHBsaWNhdGlvbiBOb3RlYm9vayBidWlsZHMgb24gdGhlIGRhdGFzZXQgdGhhdCBpcyBraW5kbHkgcHJvdmlkZWQgaW4gdGhlIGBoZG1gIHBhY2thZ2UuIFRoZSBkYXRhIHdhcyB1c2VkIGluIFtDaGVybm96aHVrb3YgYW5kIEhhbnNlbiAoMjAwNCldKGh0dHBzOi8vZGlyZWN0Lm1pdC5lZHUvcmVzdC9hcnRpY2xlLzg2LzMvNzM1LzU3NTg2L1RoZS1FZmZlY3RzLW9mLTQwMS1LLVBhcnRpY2lwYXRpb24tb24tdGhlLVdlYWx0aCkuIFRoZWlyIHBhcGVyIGludmVzdGlnYXRlcyB0aGUgZWZmZWN0IG9mIHBhcnRpY2lwYXRpb24gaW4gdGhlIGVtcGxveWVyLXNwb25zb3JlZCA0MDEoaykgcmV0aXJlbWVudCBzYXZpbmdzIHBsYW4gKGBwNDAxYCkgb24gbmV0IGFzc2V0cyAoYG5ldF90ZmFgKS4gU2luY2UgdGhlbiB0aGUgZGF0YSB3YXMgdXNlZCB0byBzaG93Y2FzZSBtYW55IG5ldyBtZXRob2RzLiBJdCBpcyBub3QgdGhlIG1vc3QgY29tcHJlaGVuc2l2ZSBkYXRhc2V0IHdpdGggYmFzaWNhbGx5IHRlbiBjb3ZhcmlhdGVzL3JlZ3Jlc3NvcnMvcHJlZGljdG9yczoNCg0KLSAqYWdlKjogYWdlDQoNCi0gKmRiKjogZGVmaW5lZCBiZW5lZml0IHBlbnNpb24NCg0KLSAqZWR1Yyo6IGVkdWNhdGlvbiAoaW4geWVhcnMpDQoNCi0gKmZzaXplKjogZmFtaWx5IHNpemUNCg0KLSAqaG93bio6IGhvbWUgb3duZXINCg0KLSAqaW5jKjogaW5jb21lIChpbiBVUyAkKQ0KDQotICptYWxlKjogbWFsZQ0KDQotICptYXJyKjogbWFycmllZA0KDQotICpwaXJhKjogcGFydGljaXBhdGlvbiBpbiBpbmRpdmlkdWFsIHJldGlyZW1lbnQgYWNjb3VudCAoSVJBKQ0KDQotICp0d29lYXJuKjogdHdvIGVhcm5lcnMNCg0KSG93ZXZlciwgaXQgaXMgcHVibGljbHkgYXZhaWxhYmxlIGFuZCB0aGUgcmVsYXRpdmVseSBmZXcgY292YXJpYXRlcyBlbnN1cmUgdGhhdCB0aGUgcHJvZ3JhbXMgZG8gbm90IHJ1biB0b28gbG9uZy4NCg0KV2UgY29uc2lkZXIgd2VhbHRoIGluICQxMDAwIGZvciBiZXR0ZXIgcmVwcmVzZW50YXRpb24gb2YgdGhlIHJlc3VsdHMuDQoNCmBgYHtyLCB3YXJuaW5nID0gRiwgbWVzc2FnZSA9IEZ9DQojIExvYWQgdGhlIHBhY2thZ2VzIHJlcXVpcmVkIGZvciBsYXRlcg0KaWYgKCFyZXF1aXJlKCJoZG0iKSkgaW5zdGFsbC5wYWNrYWdlcygiaGRtIiwgZGVwZW5kZW5jaWVzID0gVFJVRSk7IGxpYnJhcnkoaGRtKQ0KaWYgKCFyZXF1aXJlKCJ0aWR5dmVyc2UiKSkgaW5zdGFsbC5wYWNrYWdlcygidGlkeXZlcnNlIiwgZGVwZW5kZW5jaWVzID0gVFJVRSk7IGxpYnJhcnkodGlkeXZlcnNlKQ0KaWYgKCFyZXF1aXJlKCJycGFydCIpKSBpbnN0YWxsLnBhY2thZ2VzKCJycGFydCIsIGRlcGVuZGVuY2llcyA9IFRSVUUpOyBsaWJyYXJ5KHJwYXJ0KQ0KaWYgKCFyZXF1aXJlKCJwYXJ0eWtpdCIpKSBpbnN0YWxsLnBhY2thZ2VzKCJwYXJ0eWtpdCIsIGRlcGVuZGVuY2llcyA9IFRSVUUpOyBsaWJyYXJ5KHBhcnR5a2l0KQ0KaWYgKCFyZXF1aXJlKCJncmYiKSkgaW5zdGFsbC5wYWNrYWdlcygiZ3JmIiwgZGVwZW5kZW5jaWVzID0gVFJVRSk7IGxpYnJhcnkoZ3JmKQ0KaWYgKCFyZXF1aXJlKCJycGFydC5wbG90IikpIGluc3RhbGwucGFja2FnZXMoInJwYXJ0LnBsb3QiLCBkZXBlbmRlbmNpZXMgPSBUUlVFKTsgbGlicmFyeShycGFydC5wbG90KQ0KDQpzZXQuc2VlZCgxMjM0KSAjIGZvciByZXBsaWNhYmlsaXR5DQoNCmRhdGEocGVuc2lvbikNClkgPSBwZW5zaW9uJG5ldF90ZmEgLyAxMDAwDQojIENyZWF0ZSBjb3ZhcmlhdGUgbWF0cml4DQpYID0gbW9kZWwubWF0cml4KH4gMCArIGFnZSArIGRiICsgZWR1YyArIGZzaXplICsgaG93biArIGluYyArIG1hbGUgKyBtYXJyICsgcGlyYSArIHR3b2Vhcm4sIGRhdGEgPSBwZW5zaW9uKQ0KYGBgDQoNCk5vdGUgdGhhdCB3ZSBvbmx5IG5lZWQgdGhlIG1haW4gZWZmZWN0cyBpbiB0aGlzIG5vdGVib29rIGFzIHRyZWUtYmFzZWQgbWV0aG9kcyBhcmUgc3VwcG9zZWQgdG8gZmluZCByZWxldmFudCBpbnRlcmFjdGlvbnMgdGhlbXNlbHZlcy4NCg0KPGJyPg0KPGJyPg0KDQojIyBSZWdyZXNzaW9uIHRyZWUNCg0KTGV0J3MgYnVpbGQgb3VyIGZpcnN0IHRyZWUgdXNpbmcgdGhlIGBycGFydGAgcGFja2FnZSBhbmQgYHJwYXJ0LnBsb3RgIGZvciBpbXByb3ZlZCB2aXN1YWxpemF0aW9uLiBUaGUgdHJlZSBpcyBwcnVuZWQgdXNpbmcgdGhlIGRlZmF1bHQgY3Jvc3MtdmFsaWRhdGlvbjoNCg0KYGBge3J9DQp0cmVlID0gcnBhcnQoWSB+IFgpDQpycGFydC5wbG90KHRyZWUpDQpgYGANCg0KV2Ugb2JzZXJ2ZSB0aGF0IHRoZSBjcm9zcy12YWxpZGF0ZWQgdHJlZSBjcmVhdGVzIGZpdmUgbGVhZnMuIE1vc3Qgb2YgdGhlIGluZGl2aWR1YWxzIGhhdmUgbm8gaW5kaXZpZHVhbCByZXRpcmVtZW50IGFjY291bnQgYW5kIGFyZSBhbGwgc3Vic3VtZWQgaW50byB0aGUgbGVmdCBsZWFmIHdoZXJlIHRoZSBtZWFuIChhbmQgdGh1cyB0aGUgcHJlZGljdGVkIHdlYWx0aCkgaXMgJDUuNmsuIFRob3NlIHdpdGggaW5kaXZpZHVhbCBhY2NvdW50cyBhcmUgdGhlbiBmdXJ0aGVyIHNwbGl0IGFjY29yZGluZyB0byB0aGVpciBpbmNvbWUuDQoNCldoZW4gcGxvdHRpbmcgdGhlIGZpdHRlZCB2YWx1ZXMgYWdhaW5zdCB0aGUgb2JzZXJ2ZWQgb25lcywgd2UgY2xlYXJseSBzZWUgdGhlIGRpc2NyZXRlIG5hdHVyZSBvZiBhIHNpbmdsZSB0cmVlOg0KDQoNCmBgYHtyfQ0KeWhhdF90cmVlID0gcHJlZGljdCh0cmVlKQ0KcGxvdCh5aGF0X3RyZWUsWSkNCmBgYA0KDQo8YnI+DQo8YnI+DQoNCiMjIFJhbmRvbSBGb3Jlc3QgcmVncmVzc2lvbg0KDQpSYW5kb20gRm9yZXN0IHRha2UgdGhlIGF2ZXJhZ2Ugb3ZlciBtYW55IHRyZWVzIHRoYXQgYXJlIGJ1aWxkIG9uIHJhbmRvbSBzdWJzYW1wbGVzIHVzaW5nIG9ubHkgcmFuZG9tIHN1YnNldHMgb2YgY292YXJpYXRlcyB0byBzcGxpdC4NCg0KVGhlcmUgYXJlIG1hbnkgZGlmZmVyZW50IGltcGxlbWVudGF0aW9ucyBvZiBSYW5kb20gRm9yZXN0cy4gSGVyZSB3ZSB1c2UgdGhlIGByZWdyZXNzaW9uX2ZvcmVzdGAgZnVuY3Rpb24gb2YgdGhlIGBncmZgIHBhY2thZ2UgYmVjYXVzZSB0aGUgcGFja2FnZSBpcyBhbHNvIGltcG9ydGFudCBmb3IgdGhlIGNhdXNhbCBNTCBwYXJ0Lg0KDQpUaGUgcmFuZG9tIGZvcmVzdCBwcmVkaWN0aW9ucyBhcmUgbXVjaCBzbW9vdGhlciBhbmQgb3ZlcmNvbWUgdGhlIGRpc2NyZXRlIG5hdHVyZSBvZiBhIHNpbmdsZSB0cmVlOg0KDQoNCmBgYHtyfQ0KcmYgPSByZWdyZXNzaW9uX2ZvcmVzdChYLFksdHVuZS5wYXJhbWV0ZXJzID0gImFsbCIpDQp5aGF0X3JmID0gcHJlZGljdChyZikkcHJlZGljdGlvbnMNCnBsb3QoeWhhdF9yZixZKQ0KYGBgDQoNClJhbmRvbSBGb3Jlc3RzIHByb3ZpZGUgZGlmZmVyZW50IHdheXMgdG8gY2FsY3VsYXRlIHZhcmlhYmxlIGltcG9ydGFuY2UgbWVhc3VyZXMuIFRoZSBkZXRhaWxzIGdvIGJleW9uZCB0aGUgc2NvcGUgb2YgdGhpcyBub3RlYm9vay4NCg0KSG93ZXZlciwgbGV0J3MgaGF2ZSBhIGxvb2sgYXQgdGhlIHZhcmlhYmxlIGltcG9ydGFuY2UgbWVhc3VyZSBwcm92aWRlZCBieSB0aGUgYGdyZmAgcGFja2FnZS4gSXQgc3VtcyB1cCB0byBvbmUgYW5kIHJvdWdobHkgZGVzY3JpYmVzIHdoaWNoIHZhcmlhYmxlcyBhcmUgbW9zdCBvZnRlbiB1c2VkIHRvIHNwbGl0IGluIHRoZSB0aG91c2FuZHMgb2YgdHJlZXMgcmVzdWx0aW5nIGluIHRoZSBmb3Jlc3Q6DQoNCmBgYHtyfQ0KdmkgPSB2YXJpYWJsZV9pbXBvcnRhbmNlKHJmKQ0Kcm93bmFtZXModmkpID0gY29sbmFtZXMoWCkNCnJvdW5kKHZpLDMpDQpgYGANCg0KQ2xlYXJseSB0aGUgbW9zdCBwcmVkaWN0aXZlIHZhcmlhYmxlcyBhcmUgaW5jb21lIGFuZCB0aGUgcGFydGljaXBhdGlvbiBpbiBhbiBJUkEuIFRoZSB2YXJpYWJsZSBpbXBvcnRhbmNlIG1lYXN1cmUgaXMgYSBkZXNjcmlwdGl2ZSB0b29sIHRvIG9wZW4gdGhlIGJsYWNrIGJveCBvZiBhIHJhbmRvbSBmb3Jlc3QgYSBsaXR0bGUgYml0LiBTdGlsbCwgd2UgY2FuIG5vdCBkbyBpbmZlcmVuY2Ugb24gdGhlc2Uga2luZCBvZiBtZWFzdXJlcy4gQnV0IHdlIGFyZSBub3QgcmVhbGx5IGludGVyZXN0ZWQgaW4gKmhvdyogUmFuZG9tIEZvcmVzdHMgZG8gdGhlaXIgam9iIGluIGZvcm1pbmcgZ29vZCBwcmVkaWN0aW9ucyBhbnl3YXlzLCB0aGUgaW1wb3J0YW50IHRoaW5nIGlzICp0aGF0KiB0aGV5IGRvIHRoZWlyIGpvYi4NCg0KPGJyPg0KPGJyPg0KDQojIyBDb21wYXJlIHBlcmZvcm1hbmNlDQoNCkxpa2UgaW4gdGhlIFtTdXBlcnZpc2VkIE1MOiBMYXNzb10oaHR0cHM6Ly9tY2tuYXVzLmdpdGh1Yi5pby9hc3NldHMvbm90ZWJvb2tzL2FwcGw0MDFrL0FOQl80MDFrX0xhc3NvLm5iLmh0bWwpIEFwcGxpY2F0aW9uIE5vdGVib29rLCB3ZSBjaGVjayBob3cgZGlmZmVyZW50IG1ldGhvZHMgcGVyZm9ybSBvdXQtb2Ytc2FtcGxlIGJ5IHNwbGl0dGluZyB0aGUgZGF0YSBzZXQgaW4gMTAwIHJhbmRvbSB0cmFpbmluZyBhbmQgdGVzdCBzZXRzIHRvIGNoZWNrIHRoZSBvdXQtb2Ytc2FtcGxlICRSXjIkLg0KDQpXZSBjb21wYXJlIHRocmVlIGRpZmZlcmVudCBtZXRob2RzOg0KDQotIFRyZWU6IE9uZSBzaW5nbGUgdHJlZQ0KDQotIFJhbmRvbSBGb3Jlc3Q6IFN0YW5kYXJkIHJhbmRvbSBmb3Jlc3QNCg0KLSBIb25lc3QgUmFuZG9tIEZvcmVzdDogUmFuZG9tIEZvcmVzdCB3aGVyZSB0aGUgdHJlZSBzdHJ1Y3R1cmUgaXMgbGVhcm5lZCBvbiBhIGRpZmZlcmVudCBzdWJzYW1wbGUgdGhlbiB0aGUgbGVhZiBtZWFucw0KDQoNCmBgYHtyfQ0KIyBEZWZpbmUgdHJhaW5pbmcgKDIvMykgYW5kIHRlc3QgKDEvMykgc2FtcGxlIHNwbGl0DQp0ZXN0X2ZyYWN0aW9uID0gMS8zDQp0ZXN0X3NpemUgPSBmbG9vcih0ZXN0X2ZyYWN0aW9uICogbGVuZ3RoKFkpKQ0KDQojIEhlcmUgd2UgZGVmaW5lIHNvbWUgdXNlZnVsIGZ1bmN0aW9uIHRvIGtlZXAgdGhlIGNvZGUgY2xlYW4NCiMgVGhleSBydW4gdGhlIG1ldGhvZCBpbiB0aGUgdHJhaW5pbmcgc2FtcGxlIGFuZCBjYWxjdWFsdGUgdGhlIHRlc3Qgc2V0IFIyDQp0cmVlX29vc19yMiA9IGZ1bmN0aW9uKHhfdHIseV90cix4X3RlLHlfdGUpIHsNCiAgZGZfdHIgPSBkYXRhLmZyYW1lKHhfdHIseV90cikNCiAgZGZfdGUgPSBkYXRhLmZyYW1lKHhfdGUseV90ZSkNCiAgdHJlZSA9IHJwYXJ0KHlfdHIgfiAuLCBkZl90cikNCiAgeV9oYXQgPSBwcmVkaWN0KHRyZWUsIG5ld2RhdGEgPSBkZl90ZSkNCiAgbXNlID0gbWVhbiggKHlfdGUgLSB5X2hhdCleMiApDQogIHJldHVybigxIC0gbXNlIC8gdmFyKHlfdGUpKQ0KfQ0KDQpyZl9vb3NfcjIgPSBmdW5jdGlvbih4X3RyLHlfdHIseF90ZSx5X3RlKSB7DQogIHJmID0gcmVncmVzc2lvbl9mb3Jlc3QoeF90cix5X3RyLHR1bmUucGFyYW1ldGVycyA9ICJhbGwiLCBob25lc3R5ID0gRkFMU0UpDQogIHlfaGF0ID0gcHJlZGljdChyZiwgbmV3ZGF0YSA9IHhfdGUpJHByZWRpY3Rpb25zDQogIG1zZSA9IG1lYW4oICh5X3RlIC0geV9oYXQpXjIgKQ0KICByZXR1cm4oMSAtIG1zZSAvIHZhcih5X3RlKSkNCn0NCg0KcmZoX29vc19yMiA9IGZ1bmN0aW9uKHhfdHIseV90cix4X3RlLHlfdGUpIHsNCiAgcmYgPSByZWdyZXNzaW9uX2ZvcmVzdCh4X3RyLHlfdHIsdHVuZS5wYXJhbWV0ZXJzID0gImFsbCIsIGhvbmVzdHkgPSBUUlVFKQ0KICB5X2hhdCA9IHByZWRpY3QocmYsIG5ld2RhdGEgPSB4X3RlKSRwcmVkaWN0aW9ucw0KICBtc2UgPSBtZWFuKCAoeV90ZSAtIHlfaGF0KV4yICkNCiAgcmV0dXJuKDEgLSBtc2UgLyB2YXIoeV90ZSkpDQp9DQoNCnJlcCA9IDEwMCAjIG51bWJlciBvZiByZXBsaWNhdGlvbnMNCg0KIyBDb250YWluZXIgb2YgdGhlIHJlc3VsdHMNCnJlc3VsdHNfcjIgPSBtYXRyaXgoTkEscmVwLDMpDQpjb2xuYW1lcyhyZXN1bHRzX3IyKSA9IGMoIlRyZWUiLCJGb3Jlc3QiLCJIb25lc3QgRm9yZXN0IikNCg0KIyBMb29wIGNvbnNpZGVyaW5nIGRpZmZlcmVudCBzcGxpdHMNCmZvciAoaSBpbiAxOnJlcCkgew0KICAjIERyYXcgaW5kZXggZm9yIHRoaXMgcm91bmQNCiAgdGVtcF9pbmQgPSBzYW1wbGUoMTpsZW5ndGgoWSksIHNpemUgPSB0ZXN0X3NpemUpDQoNCiAgIyBTcGxpdCBpbnRvIHRyYWluaW5nIGFuZCB0ZXN0IHNhbXBsZXMNCiAgWF90ciA9IFhbLXRlbXBfaW5kLF0NCiAgWF90ZSA9IFhbdGVtcF9pbmQsXQ0KICBZX3RyID0gWVstdGVtcF9pbmRdDQogIFlfdGUgPSBZW3RlbXBfaW5kXQ0KICANCiAgcmVzdWx0c19yMltpLDFdID0gdHJlZV9vb3NfcjIoWF90cixZX3RyLFhfdGUsWV90ZSkNCiAgcmVzdWx0c19yMltpLDJdID0gcmZfb29zX3IyKFhfdHIsWV90cixYX3RlLFlfdGUpDQogIHJlc3VsdHNfcjJbaSwzXSA9IHJmaF9vb3NfcjIoWF90cixZX3RyLFhfdGUsWV90ZSkNCn0NCmBgYA0KDQoNCkEgbG9vayBhdCB0aGUgbWVhbiAkUl4yJCByZXZlYWxzIHRoYXQgLSBhcyBleHBlY3RlZCAtIHRoZSBzaW5nbGUgdHJlZSBwZXJmb3JtcyBtdWNoIHdvcnNlIHRoYW4gdGhlIHR3byByYW5kb20gZm9yZXN0cyB0aGF0IGFyZSB2ZXJ5IHNpbWlsYXI6DQoNCg0KYGBge3J9DQpyb3VuZChjb2xNZWFucyhyZXN1bHRzX3IyKSwzKQ0KYGBgDQpMb29raW5nIGF0IHRoZSBib3hwbG90IHNob3dzIHRoYXQgSG9uZXN0IFJhbmRvbSBGb3Jlc3RzIHNlZW0gdG8gYmUgcHJlZmVyYWJsZSBhcyB0aGV5IGFyZSBtb3JlIHN0YWJsZSB0aGFuICJkaXNob25lc3QiIGZvcmVzdHM6DQoNCg0KYGBge3J9DQphcy5kYXRhLmZyYW1lKHJlc3VsdHNfcjIpICU+JSBwaXZvdF9sb25nZXIoY29scz1ldmVyeXRoaW5nKCksbmFtZXNfdG8gPSAiTWV0aG9kIix2YWx1ZXNfdG8gPSAiUjIiKSAlPiUNCiAgZ2dwbG90KGFlcyh4ID0gUjIsIHkgPSBNZXRob2QpKSArIGdlb21fYm94cGxvdCgpDQpgYGANCg0KDQpIb3dldmVyLCBMYXNzby1iYXNlZCBtZXRob2RzIGFjaGlldmVkIG9uIGF2ZXJhZ2UgdXAgdG8gMjklICRSXjIkIGNvbXBhcmVkIHRvIDI2JSBvZiB0aGUgdHJlZS1iYXNlZCBtZXRob2RzLiBUaHVzLCBpdCBzZWVtcyB0aGF0IGdsb2JhbCBtZXRob2RzIGFyZSBzbGlnaHRseSBiZXR0ZXIgc3VpdGVkIHRvIHByZWRpY3Qgd2VhbHRoIGluIHRoaXMgYXBwbGljYXRpb24uDQoNCjxicj4NCjxicj4NCg0KIA==