---
title: "Supervised ML: Lasso saves the job of OLS"
author: "Michael Knaus"
date: "`r format(Sys.time(), '%m/%y')`"
output: 
  html_notebook:
    toc: true
    toc_float: true
    code_folding: show
---

<br>

Goals:

- Illustrate how Lasso works

- Illustrate that Lasso does not overfit

<br>

## Data generating process

We use the same DGP as in the notebook [Overfitting of OLS and value of training vs. test sample](https://mcknaus.github.io/assets/notebooks/SNB/SNB_OLS_in_vs_out_of_sample.nb.html):

- $p\geq10$ independent and standard normal covariates: $X \sim N(0,I_p)$, where $I_p$ is the $p$-dimensional identity matrix

- The outcome model is $Y = \underbrace{\beta_0 + \beta_1 X_1 + ... + \beta_{10} X_{10}}_{\text{conditional expectation function }m(X)} + e$, where $X_j$ is the $j$-th column of $X$ and $e \sim N(0,1)$

- We consider the following parameters $\beta_0 = 0$, $\beta_1 = 1$, $\beta_2 = 0.9$, ..., $\beta_9 = 0.2$, $\beta_{10} = 0.1$ such that the first 10 variables have a decreasing impact on the outcome and any further variables are just irrelevant noise with no true predictive power

We set $p=99$ and draw a training sample with $N_{tr}=100$ and a test sample with $N_{te}=10,000$ observations.

```{r message=FALSE, warning = FALSE}
# Load the packages required for later
if (!require("tidyverse")) install.packages("tidyverse", dependencies = TRUE); library(tidyverse)
if (!require("glmnet")) install.packages("glmnet", dependencies = TRUE); library(glmnet)
if (!require("hdm")) install.packages("hdm", dependencies = TRUE); library(hdm)
if (!require("plasso")) install.packages("plasso", dependencies = TRUE); library(plasso)

set.seed(1234) # For replicability

# Define the important parameters
n_tr = 100
n_te = 10000
p = 99
beta = c(0,seq(1,0.1,-0.1),rep(0,p-10))

# Combine constant and randomly drawn covariates
x_tr = cbind(rep(1,n_tr),matrix(rnorm(n_tr*p),ncol=p))
x_te = cbind(rep(1,n_te),matrix(rnorm(n_te*p),ncol=p))

# Create the CEF using matrix multiplication for compactness
cfe_tr = x_tr %*% beta
cfe_te = x_te %*% beta

# Create the "observed" outcomes by adding noise
y_tr = cfe_tr + rnorm(n_tr,0,1)
y_te = cfe_te + rnorm(n_te,0,1)
```

<br>

## Lasso at work

We run Lasso with all 99 regressors. The `glmnet` command computes the whole coefficient path from an empty model to a nearly unpenalized model (read the plot from right to left):

```{r}
# As most of the commands that we use from now on, 
# glmnet takes inputs in matrix form and not as formulas
lasso = glmnet(x_tr,y_tr)
plot(lasso, xvar = "lambda")
```

In the beginning, the coefficients of the truly relevant variables are built up, and at some point excessive overfitting sets in.
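As a quick numerical check of this claim, we can inspect the coefficients at the two endpoints of the computed penalty sequence (standard `glmnet` accessors; the first 12 entries are the intercept, the constant column, and the 10 relevant variables):

```{r}
# Coefficients at the strongest and weakest penalty on the path
coef(lasso, s = max(lasso$lambda))[1:12, ] # essentially the empty model
coef(lasso, s = min(lasso$lambda))[1:12, ] # close to the unpenalized fit
```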

**Cross-validation** helps to find the sweet spot that ensures good out-of-sample performance. The `cv.glmnet` command uses 10-fold cross-validation by default, leading to the following result:

```{r}
cv_lasso = cv.glmnet(x_tr,y_tr)
plot(cv_lasso)
```

We observe that the cross-validated MSE decreases as the relevant variables are selected and built up. At some point the reduced penalization leads to the selection of noise variables and the cross-validated MSE deteriorates again. We select the penalty term where the curve indicates the lowest MSE (an alternative is the 1SE rule, indicated by the right dashed line, but we don't go into the details here).
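The fitted object stores both penalty choices, so we can inspect them directly (standard `cv.glmnet` accessors):

```{r}
cv_lasso$lambda.min # penalty with the lowest cross-validated MSE
cv_lasso$lambda.1se # largest penalty within one standard error of the minimum
# Number of non-zero coefficients at the MSE-minimizing penalty
sum(coef(cv_lasso, s = "lambda.min") != 0)
```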

<br>

## Post-Lasso at work

### Cross-validated

Cross-validation for Post-Lasso is implemented in the `plasso` package:

```{r}
# Provide the covariate matrix without the constant: x_tr[,-1]
# Increasing lambda.min.ratio ensures that Lasso does not overfit too heavily and reduces running time
post_lasso = plasso(x_tr[,-1],y_tr,lambda.min.ratio=0.01)
plot(post_lasso,xvar = "lambda")
```

While Lasso builds up coefficients gradually, Post-Lasso gives the full OLS coefficient as soon as a variable is selected $\Rightarrow$ the coefficient paths of Post-Lasso are usually not smooth.
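To make the two-step idea concrete, here is a minimal sketch of Post-Lasso at a single penalty value (`s = 0.1` is an arbitrary illustration, not the internal implementation of `plasso`):

```{r}
sel = which(coef(lasso, s = 0.1)[-1] != 0) # step 1: variables selected by Lasso
post_ols = lm(y_tr ~ x_tr[, sel])          # step 2: unpenalized OLS refit on the selection
coef(post_ols)
```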

The non-smooth behavior is also observable in the cross-validation curve:


```{r}
cv_plasso = cv.plasso(x_tr,y_tr,lambda.min.ratio=0.01)
plot(cv_plasso, legend_pos="bottomleft")
```

The comparison of Lasso and Post-Lasso cross-validation is instructive to understand some differences:

- Compared to the Lasso curve, the Post-Lasso cross-validation curve is bumpier and has flat regions. Post-Lasso gives the full OLS coefficients as soon as a variable is selected, so the curve is flat in regions where no additional variable is selected and the MSE stays constant.

- Post-Lasso usually produces sparser cross-validated models because the unshrunken coefficients deliver more explanatory power with a smaller set of variables.

<br>

### Fast implementation with `hdm` 

While cross-validated Post-Lasso is instructive, it is not fast. The `rlasso` function of the `hdm` package implements a faster, data-driven way to choose the penalty parameter:

```{r}
post_hdm = rlasso(x_tr,y_tr)
summary(post_hdm)
```
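The printed summary is long because all coefficients are listed. Assuming the fitted object exposes the logical selection indicator `index` as documented in `hdm`, the selection can be condensed:

```{r}
# index flags the variables selected by rlasso
sum(post_hdm$index)   # number of selected variables
which(post_hdm$index) # their positions in the covariate matrix
```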


<br>
<br>

## OLS vs. (Post-)Lasso

We replicate the analysis of the notebook [Overfitting of OLS and value of training vs. test sample](https://mcknaus.github.io/assets/notebooks/SNB/SNB_OLS_in_vs_out_of_sample.nb.html) and gradually add covariates to check how the out-of-sample performance develops.
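As in that notebook, we track two quantities per sample: the observed MSE and an "oracle" MSE that exploits our knowledge of the true CEF $m(X)$. The latter is the sample analog of the decomposition

$$E[(Y - \hat{m}(X))^2] = \underbrace{Var(e)}_{\text{irreducible noise}} + \underbrace{E[(m(X) - \hat{m}(X))^2]}_{\text{approximation error}},$$

which is exactly what the `var(...) + mean(...)` lines in the loop below compute.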

```{r}
# Container of the results
results_ols = results_lasso = results_plasso = results_rlasso = matrix(NA,p-1,4)
colnames(results_ols) = colnames(results_lasso) = 
                        colnames(results_plasso) = 
                        colnames(results_rlasso) = c("Obs MSE train","Obs MSE test",
                                                    "Oracle MSE train","Oracle MSE test")

# Loop that gradually adds variables (start with 2, otherwise glmnet crashes)
for (i in 2:p) {
  # OLS
  temp_ols = lm(y_tr ~ x_tr[,2:(i+1)])
  temp_yhat_tr = predict(temp_ols)
  temp_yhat_te = x_te[,1:(i+1)] %*% temp_ols$coefficients
  # Calculate the observable MSEs in training and test sample
  results_ols[i-1,1] = mean((y_tr - temp_yhat_tr)^2) # in-sample MSE
  results_ols[i-1,2] = mean((y_te - temp_yhat_te)^2) # out-of-sample MSE
  # Calculate the oracle MSEs that are only observable b/c we know the true CEF
  results_ols[i-1,3] = var(y_tr - cfe_tr) + mean((cfe_tr - temp_yhat_tr)^2)
  results_ols[i-1,4] = var(y_te - cfe_te) + mean((cfe_te - temp_yhat_te)^2)
  
  # Lasso
  temp_lasso = cv.glmnet(x_tr[,2:(i+1)],y_tr)
  temp_yhat_tr = predict(temp_lasso,newx=x_tr[,2:(i+1)])
  temp_yhat_te = predict(temp_lasso,newx=x_te[,2:(i+1)])
  # Calculate the observable MSEs in training and test sample
  results_lasso[i-1,1] = mean((y_tr - temp_yhat_tr)^2) # in-sample MSE
  results_lasso[i-1,2] = mean((y_te - temp_yhat_te)^2) # out-of-sample MSE
  # Calculate the oracle MSEs that are only observable b/c we know the true CEF
  results_lasso[i-1,3] = var(y_tr - cfe_tr) + mean((cfe_tr - temp_yhat_tr)^2)
  results_lasso[i-1,4] = var(y_te - cfe_te) + mean((cfe_te - temp_yhat_te)^2)
  
  # plasso
  temp_plasso = cv.plasso(x_tr[,2:(i+1)],y_tr)
  temp_yhat_tr = predict(temp_plasso,newx=x_tr[,2:(i+1)])$plasso
  temp_yhat_te = predict(temp_plasso,newx=x_te[,2:(i+1)])$plasso
  # Calculate the observable MSEs in training and test sample
  results_plasso[i-1,1] = mean((y_tr - temp_yhat_tr)^2) # in-sample MSE
  results_plasso[i-1,2] = mean((y_te - temp_yhat_te)^2) # out-of-sample MSE
  # Calculate the oracle MSEs that are only observable b/c we know the true CEF
  results_plasso[i-1,3] = var(y_tr - cfe_tr) + mean((cfe_tr - temp_yhat_tr)^2)
  results_plasso[i-1,4] = var(y_te - cfe_te) + mean((cfe_te - temp_yhat_te)^2)
  
  # rlasso
  temp_rlasso = rlasso(x_tr[,2:(i+1)],y_tr)
  temp_yhat_tr = predict(temp_rlasso,newdata=x_tr[,2:(i+1)])
  temp_yhat_te = predict(temp_rlasso,newdata=x_te[,2:(i+1)])
  # Calculate the observable MSEs in training and test sample
  results_rlasso[i-1,1] = mean((y_tr - temp_yhat_tr)^2) # in-sample MSE
  results_rlasso[i-1,2] = mean((y_te - temp_yhat_te)^2) # out-of-sample MSE
  # Calculate the oracle MSEs that are only observable b/c we know the true CEF
  results_rlasso[i-1,3] = var(y_tr - cfe_tr) + mean((cfe_tr - temp_yhat_tr)^2)
  results_rlasso[i-1,4] = var(y_te - cfe_te) + mean((cfe_te - temp_yhat_te)^2)
}
```

Again, OLS explodes, but the Lasso-based estimators show stable prediction performance regardless of the number of noise variables.

```{r}
df = data.frame(Estimator = c(rep("OLS",p-1),rep("Lasso",p-1),rep("Post-Lasso CV",p-1),rep("Post-Lasso hdm",p-1)),
                Number.of.variables = c(2:p,2:p,2:p,2:p),
                Obs.MSE.test = c(results_ols[,2], results_lasso[,2], results_plasso[,2], results_rlasso[,2]),
                Oracle.MSE.test = c(results_ols[,4], results_lasso[,4], results_plasso[,4], results_rlasso[,4]))

ggplot(df, aes(x=Number.of.variables,y=Obs.MSE.test,colour=Estimator)) + geom_line(linewidth=1)

ggplot(subset(df, Number.of.variables < 70), aes(x=Number.of.variables,y=Obs.MSE.test,colour=Estimator)) + geom_line(linewidth=1) + geom_hline(yintercept = 0)
```

It seems that Post-Lasso has a slight advantage over plain Lasso here. However, this is DGP-dependent and could be flipped in other settings.
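One way to condense the comparison is to average the observed test MSE over all model sizes (just a summary of the `df` created above; `tidyverse` is already loaded):

```{r}
df %>%
  group_by(Estimator) %>%
  summarise(mean.Obs.MSE.test = mean(Obs.MSE.test))
```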

<br>
<br>

### Take-away
 
 - The penalization of coefficients ensures good out-of-sample performance $\Rightarrow$ Lasso works
 
<br>
<br>
 
 
### Suggestions to play with the toy model

Feel free to play around with the code. This is useful to sharpen and challenge your understanding of the methods. Think about the consequences of a modification before you run it and check whether the results are in line with your expectations. Some suggestions:
 
- Modify DGP (change betas, change level of noise, introduce correlation between covariates, ...)

- Modify seed

- Change training and test sample size

- Investigate the 1SE rule for cross-validated Lasso (a possible starting point is sketched below)
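For the last suggestion, a possible starting point using the `cv_lasso` object from above and the standard `s` argument of `predict.cv.glmnet`:

```{r}
# Test-sample MSE under the two standard penalty choices
yhat_min = predict(cv_lasso, newx = x_te, s = "lambda.min")
yhat_1se = predict(cv_lasso, newx = x_te, s = "lambda.1se")
c(MSE.min = mean((y_te - yhat_min)^2), MSE.1se = mean((y_te - yhat_1se)^2))
```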
 

 