Data and quick exploration

We perform PCA on the USArrests data set, which is part of the base R package. This data set contains statistics, in arrests per 100,000 residents for assault, murder, and rape in each of the 50 US states in 1973. Also given is the percent of the population living in urban areas.

The rows of the data set contain the 50 states, in alphabetical order.

data(USArrests)
X <- USArrests
states <- row.names(X)
states
 [1] "Alabama"        "Alaska"         "Arizona"        "Arkansas"       "California"     "Colorado"       "Connecticut"   
 [8] "Delaware"       "Florida"        "Georgia"        "Hawaii"         "Idaho"          "Illinois"       "Indiana"       
[15] "Iowa"           "Kansas"         "Kentucky"       "Louisiana"      "Maine"          "Maryland"       "Massachusetts" 
[22] "Michigan"       "Minnesota"      "Mississippi"    "Missouri"       "Montana"        "Nebraska"       "Nevada"        
[29] "New Hampshire"  "New Jersey"     "New Mexico"     "New York"       "North Carolina" "North Dakota"   "Ohio"          
[36] "Oklahoma"       "Oregon"         "Pennsylvania"   "Rhode Island"   "South Carolina" "South Dakota"   "Tennessee"     
[43] "Texas"          "Utah"           "Vermont"        "Virginia"       "Washington"     "West Virginia"  "Wisconsin"     
[50] "Wyoming"       

The columns of the data set contain the four variables.

summary(X)
     Murder         Assault       UrbanPop         Rape     
 Min.   : 0.80   Min.   : 45   Min.   :32.0   Min.   : 7.3  
 1st Qu.: 4.08   1st Qu.:109   1st Qu.:54.5   1st Qu.:15.1  
 Median : 7.25   Median :159   Median :66.0   Median :20.1  
 Mean   : 7.79   Mean   :171   Mean   :65.5   Mean   :21.2  
 3rd Qu.:11.25   3rd Qu.:249   3rd Qu.:77.8   3rd Qu.:26.2  
 Max.   :17.40   Max.   :337   Max.   :91.0   Max.   :46.0  

Visual exploration

We first briefly examine the data. Some variables are highly correlated.

pairs(X, pch = 19, col = "gray40", main = "USArrests: pairwise scatterplots")

The variables also have vastly different means and variances. We see that there are on average three times as many rapes as murders, and more than eight times as many assaults as rapes. The UrbanPop variable measures the percentage of the population in each state living in an urban area, which is not a comparable number to the number of rapes in each state per 100,000 individuals.

boxplot(X, col = c("#7FB3D5", "#F1948A", "#82E0AA", "#F7DC6F"),
        main = "Different variable scales")

PCA: no scaling vs scaling

We now perform principal components analysis using the prcomp() function, which is one of several functions in R that perform PCA.

By default, the prcomp() function centers the variables to have mean zero (we here add center = TRUE to explicitly confirm centering). By using the option scale = TRUE, we scale the variables to have standard deviation one. The output from prcomp() contains a number of useful quantities.

pca_cov <- prcomp(X, center = TRUE, scale. = FALSE)
pca_cor <- prcomp(X, center = TRUE, scale. = TRUE)

names(pca_cov)
[1] "sdev"     "rotation" "center"   "scale"    "x"       

The center and scale components correspond to the means and standard deviations of the variables that were used for scaling prior to implementing PCA.

pca_cor$center
  Murder  Assault UrbanPop     Rape 
   7.788  170.760   65.540   21.232 
pca_cor$scale
  Murder  Assault UrbanPop     Rape 
   4.356   83.338   14.475    9.366 

The rotation matrix provides the principal component loadings; each column of pca$rotation contains the corresponding principal component loading vector. Recall that the principal components are only unique up to a sign change.

pca_cor$rotation
             PC1     PC2     PC3      PC4
Murder   -0.5359 -0.4182  0.3412  0.64923
Assault  -0.5832 -0.1880  0.2681 -0.74341
UrbanPop -0.2782  0.8728  0.3780  0.13388
Rape     -0.5434  0.1673 -0.8178  0.08902

If we failed to scale the variables before performing PCA, then most of the principal components that we observed would be driven by the Assault variable, since it has by far the largest mean and variance. Thus, it is important to standardize the variables to have mean zero and standard deviation one before performing PCA.

library(ggplot2)
library(reshape2)
library(patchwork)

# choose number of PCs

# prepare function
make_df <- function(pca, label) {
  loadings <- pca$rotation
  df <- melt(loadings)
  colnames(df) <- c("Variable", "PC", "Loading")
  df$Method <- label
  df
}

df_cov <- make_df(pca_cov, "Covariance PCA")
df_cor <- make_df(pca_cor, "Correlation PCA")

df_all <- rbind(df_cov, df_cor)

# keep ordering
df_all$Variable <- factor(df_all$Variable,
                         levels = rownames(pca_cov$rotation))

# plot
p <- ggplot(df_all, aes(x = PC, y = Variable, fill = Loading)) +
  geom_tile() +
  scale_fill_gradient2(midpoint = 0) +
  facet_wrap(~Method, nrow = 1) +
  theme_minimal() +
  labs(title = "PCA Loadings: Covariance vs Correlation",
       x = "Principal Component",
       y = "Variable")

p

Scree plots (side-by-side)

The prcomp() function also outputs the standard deviation of each principal component. On the USArrests data set, we can access these standard deviations as follows:

pca_cor$sdev
[1] 1.5749 0.9949 0.5971 0.4164

The variance explained by each principal component is obtained by squaring these. The scree plot shows these variances. We see below that without scaling, most of the variability is explained by the first principal component, which is essentially the Aasult variable.

par(mfrow = c(1, 2))
screeplot(pca_cov, type = "l", main = "Scree: covariance PCA")
screeplot(pca_cor, type = "l", main = "Scree: correlation PCA")

Explained variance and cumulative PVE

pve_cov <- (pca_cov$sdev^2) / sum(pca_cov$sdev^2)
pve_cor <- (pca_cor$sdev^2) / sum(pca_cor$sdev^2)

pve_cor
[1] 0.62006 0.24744 0.08914 0.04336
par(mfrow = c(1, 2))

plot(cumsum(pve_cov), type = "b", pch = 19,
     xlab = "Principal Component",
     ylab = "Cumulative PVE",
     main = "Cumulative PVE (Covariance PCA)",
     ylim = c(0, 1))
abline(h = 0.8, col = "red", lty = 2)
abline(h = 0.9, col = "blue", lty = 2)

plot(cumsum(pve_cor), type = "b", pch = 19,
     xlab = "Principal Component",
     ylab = "Cumulative PVE",
     main = "Cumulative PVE (Correlation PCA)",
     ylim = c(0, 1))
abline(h = 0.8, col = "red", lty = 2)
abline(h = 0.9, col = "blue", lty = 2)

par(mfrow = c(1, 1))

After scaling, we see that the first principal component explains \(62.0 \%\) of the variance in the data, the next principal component explains \(24.7 \%\) of the variance, and so forth.

Number of PCs: Kaiser rule and parallel analysis (scaled PCA)

# Kaiser rule: keep eigenvalues > 1 for correlation-based PCA
screeplot(pca_cor, type = "l", main = "Scree: correlation PCA")
abline(h = 1, col = "red", lty = 2)

## Optional: annotate
text(2, 1.1, "Kaiser threshold = 1", col = "red")

# Parallel analysis
set.seed(24620)
B <- 500
n <- nrow(X)
p <- ncol(X)
eig <- pca_cor$sdev^2

rand_eigs <- replicate(B, {
  X_rand <- matrix(rnorm(n * p), nrow = n, ncol = p)
  eigen(cor(X_rand), symmetric = TRUE, only.values = TRUE)$values
})

rand_mean <- rowMeans(rand_eigs)
rand_95 <- apply(rand_eigs, 1, quantile, probs = 0.95)

pa_cutoff <- apply(rand_eigs, 1, quantile, probs = 0.95)

plot(eig, type = "b", pch = 19,
     xlab = "Principal Component",
     ylab = "Eigenvalue",
     main = "Parallel Analysis")
lines(rand_mean, type = "b", pch = 17, lty = 1, col = 2)
lines(rand_95, lty = 2, col = 4)

Loadings interpretation (scaled PCA)

round(pca_cor$rotation[, 1:2], 3)
            PC1    PC2
Murder   -0.536 -0.418
Assault  -0.583 -0.188
UrbanPop -0.278  0.873
Rape     -0.543  0.167
par(mfrow = c(1, 2))
barplot(pca_cor$rotation[, 1], las = 2, main = "Loadings: PC1", col = "steelblue")
barplot(pca_cor$rotation[, 2], las = 2, main = "Loadings: PC2", col = "tomato")

Score map (states in first two PCs)

The score matrix is stored in pca$x. The \(k\)th column is the \(k\)th principal component score vector.

scores <- pca_cor$x[, 1:2]
plot(scores, type = "n", xlab = "PC1", ylab = "PC2",
     main = "USArrests states projected to (PC1, PC2)")
text(scores, labels = rownames(X), cex = 0.65, col = "gray30")
abline(h = 0, v = 0, lty = 2, col = "gray70")

Score map + loadings

We can have a more informative visualization by adding the loading directions

biplot(pca_cor, scale = 0, cex = 0.7)

Orthogonality check

round(cov(pca_cor$x), 3)
     PC1  PC2   PC3   PC4
PC1 2.48 0.00 0.000 0.000
PC2 0.00 0.99 0.000 0.000
PC3 0.00 0.00 0.357 0.000
PC4 0.00 0.00 0.000 0.173

Wrap-up

LS0tCnRpdGxlOiAiTGVjdHVyZSAzIFBDQSBMaXZlIERlbW86IFVTQXJyZXN0cyIKYXV0aG9yOiAiU1RBVCAyNDYyMCA9IFNUQVQgMzI5MDAsIEZJTk0gMzQ3MDAiCmRhdGU6ICJTcHJpbmcgMjAyNiIKb3V0cHV0OiBodG1sX25vdGVib29rCi0tLQoKCmBgYHtyIHNldHVwLCBpbmNsdWRlPUZBTFNFfQprbml0cjo6b3B0c19jaHVuayRzZXQoZmlnLndpZHRoID0gNywgZmlnLmhlaWdodCA9IDQuMiwgbWVzc2FnZSA9IEZBTFNFLCB3YXJuaW5nID0gRkFMU0UpCm9wdGlvbnMoZGlnaXRzID0gNCkKYGBgCgoKIyMgRGF0YSBhbmQgcXVpY2sgZXhwbG9yYXRpb24KCldlIHBlcmZvcm0gUENBIG9uIHRoZSBgVVNBcnJlc3RzYCBkYXRhIHNldCwgd2hpY2ggaXMgcGFydCBvZiB0aGUgYmFzZSBgUmAgcGFja2FnZS4gVGhpcyBkYXRhIHNldCBjb250YWlucyBzdGF0aXN0aWNzLCBpbiBhcnJlc3RzIHBlciAxMDAsMDAwIHJlc2lkZW50cyBmb3IgYXNzYXVsdCwgbXVyZGVyLCBhbmQgcmFwZSBpbiBlYWNoIG9mIHRoZSA1MCBVUyBzdGF0ZXMgaW4gMTk3My4gQWxzbyBnaXZlbiBpcyB0aGUgcGVyY2VudCBvZiB0aGUgcG9wdWxhdGlvbiBsaXZpbmcgaW4gdXJiYW4gYXJlYXMuCgpUaGUgcm93cyBvZiB0aGUgZGF0YSBzZXQgY29udGFpbiB0aGUgNTAgc3RhdGVzLCBpbiBhbHBoYWJldGljYWwgb3JkZXIuCgpgYGB7ciBjaHVuazF9CmRhdGEoVVNBcnJlc3RzKQpYIDwtIFVTQXJyZXN0cwpzdGF0ZXMgPC0gcm93Lm5hbWVzKFgpCnN0YXRlcwpgYGAKClRoZSBjb2x1bW5zIG9mIHRoZSBkYXRhIHNldCBjb250YWluIHRoZSBmb3VyIHZhcmlhYmxlcy4KCmBgYHtyIGNodW5rMn0Kc3VtbWFyeShYKQpgYGAKCiMjIFZpc3VhbCBleHBsb3JhdGlvbiAKCldlIGZpcnN0IGJyaWVmbHkgZXhhbWluZSB0aGUgZGF0YS4gU29tZSB2YXJpYWJsZXMgYXJlIGhpZ2hseSBjb3JyZWxhdGVkLgoKYGBge3J9CnBhaXJzKFgsIHBjaCA9IDE5LCBjb2wgPSAiZ3JheTQwIiwgbWFpbiA9ICJVU0FycmVzdHM6IHBhaXJ3aXNlIHNjYXR0ZXJwbG90cyIpCmBgYAoKVGhlIHZhcmlhYmxlcyBhbHNvIGhhdmUgdmFzdGx5IGRpZmZlcmVudCBtZWFucyBhbmQgdmFyaWFuY2VzLiBXZSBzZWUgdGhhdCB0aGVyZSBhcmUgb24gYXZlcmFnZSB0aHJlZSB0aW1lcyBhcyBtYW55IHJhcGVzIGFzIG11cmRlcnMsIGFuZCBtb3JlIHRoYW4gZWlnaHQgdGltZXMgYXMgbWFueSBhc3NhdWx0cyBhcyByYXBlcy4gVGhlIGBVcmJhblBvcGAgdmFyaWFibGUgbWVhc3VyZXMgdGhlIHBlcmNlbnRhZ2Ugb2YgdGhlIHBvcHVsYXRpb24gaW4gZWFjaCBzdGF0ZSBsaXZpbmcgaW4gYW4gdXJiYW4gYXJlYSwgd2hpY2ggaXMgbm90IGEgY29tcGFyYWJsZSBudW1iZXIgdG8gdGhlIG51bWJlciBvZiByYXBlcyBpbiBlYWNoIHN0YXRlIHBlciAxMDAsMDAwIGluZGl2aWR1YWxzLgpgYGB7cn0KYm94cGxvdChYLCBjb2wgPSBjKCIjN0ZCM0Q1IiwgIiNGMTk0OEEiLCAiIzgyRTBBQSIsICIjRjdEQzZGIiksCiAgICAgICAgbWFpbiA9ICJEaWZmZXJlbnQgdmFyaWFibGUgc2NhbGVzIikKYGBgCgojIyBQQ0E6IG5vIHNjYWxpbmcgdnMgc2NhbGluZwoKV2Ugbm93IHBlcmZvcm0gcHJpbmNpcGFsIGNvbXBvbmVudHMgYW5hbHlzaXMgdXNpbmcgdGhlIGBwcmNvbXAoKWAgZnVuY3Rpb24sIHdoaWNoIGlzIG9uZSBvZiBzZXZlcmFsIGZ1bmN0aW9ucyBpbiBgUmAgdGhhdCBwZXJmb3JtIFBDQS4KCkJ5IGRlZmF1bHQsIHRoZSBgcHJjb21wKClgIGZ1bmN0aW9uIGNlbnRlcnMgdGhlIHZhcmlhYmxlcyB0byBoYXZlIG1lYW4gemVybyAod2UgaGVyZSBhZGQgYGNlbnRlciA9IFRSVUVgIHRvIGV4cGxpY2l0bHkgY29uZmlybSBjZW50ZXJpbmcpLiBCeSB1c2luZyB0aGUgb3B0aW9uIGBzY2FsZSA9IFRSVUVgLCB3ZSBzY2FsZSB0aGUgdmFyaWFibGVzIHRvIGhhdmUgc3RhbmRhcmQgZGV2aWF0aW9uIG9uZS4gVGhlIG91dHB1dCBmcm9tIGBwcmNvbXAoKWAgY29udGFpbnMgYSBudW1iZXIgb2YgdXNlZnVsIHF1YW50aXRpZXMuCgoKYGBge3J9CnBjYV9jb3YgPC0gcHJjb21wKFgsIGNlbnRlciA9IFRSVUUsIHNjYWxlLiA9IEZBTFNFKQpwY2FfY29yIDwtIHByY29tcChYLCBjZW50ZXIgPSBUUlVFLCBzY2FsZS4gPSBUUlVFKQoKbmFtZXMocGNhX2NvdikKYGBgCgpUaGUgYGNlbnRlcmAgYW5kIGBzY2FsZWAgY29tcG9uZW50cyBjb3JyZXNwb25kIHRvIHRoZSBtZWFucyBhbmQgc3RhbmRhcmQgZGV2aWF0aW9ucyBvZiB0aGUgdmFyaWFibGVzIHRoYXQgd2VyZSB1c2VkIGZvciBzY2FsaW5nIHByaW9yIHRvIGltcGxlbWVudGluZyBQQ0EuCgpgYGB7cn0KcGNhX2NvciRjZW50ZXIKcGNhX2NvciRzY2FsZQpgYGAKClRoZSBgcm90YXRpb25gIG1hdHJpeCBwcm92aWRlcyB0aGUgcHJpbmNpcGFsIGNvbXBvbmVudCBsb2FkaW5nczsgZWFjaCBjb2x1bW4gb2YgYHBjYSRyb3RhdGlvbmAgY29udGFpbnMgdGhlIGNvcnJlc3BvbmRpbmcgcHJpbmNpcGFsIGNvbXBvbmVudCBsb2FkaW5nIHZlY3Rvci4gUmVjYWxsIHRoYXQgdGhlIHByaW5jaXBhbCBjb21wb25lbnRzIGFyZSBvbmx5IHVuaXF1ZSB1cCB0byBhIHNpZ24gY2hhbmdlLgoKYGBge3J9CnBjYV9jb3Ikcm90YXRpb24KYGBgCgpJZiB3ZSBmYWlsZWQgdG8gc2NhbGUgdGhlIHZhcmlhYmxlcyBiZWZvcmUgcGVyZm9ybWluZyBQQ0EsIHRoZW4gbW9zdCBvZiB0aGUgcHJpbmNpcGFsIGNvbXBvbmVudHMgdGhhdCB3ZSBvYnNlcnZlZCB3b3VsZCBiZSBkcml2ZW4gYnkgdGhlIGBBc3NhdWx0YCB2YXJpYWJsZSwgc2luY2UgaXQgaGFzIGJ5IGZhciB0aGUgbGFyZ2VzdCBtZWFuIGFuZCB2YXJpYW5jZS4KVGh1cywgaXQgaXMgaW1wb3J0YW50IHRvIHN0YW5kYXJkaXplIHRoZSB2YXJpYWJsZXMgdG8gaGF2ZSBtZWFuIHplcm8gYW5kIHN0YW5kYXJkIGRldmlhdGlvbiBvbmUgYmVmb3JlIHBlcmZvcm1pbmcgUENBLgoKYGBge3J9CmxpYnJhcnkoZ2dwbG90MikKbGlicmFyeShyZXNoYXBlMikKbGlicmFyeShwYXRjaHdvcmspCgojIGNob29zZSBudW1iZXIgb2YgUENzCgojIHByZXBhcmUgZnVuY3Rpb24KbWFrZV9kZiA8LSBmdW5jdGlvbihwY2EsIGxhYmVsKSB7CiAgbG9hZGluZ3MgPC0gcGNhJHJvdGF0aW9uCiAgZGYgPC0gbWVsdChsb2FkaW5ncykKICBjb2xuYW1lcyhkZikgPC0gYygiVmFyaWFibGUiLCAiUEMiLCAiTG9hZGluZyIpCiAgZGYkTWV0aG9kIDwtIGxhYmVsCiAgZGYKfQoKZGZfY292IDwtIG1ha2VfZGYocGNhX2NvdiwgIkNvdmFyaWFuY2UgUENBIikKZGZfY29yIDwtIG1ha2VfZGYocGNhX2NvciwgIkNvcnJlbGF0aW9uIFBDQSIpCgpkZl9hbGwgPC0gcmJpbmQoZGZfY292LCBkZl9jb3IpCgojIGtlZXAgb3JkZXJpbmcKZGZfYWxsJFZhcmlhYmxlIDwtIGZhY3RvcihkZl9hbGwkVmFyaWFibGUsCiAgICAgICAgICAgICAgICAgICAgICAgICBsZXZlbHMgPSByb3duYW1lcyhwY2FfY292JHJvdGF0aW9uKSkKCiMgcGxvdApwIDwtIGdncGxvdChkZl9hbGwsIGFlcyh4ID0gUEMsIHkgPSBWYXJpYWJsZSwgZmlsbCA9IExvYWRpbmcpKSArCiAgZ2VvbV90aWxlKCkgKwogIHNjYWxlX2ZpbGxfZ3JhZGllbnQyKG1pZHBvaW50ID0gMCkgKwogIGZhY2V0X3dyYXAofk1ldGhvZCwgbnJvdyA9IDEpICsKICB0aGVtZV9taW5pbWFsKCkgKwogIGxhYnModGl0bGUgPSAiUENBIExvYWRpbmdzOiBDb3ZhcmlhbmNlIHZzIENvcnJlbGF0aW9uIiwKICAgICAgIHggPSAiUHJpbmNpcGFsIENvbXBvbmVudCIsCiAgICAgICB5ID0gIlZhcmlhYmxlIikKCnAKYGBgCgoKCgojIyBTY3JlZSBwbG90cyAoc2lkZS1ieS1zaWRlKQoKVGhlIGBwcmNvbXAoKWAgZnVuY3Rpb24gYWxzbyBvdXRwdXRzIHRoZSBzdGFuZGFyZCBkZXZpYXRpb24gb2YgZWFjaCBwcmluY2lwYWwgY29tcG9uZW50LiBPbiB0aGUgYFVTQXJyZXN0c2AgZGF0YSBzZXQsIHdlIGNhbiBhY2Nlc3MgdGhlc2Ugc3RhbmRhcmQgZGV2aWF0aW9ucyBhcyBmb2xsb3dzOgoKYGBge3J9CnBjYV9jb3Ikc2RldgpgYGAKCgpUaGUgdmFyaWFuY2UgZXhwbGFpbmVkIGJ5IGVhY2ggcHJpbmNpcGFsIGNvbXBvbmVudCBpcyBvYnRhaW5lZCBieSBzcXVhcmluZyB0aGVzZS4gVGhlIHNjcmVlIHBsb3Qgc2hvd3MgdGhlc2UgdmFyaWFuY2VzLiBXZSBzZWUgYmVsb3cgdGhhdCB3aXRob3V0IHNjYWxpbmcsIG1vc3Qgb2YgdGhlIHZhcmlhYmlsaXR5IGlzIGV4cGxhaW5lZCBieSB0aGUgZmlyc3QgcHJpbmNpcGFsIGNvbXBvbmVudCwgd2hpY2ggaXMgZXNzZW50aWFsbHkgdGhlIGBBYXN1bHRgIHZhcmlhYmxlLgoKYGBge3J9CnBhcihtZnJvdyA9IGMoMSwgMikpCnNjcmVlcGxvdChwY2FfY292LCB0eXBlID0gImwiLCBtYWluID0gIlNjcmVlOiBjb3ZhcmlhbmNlIFBDQSIpCnNjcmVlcGxvdChwY2FfY29yLCB0eXBlID0gImwiLCBtYWluID0gIlNjcmVlOiBjb3JyZWxhdGlvbiBQQ0EiKQpgYGAKCgoKCiMjIEV4cGxhaW5lZCB2YXJpYW5jZSBhbmQgY3VtdWxhdGl2ZSBQVkUKCmBgYHtyfQpwdmVfY292IDwtIChwY2FfY292JHNkZXZeMikgLyBzdW0ocGNhX2NvdiRzZGV2XjIpCnB2ZV9jb3IgPC0gKHBjYV9jb3Ikc2Rldl4yKSAvIHN1bShwY2FfY29yJHNkZXZeMikKCnB2ZV9jb3IKCnBhcihtZnJvdyA9IGMoMSwgMikpCgpwbG90KGN1bXN1bShwdmVfY292KSwgdHlwZSA9ICJiIiwgcGNoID0gMTksCiAgICAgeGxhYiA9ICJQcmluY2lwYWwgQ29tcG9uZW50IiwKICAgICB5bGFiID0gIkN1bXVsYXRpdmUgUFZFIiwKICAgICBtYWluID0gIkN1bXVsYXRpdmUgUFZFIChDb3ZhcmlhbmNlIFBDQSkiLAogICAgIHlsaW0gPSBjKDAsIDEpKQphYmxpbmUoaCA9IDAuOCwgY29sID0gInJlZCIsIGx0eSA9IDIpCmFibGluZShoID0gMC45LCBjb2wgPSAiYmx1ZSIsIGx0eSA9IDIpCgpwbG90KGN1bXN1bShwdmVfY29yKSwgdHlwZSA9ICJiIiwgcGNoID0gMTksCiAgICAgeGxhYiA9ICJQcmluY2lwYWwgQ29tcG9uZW50IiwKICAgICB5bGFiID0gIkN1bXVsYXRpdmUgUFZFIiwKICAgICBtYWluID0gIkN1bXVsYXRpdmUgUFZFIChDb3JyZWxhdGlvbiBQQ0EpIiwKICAgICB5bGltID0gYygwLCAxKSkKYWJsaW5lKGggPSAwLjgsIGNvbCA9ICJyZWQiLCBsdHkgPSAyKQphYmxpbmUoaCA9IDAuOSwgY29sID0gImJsdWUiLCBsdHkgPSAyKQoKcGFyKG1mcm93ID0gYygxLCAxKSkKYGBgCgpBZnRlciBzY2FsaW5nLCB3ZSBzZWUgdGhhdCB0aGUgZmlyc3QgcHJpbmNpcGFsIGNvbXBvbmVudCBleHBsYWlucyAkNjIuMCBcJSQgb2YgdGhlIHZhcmlhbmNlIGluIHRoZSBkYXRhLCB0aGUgbmV4dCBwcmluY2lwYWwgY29tcG9uZW50IGV4cGxhaW5zICQyNC43IFwlJCBvZiB0aGUgdmFyaWFuY2UsIGFuZCBzbyBmb3J0aC4KCgojIyBOdW1iZXIgb2YgUENzOiBLYWlzZXIgcnVsZSBhbmQgcGFyYWxsZWwgYW5hbHlzaXMgKHNjYWxlZCBQQ0EpCgpgYGB7cn0KIyBLYWlzZXIgcnVsZToga2VlcCBlaWdlbnZhbHVlcyA+IDEgZm9yIGNvcnJlbGF0aW9uLWJhc2VkIFBDQQpzY3JlZXBsb3QocGNhX2NvciwgdHlwZSA9ICJsIiwgbWFpbiA9ICJTY3JlZTogY29ycmVsYXRpb24gUENBIikKYWJsaW5lKGggPSAxLCBjb2wgPSAicmVkIiwgbHR5ID0gMikKCiMjIE9wdGlvbmFsOiBhbm5vdGF0ZQp0ZXh0KDIsIDEuMSwgIkthaXNlciB0aHJlc2hvbGQgPSAxIiwgY29sID0gInJlZCIpCmBgYAoKYGBge3J9CiMgUGFyYWxsZWwgYW5hbHlzaXMKc2V0LnNlZWQoMjQ2MjApCkIgPC0gNTAwCm4gPC0gbnJvdyhYKQpwIDwtIG5jb2woWCkKZWlnIDwtIHBjYV9jb3Ikc2Rldl4yCgpyYW5kX2VpZ3MgPC0gcmVwbGljYXRlKEIsIHsKICBYX3JhbmQgPC0gbWF0cml4KHJub3JtKG4gKiBwKSwgbnJvdyA9IG4sIG5jb2wgPSBwKQogIGVpZ2VuKGNvcihYX3JhbmQpLCBzeW1tZXRyaWMgPSBUUlVFLCBvbmx5LnZhbHVlcyA9IFRSVUUpJHZhbHVlcwp9KQoKcmFuZF9tZWFuIDwtIHJvd01lYW5zKHJhbmRfZWlncykKcmFuZF85NSA8LSBhcHBseShyYW5kX2VpZ3MsIDEsIHF1YW50aWxlLCBwcm9icyA9IDAuOTUpCgpwYV9jdXRvZmYgPC0gYXBwbHkocmFuZF9laWdzLCAxLCBxdWFudGlsZSwgcHJvYnMgPSAwLjk1KQoKcGxvdChlaWcsIHR5cGUgPSAiYiIsIHBjaCA9IDE5LAogICAgIHhsYWIgPSAiUHJpbmNpcGFsIENvbXBvbmVudCIsCiAgICAgeWxhYiA9ICJFaWdlbnZhbHVlIiwKICAgICBtYWluID0gIlBhcmFsbGVsIEFuYWx5c2lzIikKbGluZXMocmFuZF9tZWFuLCB0eXBlID0gImIiLCBwY2ggPSAxNywgbHR5ID0gMSwgY29sID0gMikKbGluZXMocmFuZF85NSwgbHR5ID0gMiwgY29sID0gNCkKYGBgCgoKLSBLYWlzZXIgc3VnZ2VzdHMga2VlcGluZyBgMWAgUEMuCi0gUGFyYWxsZWwgYW5hbHlzaXMgKDk1JSBiYXNlbGluZSkgYWxzbyBzdWdnZXN0cyBrZWVwaW5nIGAxYCBQQy4KCgojIyBMb2FkaW5ncyBpbnRlcnByZXRhdGlvbiAoc2NhbGVkIFBDQSkKCmBgYHtyfQpyb3VuZChwY2FfY29yJHJvdGF0aW9uWywgMToyXSwgMykKcGFyKG1mcm93ID0gYygxLCAyKSkKYmFycGxvdChwY2FfY29yJHJvdGF0aW9uWywgMV0sIGxhcyA9IDIsIG1haW4gPSAiTG9hZGluZ3M6IFBDMSIsIGNvbCA9ICJzdGVlbGJsdWUiKQpiYXJwbG90KHBjYV9jb3Ikcm90YXRpb25bLCAyXSwgbGFzID0gMiwgbWFpbiA9ICJMb2FkaW5nczogUEMyIiwgY29sID0gInRvbWF0byIpCmBgYAoKLSBQQzEgcmVwcmVzZW50cyBvdmVyYWxsIGNyaW1pbmFsIGxldmVsLgotIFBDMiBjb250cmFzdHMgdXJiYW5pemF0aW9uIHdpdGggdmlvbGVudCBjcmltZSwgZXNwZWNpYWxseSBtdXJkZXIuCgojIyBTY29yZSBtYXAgKHN0YXRlcyBpbiBmaXJzdCB0d28gUENzKQoKVGhlIHNjb3JlIG1hdHJpeCBpcyBzdG9yZWQgaW4gYHBjYSR4YC4gVGhlICRrJHRoIGNvbHVtbiBpcyB0aGUgJGskdGggcHJpbmNpcGFsIGNvbXBvbmVudCBzY29yZSB2ZWN0b3IuCgoKYGBge3J9CnNjb3JlcyA8LSBwY2FfY29yJHhbLCAxOjJdCnBsb3Qoc2NvcmVzLCB0eXBlID0gIm4iLCB4bGFiID0gIlBDMSIsIHlsYWIgPSAiUEMyIiwKICAgICBtYWluID0gIlVTQXJyZXN0cyBzdGF0ZXMgcHJvamVjdGVkIHRvIChQQzEsIFBDMikiKQp0ZXh0KHNjb3JlcywgbGFiZWxzID0gcm93bmFtZXMoWCksIGNleCA9IDAuNjUsIGNvbCA9ICJncmF5MzAiKQphYmxpbmUoaCA9IDAsIHYgPSAwLCBsdHkgPSAyLCBjb2wgPSAiZ3JheTcwIikKYGBgCgojIyBTY29yZSBtYXAgKyBsb2FkaW5ncwoKV2UgY2FuIGhhdmUgYSBtb3JlIGluZm9ybWF0aXZlIHZpc3VhbGl6YXRpb24gYnkgYWRkaW5nIHRoZSBsb2FkaW5nIGRpcmVjdGlvbnMKYGBge3IsIGZpZy53aWR0aD04LCBmaWcuaGVpZ2h0PTZ9CmJpcGxvdChwY2FfY29yLCBzY2FsZSA9IDAsIGNleCA9IDAuNykKYGBgCi0gUEMxIHJlcHJlc2VudHMgb3ZlcmFsbCBjcmltaW5hbCBsZXZlbCwgc3RhdGVzIG9uIHRoZSBsZWZ0IGhhdmUgaGlnaCBjcmltaW5hbCByYXRlcyBvdmVyYWxsLgotIFBDMiBjb250cmFzdHMgdXJiYW5pemF0aW9uIHdpdGggdmlvbGVudCBjcmltZTogc3RhdGVzIGF0IHRoZSB0b3AgYXJlIG1vcmUgdXJiYW4sIHdoaWxlIHRob3NlIGF0IHRoZSBib3R0b20gdGVuZCB0byBoYXZlIGhpZ2hlciBtdXJkZXIgcmF0ZXMgcmVsYXRpdmUgdG8gdGhlaXIgb3ZlcmFsbCBjcmltZSBwcm9maWxlLgoKCgojIyBPcnRob2dvbmFsaXR5IGNoZWNrCgpgYGB7cn0Kcm91bmQoY292KHBjYV9jb3IkeCksIDMpCmBgYAoKCgojIyBXcmFwLXVwCgotIFNjYWxpbmcgbWF0dGVycyBoZXJlIGJlY2F1c2UgdmFyaWFibGUgbWFnbml0dWRlcyBhcmUgdmVyeSBkaWZmZXJlbnQuCi0gVXNlIG11bHRpcGxlIGNyaXRlcmlhIHRvIGNob29zZSBQQ3MgKHNjcmVlL1BWRSArIEthaXNlciArIHBhcmFsbGVsIGFuYWx5c2lzKS4KLSBJbnRlcnByZXQgbG9hZGluZ3MgKHZhcmlhYmxlcykgdG9nZXRoZXIgd2l0aCBzY29yZSBtYXAgKHN0YXRlcykuCg==