We analyze again the dataset in HW1 Q2 which contains 12 observations from a (hypothetical) randomized experiment involving 12 patients recently diagnosed with type 2 diabetes. \(X\) is a measurement of glucose levels at the beginning of the experiment, and \(Y\) is a measurement of glucose levels at the end of the experiment. There are two treatments: taking a new drug (active treatment) or the standard drug (control treatment).
# read in data
Y0 <- c(105, 100, 101, 85, 94, 89)
Y1 <- c(89, 89, 82, 77, 83, 74)
Yobs = c(Y0,Y1)
X <- c(185,165,155,143,138,129,179,162,158,145,140,131)
W <- c(rep(0,6), rep(1,6))
n <- length(Yobs)
n.treat <- n.cont <- n/2
data <- data.frame(Yobs, X, W)
data
lm.result1 <- lm(Yobs ~ W, data = data)
## with the homoscedastic error assumption
sd <- summary(lm.result1)$coefficients["W", "Std. Error"]
est <- summary(lm.result1)$coefficients["W", "Estimate"]
## without the homoscedastic error assumption
library(sandwich)
## without the HC2 adjustment
vcovHC(lm.result1, type = "HC0")
(Intercept) W
(Intercept) 8.203704 -8.203704
W -8.203704 13.407407
sd1 <- sqrt(vcovHC(lm.result1, type = "HC0")[2, 2])
## with the HC2 adjustment
sd2 <- sqrt(vcovHC(lm.result1, type = "HC2")[2, 2])
result <- c(est, sd, sd1, sd2)
names(result) <- c("estimate", "Homoscedastic error", "Sandwich", "Sandwich with HC2")
print(result)
estimate Homoscedastic error Sandwich Sandwich with HC2
-13.333333 4.011096 3.661613 4.011096
## If we assume the conditional average treatment effect of the new drug to be the same across the pre-treatment glucose levels
lm.result2 <- lm(Yobs ~ W + X, data = data)
summary(lm.result2)
Call:
lm(formula = Yobs ~ W + X, data = data)
Residuals:
Min 1Q Median 3Q Max
-7.733 -1.779 -0.114 3.041 4.561
Coefficients:
Estimate Std. Error t value Pr(>|t|)
(Intercept) 48.57925 10.28295 4.724 0.001083 **
W -13.33333 2.29655 -5.806 0.000258 ***
X 0.30877 0.06658 4.637 0.001224 **
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
Residual standard error: 3.978 on 9 degrees of freedom
Multiple R-squared: 0.8598, Adjusted R-squared: 0.8287
F-statistic: 27.61 on 2 and 9 DF, p-value: 0.0001445
## with the homoscedastic error assumption
sd <- summary(lm.result2)$coefficients["W", "Std. Error"]
est <- summary(lm.result2)$coefficients["W", "Estimate"]
## without the homoscedastic error assumption
## without the HC2 adjustment
sd1 <- sqrt(vcovHC(lm.result2, type = "HC0")[2, 2])
## with the HC2 adjustment
sd2 <- sqrt(vcovHC(lm.result2, type = "HC2")[2, 2])
result <- c(est, sd, sd1, sd2)
names(result) <- c("estimate", "Homoscedastic error", "Sandwich", "Sandwich with HC2")
print(result)
estimate Homoscedastic error Sandwich Sandwich with HC2
-13.333333 2.296553 1.988873 2.223436
We see that add pre-treatment covariates can indeed increase our estimation accuracy.
## If we allow the conditional average treatment effect of the new drug to be heterogenous across the pre-treatment glucose levels
data$X <- data$X - mean(data$X)
lm.result3 <- lm(Yobs ~ W + X + W:X, data = data)
summary(lm.result3)
Call:
lm(formula = Yobs ~ W + X + W:X, data = data)
Residuals:
Min 1Q Median 3Q Max
-7.6720 -1.9108 -0.2594 3.1327 4.5453
Coefficients:
Estimate Std. Error t value Pr(>|t|)
(Intercept) 95.66667 1.72117 55.582 1.22e-11 ***
W -13.33333 2.43410 -5.478 0.000589 ***
X 0.31523 0.09263 3.403 0.009318 **
W:X -0.01540 0.14300 -0.108 0.916907
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
Residual standard error: 4.216 on 8 degrees of freedom
Multiple R-squared: 0.86, Adjusted R-squared: 0.8076
F-statistic: 16.39 on 3 and 8 DF, p-value: 0.0008897
## with the homoscedastic error assumption
sd <- summary(lm.result3)$coefficients["W", "Std. Error"]
est <- summary(lm.result3)$coefficients["W", "Estimate"]
## without the homoscedastic error assumption
## without the HC2 adjustment
sd1 <- sqrt(vcovHC(lm.result3, type = "HC0")[2, 2])
## with the HC2 adjustment
sd2 <- sqrt(vcovHC(lm.result3, type = "HC2")[2, 2])
result <- c(est, sd, sd1, sd2)
names(result) <- c("estimate", "Homoscedastic error", "Sandwich", "Sandwich with HC2")
print(result)
estimate Homoscedastic error Sandwich Sandwich with HC2
-13.333333 2.434099 1.987433 2.287243
Note that the p-values reported in R is not very accurate because we only have \(N=12\) samples. If \(N\) is not too small, we can approximate the distribution of our estimates by a Gaussian distribution, and obtain 95% CI by [est - 1.96sd, est + 1.96sd]