A randomized clinical trial conducted by the Northern California Oncology Group (NCOG) compared two treatments for head and neck cancer: chemotherapy (Arm A of the trial, \(n = 51\) patients) and chemotherapy plus radiation (Arm B, \(n = 45\) patients). The data records the survival time in number of days past treatment for each patient. The variable \(d = 0\) represents that the patient is still alive on their final day of observation.
ncog <- read.table("https://web.stanford.edu/~hastie/CASI_files/DATA/ncog.txt", header = T)
ncog
library(survival)
km <- with(ncog, Surv(t, d))
km_fit <- survfit(Surv(t, d) ~ arm, data=ncog)
#summary(km_fit)
plot(km_fit, xlab="Days", main = 'Kaplan Meyer Plot', col = c("blue", "red"), conf.int = T)
This shows a pointwise \(0.95\) CI for the Kaplan Meier curve of each arm. We can have a more polished figure using ggplot2.
library(ggfortify)
## Loading required package: ggplot2
library(ggplot2)
autoplot(km_fit)
The plus signs represent the censored cases at a given time point.
surv_diff <- survdiff(Surv(t, d) ~ arm, data = ncog)
surv_diff
## Call:
## survdiff(formula = Surv(t, d) ~ arm, data = ncog)
##
## N Observed Expected (O-E)^2/E (O-E)^2/V
## arm=A 51 42 32.5 2.77 5.24
## arm=B 45 31 40.5 2.22 5.24
##
## Chisq= 5.2 on 1 degrees of freedom, p= 0.02
The function returns a list of components, including:
Here the p-value is \(0.02\), indicating a rejection of the null that the two survival functions are the same in the two arms.
cox <- coxph(Surv(t, d) ~ day + month + year + arm, data = ncog)
summary(cox)
## Call:
## coxph(formula = Surv(t, d) ~ day + month + year + arm, data = ncog)
##
## n= 96, number of events= 73
##
## coef exp(coef) se(coef) z Pr(>|z|)
## day 0.01219 1.01227 0.01471 0.829 0.4074
## month 0.01094 1.01100 0.03401 0.322 0.7477
## year 0.03593 1.03659 0.07496 0.479 0.6317
## armB -0.53677 0.58464 0.24601 -2.182 0.0291 *
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## exp(coef) exp(-coef) lower .95 upper .95
## day 1.0123 0.9879 0.9835 1.0419
## month 1.0110 0.9891 0.9458 1.0807
## year 1.0366 0.9647 0.8950 1.2006
## armB 0.5846 1.7105 0.3610 0.9469
##
## Concordance= 0.577 (se = 0.039 )
## Likelihood ratio test= 6.15 on 4 df, p=0.2
## Wald test = 6.02 on 4 df, p=0.2
## Score (logrank) test = 6.15 on 4 df, p=0.2
cox_fit <- survfit(cox)
autoplot(cox_fit, surv.colour = "red")
This is the predicted survival curve for a single “pseudo” subject with covariate values equal to the means of the data set (often not really make sense in practice).