4 Nonparametric Comparision of Survival Distributions

4.1 Comparing Two Groups of Survival Times

  • two-side test vs one-sided test
  • two-sample Students t-test
  • rank-based Mann-Whitney test

For survival analysis, nonparametric tests for

  • \(H_0:S_1(t)=S_0(t)\)
  • one-side: \(H_A:S_1(t)>S_0(t)\)
  • two-side: \(H_A: S_1(t) \neq S_0(t)\)

The relationship between \(S_1(t)\) and \(S_0(t)\) can be different with differnt \(t\)

Lehman alternation: \(H_A: S_1(t)=[S_0(t)]^\psi\), which is equivalent to \(h_1(t)=\psi h_0(t)\). Then the one sided test would be \(H_0:\psi=1\) vs \(H_1: \psi<1\).

Construct a two-by-two table for each failure time \(t_i\), with \(n_{0i}\) and \(n_{1i}\) being the numbers at risk in group 1 and 2, and \(d_{0i}\) and \(d_{1i}\) being the number of failures in group 1 and 2. Based on hypergeometric distribution, we can have

\[p(d_{0i}|n_{0i},n_{1i},d_i)=\frac{\binom{n_{0i}}{d_{0i}}\binom{n_{1i}}{d_{1i}}}{\binom{n_i}{d_i}}\] , where \[\binom{n}{d}=\frac{n!}{d!(n-d)!}\]

The expected mean \(e_{0i}\) and variance \(v_{0i}\) can be given by

\[e_{0i}=E(d_{0i})=\frac{n_{0i}d_i}{n_i}\]

\[v_{0i}=var(d_{0i})=\frac{n_{0i}n_{1i}d_i(n_i-d_i)}{n_i^2(n_i-1)}\]

We can sum the differences between the expected and observed values to get the test statistics \(U_0\) and its variance V_0

\[U_0=\sum_{i=1}^D(d_{0i}-e_{0i})=\sum d_{0i} - \sum e_{0i}\]

\[V_0=var(U_0)=\sum v_{0i}\]

Then the test statistic is

\[\frac{U_0}{\sqrt{V_0}} \sim N(0,1)\] , or

\[\frac{U_0^2}{V_0} \sim \chi^2_1\]

This test is known as the log-rank test.

library(survival)
tt = c(6, 7, 10, 15, 19, 25)
delta = c(1, 0, 1, 1, 0, 1)
trt = c(0, 0, 1, 0, 1, 1)  ## group 0 or 1
survdiff(Surv(tt, delta)~trt)
## Call:
## survdiff(formula = Surv(tt, delta) ~ trt)
## 
##       N Observed Expected (O-E)^2/E (O-E)^2/V
## trt=0 3        2     1.08     0.776      1.27
## trt=1 3        2     2.92     0.288      1.27
## 
##  Chisq= 1.3  on 1 degrees of freedom, p= 0.3

A full table is

The log-rank statistic is identical to Cochran-Mantel-Haenzel test in epidemiology, and may also be derived from the proportional hazards model.

A generalization is to define a weighted log-rank test using weights \(w_i\) for D time points,

\[U_0(w)=\sum w_i(d_{0i}-e_{0i})\]

\[var(U_0)=\sum w_i^2v_{0i}=V_0(w)\]

The most common way of setting weights is to sue the product-limit estimator from the combined sample

\[w_i=\{\hat{S}(t_i)\}^\rho\]

A log-rank test using these weights is called the Fleming-Harrington \(G(\rho)\) test. If \(\rho=0\), this test is equivalent to the log-rank test. If \(\rho=1\), the test is called Prentice modification/Peto-Peto modification of the Gehan-Wilcoxon test, which place higher weight on the earlier survival differences.

Back to Example 1.3

library(asaur)
attach(pancreatic)
Progression.d = as.Date(as.character(progression), "%m/%d/%Y")
OnStudy.d = as.Date(as.character(onstudy), "%m/%d/%Y")
Death.d = as.Date(as.character(death), "%m/%d/%Y")

progressionOnly = Progression.d - OnStudy.d
overallSurvival = Death.d - OnStudy.d
pfs = progressionOnly
pfs[is.na(pfs)] = overallSurvival[is.na(pfs)] ### PFS: progression or death, whichever comes first

pfs.month = pfs/30.5
plot(survfit(Surv(pfs.month) ~ stage), xlab="Time in months", ylab="Survival probability",
    col=c("blue", "red"), lwd=2)
legend("topright", legend=c("Locally advanced", "Metastatic"), col=c("blue","red") , lwd=2)

survdiff(Surv(pfs) ~ stage, rho=0) ### log-rank test
## Call:
## survdiff(formula = Surv(pfs) ~ stage, rho = 0)
## 
##           N Observed Expected (O-E)^2/E (O-E)^2/V
## stage=LA  8        8     12.3      1.49      2.25
## stage=M  33       33     28.7      0.64      2.25
## 
##  Chisq= 2.2  on 1 degrees of freedom, p= 0.1
survdiff(Surv(pfs) ~ stage, rho=1) ### Prentice modification
## Call:
## survdiff(formula = Surv(pfs) ~ stage, rho = 1)
## 
##           N Observed Expected (O-E)^2/E (O-E)^2/V
## stage=LA  8     2.34     5.88     2.128      4.71
## stage=M  33    18.76    15.22     0.822      4.71
## 
##  Chisq= 4.7  on 1 degrees of freedom, p= 0.03

4.2 Stratified Tests

If we need to compare two groups while adjusting for another covariate, we can

  • include the other covarite (or multiple covarites) as regression terms for the hazard function (following chapters)
  • use stratified log-rank test, if the covariate we are adjusting for is categorical with a small number of levels \(G\)

The p value is 0.00299 for Prentice modification test, but 0.134 for log-rank test. From Figure, we can see the metastatic group shows an early survival advantage over the locally advanced group, but the survival curves converge after about 10 months.