Parametric survival

The exponential distribution

Figure 1 was produced using the following code:

t <- seq(from = 1, to = 100, by = 1)
lambda = 0.25
ht <- lambda
Ht <- lambda * t
St <- exp(-lambda * t)

par(mfrow = c(1,3), pty = "s")
plot(t, rep(ht, times = length(t)), ylim = c(0, 1), lwd = 2, type = "s", xlab = "Time", ylab = "h(t)")
plot(t, Ht, ylim = c(0, 25), lwd = 2, type = "s", xlab = "Time", ylab = "H(t)")
plot(t, St, ylim = c(0, 1), lwd = 2, type = "s", xlab = "Time", ylab = "S(t)")

The Weibull distribution

Figure 2 was produced using the following code:

t <- seq(from = 1, to = 100, by = 1)
lambda = 0.25; p = 0.5
ht <- lambda * p * t^(p - 1)
Ht <- lambda * t^p
St <- exp(-lambda * t^p)

par(mfrow = c(1,3), pty = "s")
plot(t, ht, ylim = c(0, 0.5), lwd = 2, type = "s", xlab = "Time", ylab = "h(t)")
plot(t, Ht, ylim = c(0, 5), lwd = 2, type = "s", xlab = "Time", ylab = "H(t)")
plot(t, St, ylim = c(0, 1), lwd = 2, type = "s", xlab = "Time", ylab = "S(t)")

Plots of hazard using different values of lambda and p:

t <- seq(from = 0, to = 10, by = 0.1)
lambda <- 1; p05 <- 0.5; p10 <- 1.0; p15 <- 1.5; p30 <- 3.0
h05 <- lambda * p05 * (lambda * t)^(p05 - 1)
h10 <- lambda * p10 * (lambda * t)^(p10 - 1)
h15 <- lambda * p15 * (lambda * t)^(p15 - 1)
h30 <- lambda * p30 * (lambda * t)^(p30 - 1)

plot(t, h05, type = "l", ylim = c(0, 6), xlab = "Time", ylab = "h(t)", lty = 1, lwd = 1)
lines(t, h10, lty = 2, lwd = 1)
lines(t, h15, lty = 3, lwd = 1)
lines(t, h30, lty = 4, lwd = 1)
legend(x = "topright", legend = c("lambda = 1, p = 0.5", "lambda = 1, p = 1.0", "lambda = 1, p = 1.5", "lambda = 1, p = 3.0"), lty = c(1,2,3,4), lwd = c(1,1,1,1), bty = "n", cex = 0.75)

Comparison of Kaplan-Meier and Weibull estimates of survival:

setwd("D:\\TEMP")
library(survival)
dat <- read.table("addict.csv", header = TRUE, sep = ",")

Fit parametric (Weibull) and non-parametric (Kaplan-Meier) survival functions to the observed data:

addict.we <- survreg(Surv(stop, status) ~ 1, dist = "weib", data = dat)
addict.km <- survfit(Surv(stop, status) ~ 1, conf.type = "none", type = "kaplan-meier", data = dat)

Using the Weibull distribution m (the intercept) = -log(l) and s (scale) = 1 / p. Thus the scale parameter l = exp(-m) and p = 1 / s. See Venables and Ripley p 360 and Tableman and Kim p 78 for details.

p <- 1 / addict.we$scale
lambda <- exp(-addict.we$coeff[1])
t <- 1:1000
St <- exp(-(lambda * t)^p)
addict.we <- as.data.frame(cbind(t = t, St = St))

Compare the two estimates of survival:

plot(addict.km, xlab = "Days to relapse", ylab = "Cumulative proportion to experience event")
lines(addict.we$t, addict.we$St, lty = 2)
legend(x = "topright", legend = c("Kaplan-Meier", "Weibull"), lty = c(1,2), bty = "n")

The Weibull distribution provides an adequate fit to the observed data up to day 500, then appears to underestimate survivorship.

Cumulative hazard plots can provide an alternative method for assessing the appropriateness of a parametric approach to describe survivorship. Here we plot cumulative hazard as a function of time to check for consistency with the exponential distribution, log cumulative hazard as a function of log time to check for consistency with the Weibull distribution and the inverse normal transformation of the Kaplan-Meier estimates as a function of log time to check for consistency with the log-normal distribution.

par(pty = "s", mfrow = c(2,2))
plot(addict.km, conf.int = FALSE, fun = "cumhaz", xlab = "Days to relapse", main = "Exponential")
plot(addict.km, conf.int = FALSE, fun = "cumhaz", log = "xy", xlab = "Days to relapse (log)", main = "Weibull")
plot(addict.km, conf.int = FALSE, fun = qnorm, log = "x", xlab = "Days to relapse (log)", main = "Log-normal")