Time dependent covariates

Piecewise Cox models

Load the survival library. Read the addict data file (in counting format) into R:

library(survival)
setwd("D:\\TEMP")
dat <- read.table("addict.csv", header = TRUE, sep = ",")

Recode the addict data set into counting process (start, stop) format. Here we define a cutpoint at day 365 creating a new data frame called dat.cp:

dat.cp <- survSplit(data = dat, cut = 365, end = "stop", event = "status", start = "start", id = "iid")

Recode the clinic variable to make Clinic 2 the reference category:

dat.cp$clinic <- as.vector(ifelse(dat.cp$clinic == 2, 0, 1))

First of all we'll consider the period before 365 days. Create a new variable called t1 such that t1 = 1 if the time to event is less than or equal to 365 days and zero otherwise:

t1 <- rep(0, nrow(dat.cp))
t1[dat.cp$stop <= 365] <- 1
dat.cp$t1 <- t1

Interpretation of the clinic × t1 interaction is as follows:

Clinic (code) Time (code) Interaction
Clinic 1 (1) Less than 365 (1) 1 × 1 = 1
Clinic 1 (1) Greater than 365 (0) 1 × 0 = 0
Clinic 2 (0) Less than 365 (1) 0 × 1 = 0
Clinic 2 (0) Greater than 365 (0) 0 × 0 = 0

Using this coding, the reported hazard for the clinic × t1 interaction will be for Clinic 1 when time is less than or equal to 365 days. Next consider the period after 365 days. Create a new variable called t2 such that t2 = 1 if the time to event is greater than 365 days and one otherwise:

t2 <- rep(0, nrow(dat.cp))
t2[dat.cp$stop > 365] <- 1
dat.cp$t2 <- t2

Interpretation of the clinic × t2 interaction is as follows:

Clinic (code) Time (code) Interaction
Clinic 1 (1) Less than 365 (0) 1 × 0 = 0
Clinic 1 (1) Greater than 365 (1) 1 × 1 = 1
Clinic 2 (0) Less than 365 (0) 0 × 0 = 0
Clinic 2 (0) Greater than 365(1) 0 × 1 = 0

Using this coding, the reported hazard for the clinic × t2 interaction will be for Clinic 1 when time is greater than 365 days. Now fit the model:

addict.cph05 <- coxph(Surv(start, stop, event = status, type = "counting") ~ prison + dose + I(clinic * t1) + I(clinic * t2), method = "breslow", data = dat.cp)
summary(addict.cph05)

The effect of clinic is borderline when days on treatment is less than 365 days (P = 0.06). When days on treatment is greater than 365 days the effect of clinic is highly significant (P < 0.01). Note that the confidence interval for days on treatment > 365 days is considerably larger than that for days on treatment < 365 days. These results suggest a large difference in clinic survival times after one year on treatment with Clinic 2 always doing better than Clinic 1 at any one time.

Variable Subjects Failed Regression coefficient (SE) P Hazard ratio (95% CI)
Clinic × t1 118 87 0.4802 (0.2548) 0.06 1.62 (0.98 - 2.66) a
Clinic × t2 120 63 1.8103 (0.3861) < 0.01 6.11 (2.87 - 13.03)
Prison 238 150 0.3650 (0.1684) 0.03 1.44 (1.04 - 2.00)
Dose 238 150 -0.0353 (0.0064) < 0.01 0.96 (0.95 - 0.98)

a Interpretation: compared with the reference category (patients from Clinic 2) when days on treatment is less than 365, after adjusting for the effect of methadone dose and prison record, patients from Clinic 1 had 1.62 (95% CI 0.98 -- 2.66) times the daily hazard withdrawing from the treatment program.

Table 1. Cox proportional hazards regression model showing the effect of prison, methadone dose and the variable effect of time on clinic on the daily hazard of relapse.

Now compare observed survival (as estimated by the Kaplan-Meier technique) and predictions from the Cox model:

addict.km <- survfit(Surv(stop, status) ~ clinic, type = "kaplan-meier", data = dat)
dat.clin01 <- data.frame(clinic = 1, prison = factor(c(0)), dose = 50, t1 = 1, t2 = 0)
dat.clin02 <- data.frame(clinic = 0, prison = factor(c(0)), dose = 50, t1 = 1, t2 = 0)

plot(addict.km01, xlab = "Days to relapse", ylab = "Cumulative proportion to experience event", lty = c(1,2), mark.time = FALSE, main = "")
lines(survfit(addict.cph05, newdata = dat.clin01), conf.int = FALSE, mark.time = FALSE, col = "red", lty = 1)
lines(survfit(addict.cph05, newdata = dat.clin02), conf.int = FALSE, mark.time = FALSE, col = "red", lty = 2)
legend(x = "topright", legend = c("Clinic 1: Kaplan Meier","Clinic 2: Kaplan Meier", "Clinic 1: Cox model", "Clinic 2: Cox model"), lty = c(1,2,1,2), col = c("black", "black", "red", "red"), bty = "n", cex = 0.80)

The Cox model with a time dependent covariate provides a reasonable fit to the survival experience of patients from Clinic 2. Survival is underestimated for Clinic 1 patients up to day 550, then survival is substantially overestimated. Introduction of a third time period at around day 550 might improve this problem.

Counting process formulation

The most common type of time-dependent covariates are repeated measurements on a subject or a change in the subject's treatment. Both of these situations are easily handled by the counting process formulation. As an example consider the Stanford heart transplant study where the objective is to determine if patients receiving transplants live longer than those that don't (Crowley and Hu 1977). This analysis is a little tricky, because patients who are enrolled into the study spend a variable amount of time on a waiting list (that is, waiting for a suitable donor).

Consider the first three patients (details shown below). Patient 1 entered the program on day 0 and died, waiting for a transplant on day 50. Patient 2 entered the program on day 0 and died on 6 (again, while waiting for a transplant). Patient 3 entered on day 0, spent 1 day on the waiting list and was then transplanted. This patient died 16 days later.

id start stop event age year surgery transplant
1 0 50 1 -17.155 0.123 0 0
2 0 6 1 3.836 0.255 0 0
3 0 1 0 6.297 0.266 0 0
3 1 16 1 6.297 0.266 0 1

id: patient identifier.
start: days from entry into program.
stop: days from entry into program to event.
event: 0 = censored, 1 = died.
age: age in years minus 48.
year: date of acceptance into the program (years from 1 October 1967).
surgery: 0 = no previous surgery, 1 = previous surgery.
transplant: 0 = no transplant, 1 = transplant.

library(survival)
setwd("D:\\TEMP")
dat <- read.table("heart.csv", header = TRUE, sep = ",")

The first thing we could do is to stratify the analysis by transplant status, that is to consider transplant as a time-dependent strata:

heart.cph01 <- coxph(Surv(start, stop, event, type = "counting") ~ age + surgery + strata(transplant), data = dat, method = "breslow")
summary(heart.cph01)

Variable Subjects Failed Regression coefficient (SE) P Hazard ratio (95% CI)
Age 172 75 0.0321 (0.0141) 0.023 1.03 (1.00 - 1.06) a
Surgery 172 75 -0.7678 (0.3619) 0.03 0.46 (0.23 - 0.94)

R2 = 0.062.
a Interpretation: for yearly increases in the age at enrolment the daily hazard of death was 1.03 (95% CI 1.00 -- 1.06).

Table 2. Stratified Cox proportional hazards regression model showing the effect of age and prior surgery on the daily hazard of death.

An alternative is to consider transplant as a time-dependent covariate (rather than as a time-dependent strata). This provides a direct way of making comparisons between failure rates of the transplanted and untransplanted groups. In this model the effect of transplant status on each covariate (that is age and surgery) is assessed:

heart.cph02 <- coxph(Surv(start, stop, event, type = "counting") ~ (age + surgery) * transplant, data = dat, method = "breslow")
summary(heart.cph02)

Variable Subjects Failed Regression coefficient (SE) P Hazard ratio (95% CI)
Age 172 75 0.0138 (0.0181) 0.45 1.01 (0.98 - 1.05)
Surgery 172 75 -0.5457 (0.6109) 0.37 0.58 (0.17 - 1.92)
Transplant 172 75 0.1181 (0.3277) 0.72 1.12 (0.59 - 2.14)
Age × transplant 172 75 0.0348 (0.0273) 0.20 1.03 (0.98 - 1.09) a
Surgery × transplant 172 75 -0.2916 (0.7582) 0.70 0.75 (0.17 - 3.30) b

R2 = 0.070.
a Interpretation: for yearly increases in the age at enrolment the daily hazard of death after transplantation was 1.03 (95% CI 0.98 -- 1.09).
b Interpretation: where a patient had prior surgery, the daily hazard of death after transplantation was 0.75 (95% CI 0.17 -- 3.30).

Table 3. Cox proportional hazards regression model showing the effect of age, prior surgery, transplantation status, and the age-transplantation, prior surgery-transplantation interaction on the daily hazard of death.

Now assess the effect of year of entry into the program:

heart.cph03 <- coxph(Surv(start, stop, event, type = "counting") ~ (age + year) * transplant, data = dat, method = "breslow")
summary(heart.cph03)

Variable Subjects Failed Regression coefficient (SE) P Hazard ratio (95% CI)
Age 172 75 0.0155 (0.0173) 0.37 1.02 (0.98 - 1.05)
Year 172 75 -0.2735 (0.1058) < 0.01 0.76 (0.62 - 0.94)
Transplant 172 75 -0.5884 (0.5427) 0.28 0.55 (0.19 - 1.61)
Age × transplant 172 75 0.0339 (0.0279) 0.23 1.03 (0.98 - 1.09) a
Year × transplant 172 75 0.2013 (0.1425) 0.16 1.22 (0.92 - 1.62) b

R2 = 0.083.
a Interpretation: for yearly increases in the age at enrolment the daily hazard of death after transplantation was 1.03 (95% CI 0.98 -- 1.09).
b Interpretation: for yearly increases in the date of acceptance into the program, the daily hazard of death after transplantation was 1.22 (95% CI 0.92 -- 1.62).

Table 4. Cox proportional hazards regression model showing the effect of age, prior surgery, transplantation status, and the age-transplantation, prior surgery-transplantation interaction on the daily hazard of death.

A feature of these data is the dependence of survival time on the date of acceptance into the study. Unfortunately, the date of acceptance interaction with transplantation is also approaching significance but in the opposite direction. Together, these suggest that the overall quality of patient being admitted to the study was improving over calendar time, but the survival time of transplanted patients was not improving at the same rate. In fact, the sum of the two coefficients for year of acceptance would suggest a nearly constant survival pattern for the transplanted patients.

Compare the survival of two patients. Both have prior surgery and are enrolled into the program on 1 October 1967. The first patient is 30 years of age at enrolment, the second patient is 50 years of age. Both have not received a transplant.

dat.age30 <- data.frame(start = 0, stop = 183, age = 30 - 48, year = 0, surgery = 1, transplant = 0)
dat.age50 <- data.frame(start = 0, stop = 183, age = 50 - 48, year = 0, surgery = 1, transplant = 0)

plot(survfit(heart.cph03, newdata = dat.age30), xlim = c(0,200), xlab = "Survival (days)", ylab = "Cumulative proportion to experience event", lty = 1, conf.int = FALSE, mark.time = FALSE, main = "")
lines(survfit(heart.cph03, newdata = dat.age50), conf.int = FALSE, mark.time = FALSE, col = "red", lty = 1)
legend(x = "topright", legend = c("30 years of age, prior surgery", "50 years of age, prior surgery"), col = c("black", "red"), lwd = 1, lty = 1, bty = "n")