Note: This document is uploaded on Ryota Mugiyama’s website.
Hazard rate is defined as follows.
\[ r(t) = \lim_{\Delta t \rightarrow 0} \frac{\Pr(t \leq T < t + \Delta t | T \geq t)}{\Delta t} \] In discrete-time framework, this is reformalized as follows.
\[ r(t) = \Pr(T = t | T \geq t) \]
We can estimate this hazard rate by using binary logit model. If the baseline-hazard follows the bell-shaped curve (that is, quadatic function), the hazard rate model is represented as the following logit model. \[ r(t) = \frac{\exp(\beta_0 + \beta_1 t + \beta_2 t^2)}{1 + \exp(\beta_0 + \beta_1 t + \beta_2 t^2)} \] Or,
\[ \log \frac{r(t)}{1 - r(t)} = \beta_0 + \beta_1 t + \beta_2 t^2 \]
For example, consider we obtain the following estimates. \[ \log \frac{r(t)}{1 - r(t)} = -7 -0.3t -0.006t^2 \]
t <- 1:40
c1 <- 0.3*t - 0.006*t^2 - 7 # estimated slopes and intercept.
r1 <- exp(c1) / (1 + exp(c1)) # calculate predicted hazard rate.
plot(t, r1, type = "l", ylab = "Hazard rate", xlab = "Time")
We can calculate the survival rate by using the previous logit model. The survival rate is defined as follows.
\[ S(t) = \Pr(T > t) = \prod^t_{t = 1} (1 - r(t)) \] After we estimate predicted hazard rate, we can calculate the predicted survival rate. Following table is the image of calculating survival rate by the predicted hazard rate.
Time | Survival rate | Hazard rate |
---|---|---|
1 | \(S(1) = 1\) | \(\hat{h}(1)\) |
2 | \(\hat{S}(2) = \hat{S}(1) \times (1 - \hat{h}(1))\) | \(\hat{h}(2)\) |
3 | \(\hat{S}(3) = \hat{S}(2) \times (1 - \hat{h}(2))\) | \(\hat{h}(3)\) |
4 | \(\hat{S}(4) = \hat{S}(3) \times (1 - \hat{h}(3))\) | \(\hat{h}(4)\) |
s1 <- (rep(1, 40))
for (i in 2:40) {
s1[i] <- s1[i-1] * (1 - r1[i-1]) # calculate survival rate.
}
plot(t, s1, type = "l", ylab = "Survival rate", xlab = "Time")
Consider there are two heterogenious population; One is the group with high hazard, the other is with low hazard. \[ \mathrm{Group} \ A: \log \frac{r(t)}{1 - r(t)} = -7 -0.3t -0.006t^2 \] \[ \mathrm{Group} \ B: \log \frac{r(t)}{1 - r(t)} = -7.5 -0.3t -0.006t^2 \]
c2 <- c1 - 0.5 # lower hazard group (the difference is intercept only).
r2 <- exp(c2) / (1 + exp(c2))
s2 <- (rep(1, 40))
for (i in 2:40) {
s2[i] <- s2[i-1] * (1 - r2[i-1])
}
plot(t, r1, type = "l", ylab = "Hazard rate", xlab = "Time")
lines(t, r2, type = "l", lty = 2)
The survival rates are as follows.
plot(t, s1, type = "l", ylab = "Survival rate", xlab = "Time")
lines(t, s2, type = "l", lty = 2)