Togaware DATA MINING
Desktop Survival Guide
by Graham Williams

Lung



> l.survreg <- survreg(l.Surv ~ age, data=lung)
> summary(l.survreg)



Call:
survreg(formula = l.Surv ~ age, data = lung)
              Value Std. Error     z        p
(Intercept)  6.8871     0.4466 15.42 1.20e-53
age         -0.0136     0.0070 -1.94 5.19e-02
Log(scale)  -0.2761     0.0624 -4.43 9.61e-06

Scale= 0.759 

Weibull distribution
Loglik(model)= -1151.9   Loglik(intercept only)= -1153.9
	Chisq= 3.91 on 1 degrees of freedom, p= 0.048 
Number of Newton-Raphson Iterations: 5 
n= 228
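
The coefficients above are on the log-time scale, so exponentiating them gives multiplicative effects on survival time. The following is a minimal sketch of reading the fit. It assumes l.Surv is the survival object built from the time and status columns of lung; the names time_ratio, weibull_shape, chisq and p_value are illustrative only.

```r
library(survival)

# Assumption: l.Surv is the survival object built from the lung data.
l.Surv <- with(lung, Surv(time, status))
l.survreg <- survreg(l.Surv ~ age, data = lung)

# Each extra year of age multiplies the predicted survival time by this factor.
time_ratio <- exp(coef(l.survreg)[["age"]])

# The reported Scale is exp(Log(scale)); the Weibull shape is its reciprocal.
weibull_shape <- 1 / l.survreg$scale

# The Chisq line is twice the gain in log-likelihood over the
# intercept-only model, tested here on 1 degree of freedom.
chisq <- 2 * diff(l.survreg$loglik)
p_value <- pchisq(chisq, df = 1, lower.tail = FALSE)

print(c(time_ratio = time_ratio, shape = unname(weibull_shape),
        chisq = chisq, p = p_value))
```

With the estimates shown above, exp(-0.0136) is about 0.986, so each additional year of age shortens the predicted survival time by a little over 1%.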



> l.pred <- predict(l.survreg, lung)
> l.pred.q <- predict(l.survreg, lung, type="quantile")
> result <- cbind(data.frame(lung$time, l.pred), l.pred.q)
> names(result) <- c("Actual", "Predicted", "Lower", "Upper")
> head(result)



  Actual Predicted    Lower    Upper
1    306  357.8476 64.88637 673.7982
2    455  388.2918 70.40663 731.1220
3   1010  457.1705 82.89600 860.8152
4    210  450.9914 81.77557 849.1803
5    883  432.9505 78.50432 815.2108
6   1022  357.8476 64.88637 673.7982
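
The Predicted, Lower and Upper columns can be reproduced by hand from the coefficients, which helps show what predict() returns. The sketch below assumes l.Surv is Surv(time, status) on lung; lp, manual_pred and manual_q are illustrative names.

```r
library(survival)

l.Surv <- with(lung, Surv(time, status))   # assumed definition of l.Surv
l.survreg <- survreg(l.Surv ~ age, data = lung)

# Linear predictor on the log-time scale: intercept + coefficient * age.
lp <- coef(l.survreg)[["(Intercept)"]] + coef(l.survreg)[["age"]] * lung$age

# type = "response" (the default) returns exp(lp).
manual_pred <- exp(lp)

# For a Weibull fit the p-th quantile is exp(lp) * (-log(1 - p))^scale;
# predict() uses p = c(0.1, 0.9) unless told otherwise.
p <- c(0.1, 0.9)
manual_q <- outer(manual_pred, (-log(1 - p))^l.survreg$scale)

head(cbind(Predicted = manual_pred, Lower = manual_q[, 1], Upper = manual_q[, 2]))
```

Note that exp(lp) is the Weibull scale parameter, the time by which about 63% (1 - exp(-1)) of subjects are predicted to have died. It is neither the mean nor the median; the median can be requested with type="quantile" and p=0.5.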

Plot age against survival time, using the plotting symbol and colour to show status, then draw a line of predicted survival time by age:



> ord <- order(lung$age)
> age_ord <- lung$age[ord]
> pred_ord <- l.pred[ord]
> with(lung, plot(age, time, pch=status, col=4-status))
> lines(age_ord, pred_ord, col=4)
> legend("topleft", title="Status", c("Censored", "Died", "Predicted"),
         pch=c(1, 2, NA), lty=c(0,0,1), col=c(3,2,4))

Figure (survreg_plot): age against survival time by status, with the predicted survival time drawn as a line.
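
An alternative to sorting the fitted values is to predict over a regular grid of ages, which also makes it easy to overlay the quantile band. This is a sketch only: age_grid, grid_pred and grid_q are hypothetical names, and l.Surv is assumed to be Surv(time, status) on lung.

```r
library(survival)

l.Surv <- with(lung, Surv(time, status))   # assumed definition of l.Surv
l.survreg <- survreg(l.Surv ~ age, data = lung)

# One row per year of age across the observed range (hypothetical helper frame).
age_grid <- data.frame(age = seq(min(lung$age), max(lung$age), by = 1))
grid_pred <- predict(l.survreg, newdata = age_grid)
grid_q <- predict(l.survreg, newdata = age_grid, type = "quantile",
                  p = c(0.1, 0.9))

with(lung, plot(age, time, pch = status, col = 4 - status))
lines(age_grid$age, grid_pred, col = 4)             # predicted survival time
lines(age_grid$age, grid_q[, 1], col = 4, lty = 2)  # 10% quantile
lines(age_grid$age, grid_q[, 2], col = 4, lty = 2)  # 90% quantile
```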

A survreg object cannot be plotted directly. Instead, the fitted survival curve can be drawn from the model's coefficients, which also gives a hint on how to interpret them:



> l.survreg.weibull <- survreg(l.Surv ~ 1, data=lung, dist='weibull')
> plot(survfit(l.Surv ~ 1, data=lung))
> curve(exp(-(exp(-coef(l.survreg.weibull)[1]) * x)^(1/l.survreg.weibull$scale)),
        col="red", add=TRUE)

Figure: Kaplan-Meier estimate for the lung data, with the fitted Weibull survival curve in red.
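
The curve expression is the Weibull survival function written in survreg's parameterisation. As a sketch of how this maps onto R's own Weibull functions, the same curve can be obtained from pweibull() with shape = 1/scale and scale = exp(intercept). Here mu, sigma, t_grid, s_formula and s_pweibull are illustrative names, and l.Surv is assumed to be Surv(time, status) on lung.

```r
library(survival)

l.Surv <- with(lung, Surv(time, status))   # assumed definition of l.Surv
l.survreg.weibull <- survreg(l.Surv ~ 1, data = lung, dist = "weibull")

mu <- coef(l.survreg.weibull)[["(Intercept)"]]
sigma <- l.survreg.weibull$scale

# Survival probability on a grid of times, computed two equivalent ways.
t_grid <- seq(1, 1000, by = 50)
s_formula <- exp(-(exp(-mu) * t_grid)^(1 / sigma))
s_pweibull <- pweibull(t_grid, shape = 1 / sigma, scale = exp(mu),
                       lower.tail = FALSE)

# Overlay the parametric curve on the Kaplan-Meier estimate.
plot(survfit(l.Surv ~ 1, data = lung))
curve(pweibull(x, shape = 1 / sigma, scale = exp(mu), lower.tail = FALSE),
      col = "red", add = TRUE)
```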

As another example, we fit a parametric survival model with a Weibull distribution for time to death, fitting a different shape parameter for each sex by using a strata term.



> l.survreg <- survreg(l.Surv ~ ph.ecog + age + strata(sex), lung)
> print(l.survreg)



Call:
survreg(formula = l.Surv ~ ph.ecog + age + strata(sex), data = lung)

Coefficients:
(Intercept)     ph.ecog         age 
 6.73234505 -0.32443043 -0.00580889 

Scale:
    sex=1     sex=2 
0.7834211 0.6547830 

Loglik(model)= -1137.3   Loglik(intercept only)= -1146.2
	Chisq= 17.8 on 2 degrees of freedom, p= 0.00014 
n=227 (1 observation deleted due to missingness)



> summary(l.survreg)



Call:
survreg(formula = l.Surv ~ ph.ecog + age + strata(sex), data = lung)
               Value Std. Error      z        p
(Intercept)  6.73235    0.42396 15.880 8.75e-57
ph.ecog     -0.32443    0.08649 -3.751 1.76e-04
age         -0.00581    0.00693 -0.838 4.02e-01
sex=1       -0.24408    0.07920 -3.082 2.06e-03
sex=2       -0.42345    0.10669 -3.969 7.22e-05

Scale:
sex=1 sex=2 
0.783 0.655 

Weibull distribution
Loglik(model)= -1137.3   Loglik(intercept only)= -1146.2
	Chisq= 17.8 on 2 degrees of freedom, p= 0.00014 
Number of Newton-Raphson Iterations: 5 
n=227 (1 observation deleted due to missingness)
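
The sex=1 and sex=2 rows in the summary are the logs of the per-stratum scales, and each scale is the reciprocal of that stratum's Weibull shape. A short sketch of recovering these quantities follows; it assumes l.Surv is Surv(time, status) on lung, and log_scale, shape_by_sex, chisq and p_value are illustrative names.

```r
library(survival)

l.Surv <- with(lung, Surv(time, status))   # assumed definition of l.Surv
l.survreg <- survreg(l.Surv ~ ph.ecog + age + strata(sex), data = lung)

# One scale per stratum; the sex=1 and sex=2 summary rows are their logs.
log_scale <- log(l.survreg$scale)

# The Weibull shape for each stratum is the reciprocal of its scale.
shape_by_sex <- 1 / l.survreg$scale

# Likelihood-ratio test against the intercept-only model
# (two covariates, ph.ecog and age, so 2 degrees of freedom).
chisq <- 2 * diff(l.survreg$loglik)
p_value <- pchisq(chisq, df = 2, lower.tail = FALSE)

print(rbind(scale = l.survreg$scale, log_scale = log_scale,
            shape = shape_by_sex))
```

With the scales reported above (0.783 and 0.655), the shapes are roughly 1.28 and 1.53. A Weibull shape above 1 corresponds to a hazard that increases over time.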



Copyright © Togaware Pty Ltd
Brought to you by Togaware. This page generated: Sunday, 22 August 2010