I would like to clarify the statistics that ridge coxph returns. Here is my 
understanding; please correct me where I am wrong.

1) In his paper, Gray [JASA 1992] suggests a Wald-type statistic together with 
a formula for the effective degrees of freedom. The summary function for ridge 
coxph returns degrees of freedom and a Wald test which are equivalent to what 
Gray describes.
2) The summary for ridge coxph prints a likelihood ratio test which is not, if 
I may say, a proper likelihood ratio test: it is based on unpenalized 
log-likelihoods, combined with the degrees of freedom defined above. I accept 
that there is nothing which suggests that one should use penalized 
log-likelihoods. However, there is also nothing published which suggests that 
unpenalized log-likelihoods should be used with the above-defined degrees of 
freedom. I have found that Therneau & Grambsch, in their excellent book, 
discuss this in a paragraph and mention that the p-value thus returned is 
somewhat conservative (p too large). Therefore, the likelihood ratio test that 
ridge coxph returns is not a true one, and the statistic returned, i.e. 
2*(loglik(beta) - loglik(0)), has a distribution which is somewhat more 
compact than the chi-square. I like conservative p-values and prefer to be on 
the safe side. However, in my work the Wald test p-values for ridge regression 
are much higher than the "likelihood ratio test" p-values, and I do not get 
the impression that the latter are conservative.
3) There is no efficient score test for ridge regression, as there is no 
penalized efficient score test. That is OK.
4) "Rsquare" and "max possible" are returned and I have so far failed to find 
exact references for them.

I would like to add a note here that the coxph algorithm works really fast 
with high-dimensional covariates, and all compliments to the people who 
developed it. A lot of effort has evidently gone into making it all work fast 
and correctly, and, in my opinion, it is a shame that the last bit, the 
summary for ridge coxph, is a bit, if I may say, shaky. In my opinion, only 
the Wald test and the degrees of freedom as defined by Gray deserve to be part 
of the summary for ridge coxph, and I look forward to being corrected. For my 
part, I am prepared to write, in my spare time, code so that the summary for 
ridge coxph does not return NULL and so that the statistics it prints are 
based on published papers.

Damjan Krstajic


> Subject: Re: [R] survival: ridge log-likelihood workaround
> From: thern...@mayo.edu
> To: r-help@r-project.org; dkrsta...@hotmail.com
> Date: Fri, 10 Dec 2010 09:07:42 -0600
> 
> ------ begin inclusion ---------
> Dear all,
> 
> I need to calculate likelihood ratio test for ridge regression. In
> February I have reported a bug where coxph returns unpenalized
> log-likelihood for final beta estimates for ridge coxph regression. In
> high-dimensional settings ridge regression models usually fail for lower
> values of lambda. As the result of it, in such settings the ridge
> regressions have higher values of lambda (e.g. over 100) which means
> that the difference between unpenalized log-likelihood and penalized
> log-likelihood is not insignificant. I would be grateful if someone could
> confirm that the code below is a correct workaround.
> 
> --- end included message ----
> 
> First, the "bug" you report is not a bug.  The log partial likelihood
> from a Cox model LPL(beta) is well defined for any vector of
> coefficients beta, whether they are result of a maximization or taken
> from your daily horoscope.  The loglik component of coxph is the LPL for
> the reported coefficients.
> 
> For a ridge regression the coxph function maximizes LPL(beta) -
> penalty(beta) = penalized partial likelihood = PPL(beta).  You have
> correctly recreated the PPL.
> 
> Second: how do you do formal tests on such a model?  This is hard.  The
> difference LPL1- LPL2 is a chi-square when each is the result of
> maximizing the Cox LPL over a set of coefficients; when using a PPL we
> are maximizing over something else.  The distribution of the difference
> of constrained LPL values can be argued to be a weighted sum of squared
> normals where the weights are in (0,1), which is something more complex
> than a chisq distribution.  In a world with infinite free time I'd have
> pursued this, worked it all out, and added appropriate code to coxph.
> 
> What about the difference in PPL values, which is the test you propose?
> I'm not aware of any theory showing that these have any relation to a
> chi-square distribution.  (Said theory may well exist, and I'd be happy
> for pointers.)
> 
> Terry Therneau
> 
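Terry's weighted-sum-of-squared-normals point can be sketched numerically (my 
own toy illustration, not from the thread): for a ridge penalty of the form 
(theta/2) * sum(beta^2), the weights in (0, 1) can be taken as the eigenvalues 
of (H + theta*I)^{-1} H, where H is the unpenalized information matrix, and 
their sum reproduces Gray's effective degrees of freedom.

```r
## Hedged sketch: weights for the weighted chi-square reference distribution
## under an assumed ridge penalty (theta/2) * sum(beta^2).
H <- matrix(c(4, 1, 1, 3), 2, 2)  # toy unpenalized information matrix
theta <- 2                        # toy ridge penalty parameter

M <- solve(H + theta * diag(2)) %*% H
w <- eigen(M)$values              # weights, each strictly between 0 and 1

edf <- sum(w)                     # Gray-type effective degrees of freedom
```

The point is that the test statistic behaves like w[1]*Z1^2 + w[2]*Z2^2 rather 
than a chi-square on sum(w) degrees of freedom, which is why the printed 
p-value is only an approximation.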

______________________________________________
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.