I would like to clarify the statistics that ridge coxph returns. Here is my understanding; please correct me where I am wrong.
1) In his paper Gray [JASA 1992] suggests a Wald-type statistic together with a formula for its effective degrees of freedom. The summary function for ridge coxph returns degrees of freedom and a Wald test which are equivalent to what Gray wrote.

2) Summary for ridge coxph prints a likelihood ratio test which is not, if I may say, a proper likelihood ratio test. It is based on unpenalized log-likelihoods, and the degrees of freedom defined above are used. I accept that there is nothing which suggests that one should use penalized log-likelihoods; however, there is also nothing published which suggests that unpenalized log-likelihoods should be used with the above-defined degrees of freedom. I have found that Therneau & Grambsch, in their excellent book, discuss this in a paragraph and mention that the p-value thus returned is somewhat conservative (p too large). Therefore, the likelihood ratio test that ridge coxph returns is not a true one, and the statistic returned, i.e. 2*(loglik(beta) - loglik(0)), has a distribution which is somewhat more compact than the chi-square. I like conservative p-values and like to be on the safe side. However, in my work the Wald test p-values for ridge regression are much higher than the "likelihood ratio test" p-values, and I do not get the impression that the latter are conservative.

3) There is no efficient score test for ridge regression, as there is no penalized efficient score test. That is OK.

4) "Rsquare" and "max possible" are returned, and I have so far failed to find exact references for them.

I would like to add a note here that the coxph algorithm works really fast with high-dimensional covariates, and all compliments to the people who developed it. Evidently a lot of effort was put into making it all work fast and correctly, and, in my opinion, it is a shame that the last bit, the summary for ridge coxph, is a bit, if I may say, shaky.
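To make point 2) concrete, here is a minimal R sketch of what I understand the printed quantities to be. This is a hypothetical example on the built-in ovarian data; it assumes that survival's ridge() applies a penalty of (theta/2) * sum(beta^2) and that the penalized fit stores Gray's effective degrees of freedom in fit$df, so please correct me if either assumption is wrong:

```r
library(survival)

## Hypothetical example: a small ridge Cox fit on the ovarian data.
theta <- 1
fit <- coxph(Surv(futime, fustat) ~ ridge(age, rx, theta = theta),
             data = ovarian)

## fit$loglik holds the UNPENALIZED log partial likelihoods at beta = 0
## and at the fitted (shrunken) beta, so the printed "likelihood ratio
## test" statistic is:
lrt.stat <- 2 * (fit$loglik[2] - fit$loglik[1])

## ...and its p-value uses Gray's effective degrees of freedom
## (assumed here to be stored in fit$df):
p.lrt <- pchisq(lrt.stat, df = fit$df, lower.tail = FALSE)

## The penalized partial likelihood PPL(beta) can be recovered by
## subtracting the penalty (assuming penalty = (theta/2) * sum(beta^2)):
ppl <- fit$loglik[2] - (theta / 2) * sum(coef(fit)^2)
```

The point of the sketch is only that lrt.stat is built from unpenalized log-likelihoods while the fitted beta maximizes the penalized one, which is exactly why I hesitate to call it a proper likelihood ratio test.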
In my opinion, only the Wald test and the degrees of freedom as defined by Gray deserve to be part of the summary for ridge coxph, and I look forward to being corrected. For my part, I am prepared, in my spare time, to write code so that the summary for ridge coxph does not return NULL and so that the statistics it prints are based on published papers.

Damjan Krstajic

> Subject: Re: [R] survival: ridge log-likelihood workaround
> From: thern...@mayo.edu
> To: r-help@r-project.org; dkrsta...@hotmail.com
> Date: Fri, 10 Dec 2010 09:07:42 -0600
>
> ------ begin inclusion ---------
> Dear all,
>
> I need to calculate the likelihood ratio test for ridge regression. In
> February I reported a bug where coxph returns the unpenalized
> log-likelihood for the final beta estimates of a ridge coxph regression.
> In high-dimensional settings ridge regression models usually fail for
> lower values of lambda. As a result, in such settings the ridge
> regressions have higher values of lambda (e.g. over 100), which means
> that the difference between the unpenalized and penalized
> log-likelihoods is not insignificant. I would be grateful if someone
> could confirm that the code below is a correct workaround.
>
> --- end included message ----
>
> First, the "bug" you report is not a bug. The log partial likelihood
> from a Cox model, LPL(beta), is well defined for any vector of
> coefficients beta, whether they are the result of a maximization or
> taken from your daily horoscope. The loglik component of coxph is the
> LPL for the reported coefficients.
>
> For a ridge regression the coxph function maximizes LPL(beta) -
> penalty(beta) = penalized partial likelihood = PPL(beta). You have
> correctly recreated the PPL.
>
> Second: how do you do formal tests on such a model? This is hard. The
> difference LPL1 - LPL2 is a chi-square when each is the result of
> maximizing the Cox LPL over a set of coefficients; when using a PPL we
> are maximizing over something else.
> The distribution of the difference of constrained LPL values can be
> argued to be a weighted sum of squared normals, where the weights are
> in (0,1), which is something more complex than a chi-square
> distribution. In a world with infinite free time I'd have pursued
> this, worked it all out, and added appropriate code to coxph.
>
> What about the difference in PPL values, which is the test you propose?
> I'm not aware of any theory showing that these have any relation to a
> chi-square distribution. (Said theory may well exist, and I'd be happy
> for pointers.)
>
> Terry Therneau