Re: [R] Fw: Logistic regresion - Interpreting (SENS) and (SPEC)

Frank E Harrell Jr Mon, 13 Oct 2008 20:15:07 -0700

Robert W. Baer, Ph.D. wrote:

----- Original Message ----- From: "Frank E Harrell Jr"<[EMAIL PROTECTED]>
To: "John Sorkin" <[EMAIL PROTECTED]>
Cc: <r-help@r-project.org>; <[EMAIL PROTECTED]>;<[EMAIL PROTECTED]>
Sent: Monday, October 13, 2008 2:09 PM
Subject: Re: [R] Fw: Logistic regresion - Interpreting (SENS) and (SPEC)
John Sorkin wrote:
Frank,
Perhaps I was not clear in my previous Email message. Sensitivity and specificity do tell us about the quality of a test in that, given two tests, the one with higher sensitivity will be better at identifying subjects who have a disease in a pool of people who have a disease, and the one with greater specificity will be better at identifying subjects who do not have a disease in a pool of people who do not have a disease. It is true that positive predictive and negative predictive values are of greater utility to a clinician, but as you know these two measures are functions of sensitivity, specificity and disease prevalence. All other things being equal, given two tests one would select the one with greater sensitivity and specificity, so in a sense they do measure the "quality" of a clinical test - but not, as I tried to explain, the quality of a statistical model.
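The dependence of the predictive values on sensitivity, specificity and prevalence mentioned above can be sketched with Bayes' rule. This is only an illustration: the function name and the figures (sensitivity 0.90, specificity 0.75, three prevalences) are hypothetical and are not taken from the thread.

```python
def predictive_values(sens, spec, prevalence):
    """Return (PPV, NPV) from sensitivity, specificity and prevalence via Bayes' rule."""
    true_pos = sens * prevalence
    false_pos = (1.0 - spec) * (1.0 - prevalence)
    true_neg = spec * (1.0 - prevalence)
    false_neg = (1.0 - sens) * prevalence
    ppv = true_pos / (true_pos + false_pos)
    npv = true_neg / (true_neg + false_neg)
    return ppv, npv


# Hypothetical test (sensitivity 0.90, specificity 0.75) at three prevalences.
for prevalence in (0.01, 0.10, 0.50):
    ppv, npv = predictive_values(0.90, 0.75, prevalence)
    print(f"prevalence={prevalence:.2f}  PPV={ppv:.3f}  NPV={npv:.3f}")
```

With sensitivity and specificity held fixed, the positive predictive value rises as prevalence rises, which is why the same test can look very different in different populations.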
That is not very relevant John. It is a function of all those things because those quantities are all deficient.
I would select the test that can move the pre-test probability a great deal in one or both directions.
Of course, this quantity is known as a likelihood ratio and is a function of sensitivity and specificity. For 2 x 2 data one often speaks of positive likelihood ratio and negative likelihood ratio, but for a multi-row contingency table one can define likelihood ratios for a series of cut-off points. This has become a popular approach in evidence-based medicine when diagnostic tests have continuous rather than binary outputs.
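As a rough sketch of how a likelihood ratio moves a pre-test probability, the standard 2 x 2 definitions LR+ = sens / (1 - spec) and LR- = (1 - sens) / spec can be applied on the odds scale. The function names and the numbers used (sensitivity 0.90, specificity 0.75, pre-test probability 0.20) are assumptions chosen for illustration only.

```python
def likelihood_ratios(sens, spec):
    """Return (LR+, LR-) for a binary test under the standard 2 x 2 definitions."""
    lr_pos = sens / (1.0 - spec)
    lr_neg = (1.0 - sens) / spec
    return lr_pos, lr_neg


def post_test_probability(pre_test_prob, lr):
    """Convert to odds, multiply by the likelihood ratio, convert back to a probability."""
    pre_odds = pre_test_prob / (1.0 - pre_test_prob)
    post_odds = pre_odds * lr
    return post_odds / (1.0 + post_odds)


# Hypothetical values, not from the thread.
lr_pos, lr_neg = likelihood_ratios(0.90, 0.75)
print("after a positive result:", post_test_probability(0.20, lr_pos))
print("after a negative result:", post_test_probability(0.20, lr_neg))
```

A likelihood ratio of 1 leaves the pre-test probability unchanged; the further the ratio is from 1, the more the test result moves it.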

This approach leaves much to be desired. I hope that its practitioners start gauging it by the mean squared error of predicted probabilities.
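The mean squared error of predicted probabilities is commonly called the Brier score. A minimal sketch, assuming binary outcomes coded 0/1 and a hypothetical helper name and data:

```python
def brier_score(predicted_probs, outcomes):
    """Mean squared difference between predicted probabilities and 0/1 outcomes."""
    if len(predicted_probs) != len(outcomes) or not outcomes:
        raise ValueError("need two non-empty sequences of equal length")
    squared_errors = [(p - y) ** 2 for p, y in zip(predicted_probs, outcomes)]
    return sum(squared_errors) / len(squared_errors)


# Hypothetical predictions and observed outcomes, for illustration only.
probs = [0.9, 0.2, 0.6]
observed = [1, 0, 1]
print(brier_score(probs, observed))
```

Lower is better; always predicting 0.5 gives a score of 0.25 whatever the outcomes are.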

Likelihood ratios are "half" of odds ratios (odds ratio = product of LR+ and LR-) but in a practical sense they are not equivalent because the vast majority of likelihood ratios provided in the literature are crude, marginal, unadjusted likelihood ratios. Odds ratios from easy-to-fit logistic models are conditional or partial odds ratios and so are patient specific and not population averaged.
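The link between a crude odds ratio and the two likelihood ratios can be checked on a 2 x 2 table. One caveat on conventions: with the usual definition LR- = (1 - sens) / spec, the odds ratio comes out as LR+ divided by LR-; it is a product only if LR- is taken as the reciprocal, spec / (1 - sens). The counts below are hypothetical, and this is a sketch of the crude (unadjusted) quantity only, not of the conditional odds ratios from a logistic model.

```python
def diagnostic_odds_ratio(tp, fn, fp, tn):
    """Crude odds ratio of a 2 x 2 table: (tp * tn) / (fp * fn)."""
    return (tp * tn) / (fp * fn)


# Hypothetical counts: 100 diseased and 100 non-diseased subjects.
tp, fn, fp, tn = 90, 10, 25, 75
sens = tp / (tp + fn)
spec = tn / (tn + fp)

lr_pos = sens / (1.0 - spec)
lr_neg = (1.0 - sens) / spec  # usual convention

print("odds ratio     :", diagnostic_odds_ratio(tp, fn, fp, tn))
print("LR+ / LR-      :", lr_pos / lr_neg)
print("LR+ * (1 / LR-):", lr_pos * (1.0 / lr_neg))
```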


Frank

You are of course correct that sensitivity and specificity are not truly "inherent" characteristics of a test as their values may change from population to population, but practically speaking, they don't change all that much, certainly not as much as positive and negative predictive values.
They change quite a bit, and mathematically must change if the disease is not all-or-nothing.
I guess we will disagree about the utility of sensitivity and specificity as simplifying concepts.
Thank you as always for your clear thoughts and stimulating comments.
And thanks for yours John.
Frank
John
John David Sorkin M.D., Ph.D.
Chief, Biostatistics and Informatics
University of Maryland School of Medicine Division of Gerontology
Baltimore VA Medical Center
10 North Greene Street
GRECC (BT/18/GR)
Baltimore, MD 21201-1524
(Phone) 410-605-7119
(Fax) 410-605-7913 (Please call phone number above prior to faxing)
Frank E Harrell Jr <[EMAIL PROTECTED]> 10/13/2008 2:35 PM >>>
John Sorkin wrote:
Jumping into a thread can be like jumping into a den of lions but here goes . . .
Sensitivity and specificity are not designed to determine the quality of a fit (i.e. whether your model is good), but rather are characteristics of a test. A test that has high sensitivity will properly identify a large proportion of people with a disease (or a characteristic) of interest. A test with high specificity will properly identify a large proportion of people without a disease (or characteristic) of interest. Sensitivity and specificity inform the end user about the "quality" of a test. Other metrics have been designed to determine the quality of the fit; none that I know of are completely satisfactory. The pseudo R squared is one such measure.
For a given diagnostic test (or classification scheme), different cut-off points for identifying subjects who have disease can be examined to see how they influence sensitivity and 1-specificity using ROC curves.
I await the flames that will surely come my way

John
John this has been much debated but I fail to see how backwards probabilities are that helpful in judging the usefulness of a test. Why not condition on what we know (the test result and other baseline variables) and quit conditioning on what we are trying to find out (disease status)? The data collected in most studies (other than case-control) allow one to use logistic modeling with the correct time order.
Furthermore, sensitivity and specificity are not constants but vary with subjects' characteristics. So they are not even useful as simplifying concepts.
Frank
John David Sorkin M.D., Ph.D.
Chief, Biostatistics and Informatics
University of Maryland School of Medicine Division of Gerontology
Baltimore VA Medical Center
10 North Greene Street
GRECC (BT/18/GR)
Baltimore, MD 21201-1524
(Phone) 410-605-7119
(Fax) 410-605-7913 (Please call phone number above prior to faxing)
Frank E Harrell Jr <[EMAIL PROTECTED]> 10/13/2008 12:27 PM>>>
Maithili Shiva wrote:
Dear Mr Peter Dalgaard and Mr Dieter Menne,
I sincerely thank you for helping me out with my problem. The thing is that I have already calculated SENS = Gg / (Gg + Bg) = 89.97%
and SPEC = Bb / (Bb + Gb) = 74.38%.
Now I have values of SENS and SPEC, which are absolute in nature. My question was how do I interpret these absolute values. How do these values help me to find out whether my model is good?
With regards

Ms Maithili Shiva
I can't understand why you are interested in probabilities that are in backwards time order.
Frank
________________________________________________________________________
Subject: [R] Logistic regresion - Interpreting (SENS) and (SPEC)
To: r-help@r-project.org Date: Friday, October 10, 2008, 5:54 AM
Hi,

I am working on a credit scoring model using logistic
regression. I have a main sample of 42500 clients and, based on
their status as defaulted / non-defaulted, I
have generated the probability of default.

I have a hold-out sample of 5000 clients. I have calculated
(1) number of correctly classified goods (Gg), (2) number of correctly
classified bads (Bb), and also (3) number of wrongly classified
bads (Gb) and (4) number of wrongly classified goods (Bg).

My problem is how to interpret these results. What I have
arrived at are the absolute figures.
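For reference, the SENS and SPEC formulas quoted earlier in the thread, SENS = Gg / (Gg + Bg) and SPEC = Bb / (Bb + Gb), can be sketched as follows, treating "good" as the positive class. The four counts are hypothetical (chosen only so that they sum to a 5000-client hold-out sample); the actual counts were not posted.

```python
def sens_spec(gg, bg, bb, gb):
    """
    gg: goods classified as good      bg: goods classified as bad
    bb: bads classified as bad        gb: bads classified as good
    Returns (SENS, SPEC) with "good" treated as the positive class.
    """
    sens = gg / (gg + bg)
    spec = bb / (bb + gb)
    return sens, spec


# Hypothetical hold-out counts summing to 5000 clients.
sens, spec = sens_spec(gg=3600, bg=400, bb=750, gb=250)
print(f"SENS={sens:.2%}  SPEC={spec:.2%}")
```

Both figures depend on the probability cut-off used to label a client good or bad, so they describe one particular classification rule rather than the fitted model as a whole.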
--
Frank E Harrell Jr   Professor and Chair           School of Medicine
                     Department of Biostatistics   Vanderbilt University


