> the purpose of validating indirect measures such as ROC curves.

Biggest purpose: it is useful in a marketing/sales meeting context ;)

Also, decile-specific performance is easy to explain and to monitor, which allows for faster execution and re-modeling.
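For example, a rough base-R sketch of such a decile table plus the KS statistic on a holdout sample (untested; the data frame 'holdout', its binary response 'bad' and the model score 'score' are placeholder names):

## Rank customers by model score and split into 10 equal-sized deciles
## (decile 1 = highest predicted risk)
holdout$decile <- cut(rank(-holdout$score, ties.method = "first"),
                      breaks = 10, labels = 1:10)

## Predicted vs. actual bad rate, and counts, per decile
tab <- aggregate(cbind(mean_score = holdout$score, bad_rate = holdout$bad),
                 by = list(decile = holdout$decile), FUN = mean)
tab$n <- as.integer(table(holdout$decile))
print(tab)

## KS statistic: maximum gap between the cumulative capture rates of
## bads and goods when customers are sorted by score
o        <- order(holdout$score, decreasing = TRUE)
cum_bad  <- cumsum(holdout$bad[o]) / sum(holdout$bad)
cum_good <- cumsum(1 - holdout$bad[o]) / sum(1 - holdout$bad)
max(abs(cum_bad - cum_good))

A stable model should show a monotone bad rate across the deciles and give similar tables on the build and holdout samples.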
Regards,

Ajay

On Wed, Oct 8, 2008 at 4:01 AM, Frank E Harrell Jr <[EMAIL PROTECTED]> wrote:

> Ajay ohri wrote:
>
>> This is an approach:
>>
>> Run the model on the holdout sample.
>>
>> Check and compare ROC curves between the build and validation datasets.
>>
>> Check for changes in the parameter estimates (coefficients of the variables), their p-values and signs.
>>
>> Check for binning (response versus deciles of individual variables).
>>
>> Check concordance and the KS statistic.
>>
>> A decile-wise performance of the model in terms of predicted versus actual, and the rank ordering of the deciles, helps in explaining the model to a business audience, who generally have some business-specific input that may require the scoring model to be tweaked.
>>
>> This assumes that multicollinearity, outliers and missing-value treatment have already been dealt with, and that the holdout sample checks for overfitting. You can always rebuild the model using a different random holdout sample.
>>
>> A stable model would not change too much.
>>
>> In actual implementation, try to build real-time triggers for deviations (%) between predicted and actual.
>>
>> Regards,
>>
>> Ajay
>
> I wouldn't recommend that approach, but legitimate differences of opinion exist on the subject. In particular, I fail to see the purpose of validating indirect measures such as ROC curves.
>
> Frank
>
>> www.decisionstats.com
>>
>> On Wed, Oct 8, 2008 at 1:33 AM, Frank E Harrell Jr <[EMAIL PROTECTED]> wrote:
>>
>> [EMAIL PROTECTED] wrote:
>>
>> Hi Frank,
>>
>> Thanks for your feedback! But I think we are talking about two different things.
>>
>> 1) Validation: the generalization performance of the classifier. See, for example, "Studies on the Validation of Internal Rating Systems" by the BIS.
>>
>> I didn't think the desire was for a classifier but instead was for a risk predictor. If prediction is the goal, classification methods or accuracy indexes based on classifications do not work very well.
>>
>> 2) Calibration: correct calibration of a PD rating system means that the calibrated PD estimates are accurate and conform to the observed default rates. See, for instance, "An Overview and Framework for PD Backtesting and Benchmarking" by Castermans et al.
>>
>> I'm unclear on what you mean here. Correct calibration of a predictive system means that the UNcalibrated estimates are accurate (i.e., they don't need any calibration). (What is PD?)
>>
>> Frank, you are referring to #1 and I am referring to #2. Nonetheless, I would never create a rating system if my model doesn't discriminate better than a coin toss.
>>
>> For sure.
>>
>> Frank
>>
>> Regards,
>>
>> Pedro
>>
>> -----Original Message-----
>> From: Frank E Harrell Jr [mailto:[EMAIL PROTECTED]]
>> Sent: Tuesday, October 07, 2008 11:02 AM
>> To: Rodriguez, Pedro
>> Cc: [EMAIL PROTECTED]; r-help@r-project.org
>> Subject: Re: [R] How to validate model?
>>
>> [EMAIL PROTECTED] wrote:
>>
>> Usually one validates scorecards with the ROC curve, Pietra Index, KS test, etc. You may be interested in WP 14 from the BIS (www.bis.org).
>>
>> Regards,
>>
>> Pedro
>>
>> No, the validation should be done using an absolute reliability (calibration) curve. You need to verify that at all levels of predicted risk there is agreement with the true probability of failure. An ROC curve does not do that, and I doubt the others do. A resampling-corrected loess calibration curve is a good approach, as implemented in the Design package's calibrate function.
>>
>> Frank
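[To illustrate the calibrate() suggestion above, here is a rough, untested sketch using the Design package; the data frame 'train', the binary response 'default', the predictors 'x1' and 'x2', and the holdout data frame 'holdout' are placeholder names.]

library(Design)

## Fit the logistic model, keeping the design matrix and response so that
## resampling-based validation and calibration can be run afterwards
fit <- lrm(default ~ x1 + x2, data = train, x = TRUE, y = TRUE)

## Bootstrap (overfitting-corrected) calibration curve and accuracy indexes
cal <- calibrate(fit, method = "boot", B = 200)
plot(cal)                               # predicted vs. observed probability
validate(fit, method = "boot", B = 200) # optimism-corrected Dxy, Brier, etc.

## External check on the holdout sample: predicted probabilities vs. outcomes
p <- predict(fit, newdata = holdout, type = "fitted")
val.prob(p, holdout$default)

[The calibration curve should lie close to the 45-degree line; val.prob also reports the ROC area, Brier score and calibration intercept/slope for the holdout sample.]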
>>
>> -----Original Message-----
>> From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED]]
>> On Behalf Of Maithili Shiva
>> Sent: Tuesday, October 07, 2008 8:22 AM
>> To: r-help@r-project.org
>> Subject: [R] How to validate model?
>>
>> Hi!
>>
>> I am working on a scorecard model and I have arrived at the regression equation. I have used logistic regression in R.
>>
>> My question is: how do I validate this model? I do have a holdout sample of 5000 customers.
>>
>> Please guide me. The problem is that I have never used logistic regression before, nor am I familiar with credit scoring models.
>>
>> Thanks in advance,
>>
>> Maithili
>>
>> --
>> Frank E Harrell Jr   Professor and Chair   School of Medicine
>>                      Department of Biostatistics   Vanderbilt University
>>
>> --
>> Regards,
>> Ajay Ohri
>> http://tinyurl.com/liajayohri
>
> --
> Frank E Harrell Jr   Professor and Chair   School of Medicine
>                      Department of Biostatistics   Vanderbilt University

--
Regards,
Ajay Ohri
http://tinyurl.com/liajayohri

______________________________________________
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.