Re: [R] Cross Validation output

Donald Catanzaro, PhD Fri, 26 Sep 2008 10:19:11 -0700

Good Day All,

I have a negative binomial model that I created using the function glm.nb() from the MASS library, and I am performing a cross-validation using the function cv.glm() from the boot library. I am really interested in determining the performance of this model so I can have confidence (or not) when it might be applied elsewhere.

If I understand the cv.glm() procedure correctly, the default cost function is the average squared error, and by running cv.glm() in a loop many times I understand that I can calculate PRESS (PRedictive Error Sum of Squares = 1/n*Sum(all PEs)) from the default output.


When I run the loop 10 times, my PRESS is ~25.
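
For illustration, a minimal sketch of the procedure described above. The data are simulated, and the names sim, x, y, the choice K = 10 and the 10 repeats are assumptions made for the example, not the actual model or data. In the list returned by cv.glm(), delta[1] is the raw cross-validation estimate of the cost, which with the default cost function is a mean squared prediction error. It is therefore in squared units of the response rather than a percentage, and its square root is on the scale of the counts.

```r
library(MASS)   # glm.nb()
library(boot)   # cv.glm()

set.seed(1)
n <- 200
sim <- data.frame(x = rnorm(n))                            # hypothetical predictor
sim$y <- rnbinom(n, size = 2, mu = exp(1 + 0.5 * sim$x))   # simulated counts

fit <- glm.nb(y ~ x, data = sim)

# Repeat 10-fold CV 10 times; each run assigns observations to folds at random,
# so the estimate differs slightly from run to run.
cv_mse <- replicate(10, cv.glm(sim, fit, K = 10)$delta[1])

mean(cv_mse)         # average squared prediction error over the repeats
sqrt(mean(cv_mse))   # root mean squared error, in units of the response
```

Under this reading, a value of about 25 would correspond to a typical prediction error of roughly 5 counts, which is only meaningful relative to the size and spread of the observed counts.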

I have a few questions:

1) I must now confess my ignorance: how does one interpret my PRESS of 25 ? Are there some internet resources that someone could point me to to help in the interpretation ? I've spent most of yesterday studying up on things but feel like I am chasing my tail. Most of the resources are either so heavy in theory that I can't puzzle them out, or are a couple of paragraphs long and don't have examples with data in them. Is my PRESS in essence saying that my model performance is ~75% ? (I suspect not, but I don't know, thus I ask.)

2) All my observations are spatial in nature and thus I would like to plot out spatially where the model is performing well and where it is not. This would be somewhat akin to inspecting residuals in OLS. Is there a way to output from cv.glm() the PEs for individual data points ?

3) My previous idea was to look at AIC, BIC, McFadden R2 and pseudo-R2 as goodness-of-fit (GOF) measures of each subset model. It appears that I can modify the cost function of cv.glm() but I am not too confident in my ability to write the correct cost function. Are there other valid measures of GOF for my negative binomial model that I can substitute into the cost function of cv.glm() ? Would anyone care to recommend one (or many) ?
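
To make the last two questions concrete, here is a hedged sketch on simulated data (sim, x, y, K = 10 and mae_cost are hypothetical names and choices, not the real data or a recommended measure). cv.glm() returns only call, K, delta and seed, so part (a) uses a hand-written K-fold loop to keep one held-out prediction error per observation. Part (b) shows the form a custom cost function takes: a function of the observed responses and the predicted values that returns a single number, with mean absolute error standing in as an example.

```r
library(MASS)   # glm.nb()
library(boot)   # cv.glm()

set.seed(42)
n <- 200
sim <- data.frame(x = rnorm(n))                            # hypothetical predictor
sim$y <- rnbinom(n, size = 2, mu = exp(1 + 0.5 * sim$x))   # simulated counts

# (a) Manual K-fold CV that stores each observation's held-out prediction
K <- 10
fold <- sample(rep(seq_len(K), length.out = n))   # random fold label per row
pred <- numeric(n)
for (k in seq_len(K)) {
  held_out <- fold == k
  fit_k <- glm.nb(y ~ x, data = sim[!held_out, ])  # refit without fold k
  pred[held_out] <- predict(fit_k, newdata = sim[held_out, ],
                            type = "response")
}
sim$pe <- sim$y - pred             # one prediction error per data point
cv_mse_manual <- mean(sim$pe^2)    # same quantity as the default cost

# (b) A custom cost for cv.glm(): arguments are observed y and predicted yhat
mae_cost <- function(y, yhat) mean(abs(y - yhat))
fit <- glm.nb(y ~ x, data = sim)
cv_mae <- cv.glm(sim, fit, cost = mae_cost, K = 10)$delta[1]
```

The pe column could then be joined to the site coordinates and mapped like ordinary residuals. Because the cost function sees only the observed and predicted values, likelihood-based measures such as AIC or BIC cannot be passed through it directly.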


Thanks in advance for your patience !

-Don

PS - if you've seen my previous posts, I've abandoned my 80/20 split validation scheme.

--

-Don

Don Catanzaro, PhD                  Landscape Ecologist
[EMAIL PROTECTED]               16144 Sigmond Lane
479-751-3616                        Lowell, AR 72745

______________________________________________
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
