Hi Jeff,

Yes, my question is perhaps more general: it is not really about R programming, data exploration, or statistical theory. It is just that modelling texts present external validation as a "panacea" that is "unreachable", so they explain other methods instead, such as cross-validation and bootstrapping. Here I have new data for a previously constructed model (already internally validated by bootstrapping), but I have not found how to carry out the external validation correctly and sufficiently, or by which means. Does it all end in just a plot? A percentage of correct classification?
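To make the question concrete, here is a minimal sketch in base R of the kind of comparison I have in mind. It is only an illustration under assumptions, not an established recipe: the data are simulated stand-ins, `obs` is a hypothetical name for the observed counts at the new sites, the 0.5 cut-off is arbitrary, and the overall expected count assumes a Poisson count component (I believe `predict(final, newdata = datosnuevos, type = "response")` would give that value directly for the fitted model).

```r
# Simulated stand-ins for the objects named in this thread (assumption).
# With real data, `nonzero` and `countmean` come from predict() on the
# fitted hurdle model and `obs` holds the counts observed at the new sites.
set.seed(1)
n         <- 38
nonzero   <- runif(n, 0.05, 0.95)               # predicted P(y > 0)
countmean <- rgamma(n, shape = 2, rate = 0.5)   # predicted count-component mean

# Fake observations: zero with probability 1 - nonzero, otherwise a
# zero-truncated Poisson draw obtained by inverting the Poisson CDF.
positive <- qpois(runif(n, dpois(0, countmean), 1), countmean)
obs      <- rbinom(n, 1, nonzero) * positive

present <- as.integer(obs > 0)                  # observed presence/absence

# AUC from the rank (Mann-Whitney) formula; ties get average ranks.
auc_rank <- function(p, y) {
  r  <- rank(p)
  n1 <- sum(y == 1)
  n0 <- sum(y == 0)
  (sum(r[y == 1]) - n1 * (n1 + 1) / 2) / (n1 * n0)
}

# Zero-hurdle part: discrimination, accuracy of probabilities, % correct.
auc   <- auc_rank(nonzero, present)
brier <- mean((nonzero - present)^2)
pcc   <- mean((nonzero > 0.5) == (present == 1))  # arbitrary 0.5 cut-off

# Calibration: regress observed presence on the logit of the predicted
# probability; intercept near 0 and slope near 1 suggest good calibration.
cal <- coef(glm(present ~ qlogis(nonzero), family = binomial))

# Overall expected count, assuming a Poisson count component:
# E[y] = P(y > 0) * mu / (1 - exp(-mu)).
expected <- nonzero * countmean / (1 - exp(-countmean))
resid    <- obs - expected
rmse     <- sqrt(mean(resid^2))
mae      <- mean(abs(resid))
rho      <- cor(obs, expected, method = "spearman")

metrics <- data.frame(
  measure = c("AUC", "Brier", "PctCorrect", "CalIntercept", "CalSlope",
              "RMSE", "MAE", "Spearman"),
  value   = c(auc, brier, pcc, cal[1], cal[2], rmse, mae, rho)
)
print(metrics)
```

The residuals `resid` could then be plotted against each environmental variable in the new data to look for systematic bias, which is the kind of plot suggested in the reply quoted below.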

On Tue, Jan 8, 2019 at 17:08, Jeff Newmiller (<jdnew...@dcn.davis.ca.us>) wrote:

> That said, the gist of the OP's outline is correct, and the main reason to
> look elsewhere is to get more thorough advice on what statistical concerns
> should be addressed than would be on topic here.
>
> One comment: reviewing plots of differences versus various independent
> variables for systematic biases is a task R is particularly well suited
> for, but discovering which plots highlight issues with your model or data
> takes familiarity with your data (explore) and with theory (which you learn
> elsewhere) and with R (which we can help with if you have more specific
> questions).
>
> On January 8, 2019 10:50:14 AM PST, Bert Gunter <bgunter.4...@gmail.com>
> wrote:
> >This list is (mostly) about R programming. Your query is (mostly) about
> >statistics. So you should post on a statistics site like
> >stats.stackexchange.com
> >not here; I am pretty sure you'll receive lots of answers there.
> >
> >Cheers,
> >Bert
> >
> >
> >Bert Gunter
> >
> >"The trouble with having an open mind is that people keep coming along and
> >sticking things into it."
> >-- Opus (aka Berkeley Breathed in his "Bloom County" comic strip )
> >
> >
> >On Tue, Jan 8, 2019 at 10:18 AM Maria Eugenia Utgés
> ><mariaeugen...@gmail.com> wrote:
> >
> >> Hi R-list,
> >> We have constructed a hurdle model some time ago.
> >> Now we were able to gather new data in the same city (38 new sites), and
> >> want to do an external validation to see if the model still performs ok.
> >> All the books and lectures I have read say it's the best validation option
> >> but...
> >> I have made a (simple) search, but it seems that, as having new data for a
> >> model is rare, I have not found anything with enough depth to
> >> reproduce it/adapt it to hurdle models.
> >>
> >> I have predicted the probability for non-zero counts
> >> nonzero <- 1 - predict(final, newdata = datosnuevos, type = "prob")[, 1]
> >>
> >> and the predicted mean from the count component
> >> countmean <- predict(final, newdata = datosnuevos, type = "count")
> >>
> >> I understand that "newdata" is taking into account the new values for the
> >> independent variables (environmental variables), is that right?
> >>
> >> So, I have to compare the predicted values of y (calculated with the new
> >> values of the environmental variables) with the new observed values.
> >>
> >> That would be using the model (constructed with the old values), having as
> >> input the new variables, and having as output a "new" prediction, to be
> >> contrasted with the "new" observed y.
> >>
> >> This comparison would be by means of AUC, correct classification, and/or
> >> what other options? Would the results of the external validation just be
> >> a % of correctly predicted values? Plots?
> >>
> >> I need some guidance; sorry if the explanation was "basic", but I needed to
> >> write it in my own words so as not to miss any detail.
> >>
> >> Thank you very much in advance,
> >>
> >> María Eugenia Utgés
> >>
> >> CeNDIE-ANLIS
> >> Buenos Aires
> >> Argentina
> >>
> >> [[alternative HTML version deleted]]
> >>
> >> ______________________________________________
> >> R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
> >> https://stat.ethz.ch/mailman/listinfo/r-help
> >> PLEASE do read the posting guide
> >> http://www.R-project.org/posting-guide.html
> >> and provide commented, minimal, self-contained, reproducible code.