I agree with almost all of your points, except the last one. Since I have participated in wheel-reinvention lately, I agree with the bulk of your comment. However, I don't think the fix is as easy as you suspect: RSiteSearch won't help me find a function I need when I don't know the magic words. Some R functions have such unexpected names that only a fastidious source-code reader would find them ("pretty", for example). Still, I agree with your concern.
But as far as the last one is concerned, I think you are mistaken. Explanation below.

On Wed, Jan 4, 2012 at 8:19 AM, Max Kuhn <mxk...@gmail.com> wrote:
>
> (14) [OCD] For binary classification models, model the probability of
> the first level of a factor as the event of interest (again, for
> consistency) Note that glm() does not do this but most others use the
> first level.
>

When the DV is thought of as 0 and 1, where 1 is an "event", "success", or "win" and 0 is a "non-event", "failure", or "loss", and there is to be a single predicted probability, I want it to be the probability of the higher outcome. glm() is doing the thing I want, and I don't know of others that go the other way, except PROC LOGISTIC in SAS. And that has a long history of causing confusion and despair.

I'd like to consider adding one thing to your list, though. I have wished (on this list and elsewhere) that there were a more regular approach for calculating the "newdata" objects that are used in predict. Many packages have re-invented this (datadist in rms, effects), and almost nobody here agreed with my wish for a more standard approach. But if there were a standard approach, it would be much easier to hold up R as an alternative to Stata when users pop up with "marginal effects tables" from Stata that are very difficult to reproduce with R.

Regards,
pj

--
Paul E. Johnson
Professor, Political Science
1541 Lilac Lane, Room 504
University of Kansas

______________________________________________
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel