On 18 Mar 2013, at 15:07, Prof Brian Ripley <rip...@stats.ox.ac.uk> wrote:
> On 18/03/2013 14:51, Cade, Brian wrote:
>> Perhaps a crude but reliable way is to check the number of residuals, e.g.,
>> length(my.model$resid).
>
> Not very reliable (what about zero weights, for example?), and the component
> is usually 'residuals'.
>
> No one has so far mentioned nobs(), which seems to me to be the closest.

Given a my.data where 3 out of 100 rows will be discarded due to NAs:

test = lm(formula = y ~ x + w, data = my.data, model = TRUE)
nobs(test)
[1] 97 # as expected

But if I substitute 1 NA in one of the rows of the housing data:

house.plr = polr(formula = Sat ~ Infl + Type + Cont, data = housing, weights = Freq)
nobs(house.plr)
[1] 1661

because of the weights (which would not be taken into account in a glm()
fit). Because I only care about the raw number of observations, is there a
(hopefully) trivial way of getting nobs(polr.fit) to behave like
nobs(glm.fit)?

BW

Federico

>
>> Brian
>>
>> Brian S. Cade, PhD
>>
>> U. S. Geological Survey
>> Fort Collins Science Center
>> 2150 Centre Ave., Bldg. C
>> Fort Collins, CO 80526-8818
>>
>> email: ca...@usgs.gov <brian_c...@usgs.gov>
>> tel: 970 226-9326
>>
>> On Mon, Mar 18, 2013 at 8:39 AM, Marc Schwartz <marc_schwa...@me.com> wrote:
>>
>>> On Mar 18, 2013, at 7:36 AM, Federico Calboli <f.calb...@imperial.ac.uk> wrote:
>>>
>>>> Dear All,
>>>>
>>>> is there a simple way that covers all regression models to extract the
>>>> number of samples from a data frame/matrix actually used in a regression
>>>> model?
>>>>
>>>> For instance I might have a data set of 100 rows and 4 columns (1 response +
>>>> 3 explanatory variables).
>>>> If 3 samples have one or more NAs in the explanatory variable columns,
>>>> these samples will be dropped in any model:
>>>>
>>>> my.model = lm(y ~ x + w + z, my.data)
>>>> my.model = glm(y ~ x + w + z, my.data, family = binomial)
>>>> my.model = polr(y ~ x + w + z, my.data)
>>>> …
>>>>
>>>> I don't seem to be able to find one single method that works in the exact
>>>> same way -- irrespective of the model type -- to interrogate my.model to
>>>> see how many samples of my.data were actually used. Is there such a
>>>> function or do I need to hack something together?
>>>>
>>>> Best wishes
>>>>
>>>> Federico
>>>
>>>
>>> I don't know that this would be universal to all possible R model
>>> implementations, but it should work for those that at least abide by
>>> "certain standards"[1] relative to the internal use of ?model.frame.
>>>
>>> In the case where model functions use 'model = TRUE' as the default in
>>> their call (e.g. lm(), glm() and MASS::polr()), the returned model object
>>> will have a component called 'model', such that:
>>>
>>> nrow(my.model$model)
>>>
>>> returns the number of rows in the internally created data frame.
>>>
>>> Note that 'model = TRUE' is not the default for many functions, for
>>> example Terry's coxph() in survival or Frank's lrm() in rms.
>>>
>>> Note also that the value of 'na.action' in the modeling function call may
>>> affect whether the number of rows in the retained 'model' data frame is
>>> really the correct value.
>>>
>>> You can also use model.frame(), independently matching arguments passed to
>>> the model function, to replicate what takes place internally in many
>>> modeling functions. The result of model.frame() will be a data frame,
>>> again, subject to similar limitations as above.
>>>
>>> Regards,
>>>
>>> Marc Schwartz
>>>
>>> [1]: http://developer.r-project.org/model-fitting-functions.txt
>>>
>>> ______________________________________________
>>> R-help@r-project.org mailing list
>>> https://stat.ethz.ch/mailman/listinfo/r-help
>>> PLEASE do read the posting guide
>>> http://www.R-project.org/posting-guide.html
>>> and provide commented, minimal, self-contained, reproducible code.
>
> --
> Brian D. Ripley, rip...@stats.ox.ac.uk
> Professor of Applied Statistics, http://www.stats.ox.ac.uk/~ripley/
> University of Oxford, Tel: +44 1865 272861 (self)
> 1 South Parks Road, +44 1865 272866 (PA)
> Oxford OX1 3TG, UK Fax: +44 1865 272595
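
For the raw row count asked about in this thread, one possible sketch is to count the rows of the model frame instead of calling nobs(). This assumes the fitting function stores or can rebuild its model frame, as lm(), glm() and MASS::polr() do by default; it will not cover every model class. The helper name n_rows_used() and the simulated my.data are hypothetical, made up for illustration only.

```r
# Hypothetical helper, not part of R or of any package mentioned in the thread.
# model.frame() on a fitted model returns the data frame used in the fit,
# i.e. after the na.action has removed incomplete rows.
n_rows_used <- function(fit) {
  nrow(model.frame(fit))
}

# Simulated data: 100 rows, 3 of which have an NA in x.
set.seed(1)
my.data <- data.frame(y = rnorm(100), x = rnorm(100), w = rnorm(100))
my.data$x[c(5, 50, 95)] <- NA

fit.lm <- lm(y ~ x + w, data = my.data)
n_rows_used(fit.lm)   # 97
nobs(fit.lm)          # 97

# Case weights, some of them zero: nobs() for lm counts only rows with
# non-zero weight, while the model frame keeps every complete row.
my.data$Freq <- rep(0:3, 25)
fit.wt <- lm(y ~ x + w, data = my.data, weights = Freq)
n_rows_used(fit.wt)   # 97
nobs(fit.wt)          # 73 (24 of the 97 complete rows have weight zero)

# The same helper applied to the weighted polr() fit from the thread,
# run only if MASS is installed.
if (requireNamespace("MASS", quietly = TRUE)) {
  house.plr <- MASS::polr(Sat ~ Infl + Type + Cont,
                          data = MASS::housing, weights = Freq)
  n_rows_used(house.plr)  # rows of housing used, not the sum of Freq
}
```

The caveats already raised in the thread still apply: a model fitted with 'model = FALSE' has to rebuild its frame from the original call and data, and classes without a model.frame() method need separate handling.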