Terry,

xlevels is derived from the factors in the input data.frame at the model.frame stage (and dataClasses is recorded from the same data.frame), but it is the subsequent call to model.matrix that turns character vectors in the data.frame into factors (and then into contrasts). That split is part of the reason you cannot use predict() on an lm object fitted from a data.frame with character vectors in it.
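The division of labour can be checked directly. What follows is a minimal sketch on toy data, using only base R (the data frame `d` has the same shape as the example in this message; the other object names are made up for illustration). It shows only that model.frame leaves the character column alone while model.matrix expands it as if it were a factor. The label recorded in dataClasses for a character column depends on the R version, so the sketch prints it rather than assuming a value.

```r
# Toy data with a character (not factor) predictor.
d <- data.frame(y = 1:10, x = rep(LETTERS[1:3], c(3, 3, 4)),
                stringsAsFactors = FALSE)

# Step 1: model.frame keeps 'x' as a character vector.
mf <- model.frame(y ~ x, data = d)
frame_is_character <- is.character(mf$x)
frame_is_factor <- is.factor(mf$x)

# The class recorded on the terms comes from this stage.
# (Printed, not assumed: the label used for character columns
# has differed between R versions.)
print(attr(attr(mf, "terms"), "dataClasses"))

# Step 2: model.matrix converts 'x' to a factor and applies contrasts.
# Some versions of R warn about the conversion; suppressWarnings()
# keeps the output quiet either way.
mm <- suppressWarnings(model.matrix(y ~ x, data = mf))
design_columns <- colnames(mm)

print(frame_is_character)
print(frame_is_factor)
print(design_columns)
```

Because the conversion happens only inside model.matrix, nothing at the model.frame stage sees a factor for 'x'.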
> d <- data.frame(y=1:10, x=rep(LETTERS[1:3],c(3,3,4)), stringsAsFactors=FALSE)
> fit <- lm(data=d, y~x)
Warning message:
In model.matrix.default(mt, mf, contrasts) :
  variable 'x' converted to a factor
> predict(fit, newdata=data.frame(x=c("A","C"))) # expect c(2.0, 8.5)
Error: variable 'x' was fitted with type "other" but type "factor" was supplied

This is one way that changing the default stringsAsFactors=TRUE can cause problems.

Bill Dunlap
Spotfire, TIBCO Software
wdunlap tibco.com

> -----Original Message-----
> From: r-help-boun...@r-project.org
> [mailto:r-help-boun...@r-project.org] On Behalf Of Terry Therneau
> Sent: Wednesday, March 30, 2011 10:29 AM
> To: Prof Brian Ripley
> Cc: r-help@r-project.org
> Subject: Re: [R] Using xlevels
>
> I see the logic now. I think that more sentences in the document would
> be very helpful, however. What is written is very subtle.
> I suggest the following small expansion for model.matrix.Rd:
>
> \item{data}{a data frame. If the object has a \code{terms} attribute
> then it is assumed to be the result of a call to \code{model.frame},
> otherwise \code{model.frame} will be called first.}
>
> I often forget that model.frames are not a class, but an "implied"
> class based on the presence of a terms component. Many users, I
> suspect, do not even have this starting knowledge.
>
> Off to make changes to model.frame.coxph and model.matrix.coxph...
>
> Thanks for the feedback.
>
> Terry
>
>
> On Wed, 2011-03-30 at 16:36 +0100, Prof Brian Ripley wrote:
> > On Wed, 30 Mar 2011, Terry Therneau wrote:
> >
> > > I'm working on predict.survreg and am confused about xlevels.
> > > The model.frame method has the argument, but none of the standard
> > > methods (model.frame.lm, model.frame.glm) appear to make use of it.
> >
> > But I see this in predict.lm:
> >
> >     m <- model.frame(Terms, newdata, na.action = na.action,
> >                      xlev = object$xlevels)
> >
> > It is used to remap levels in newdata to those used in the fit.
> >
> > > The documentation for model.matrix states:
> > >   xlev: to be used as argument of model.frame if data has no "terms"
> > >   attribute.
> >
> > Well, the code says
> >
> >     if (is.null(attr(data, "terms")))
> >         data <- model.frame(object, data, xlev=xlev)
> >
> > > But the terms attribute has no xlevels information in it, so I find this
> > > statement completely confusing. Any insight is appreciated.
> >
> > It means exactly what it says: a 'data' argument with a terms
> > attribute is considered to be a model frame.
> >
> > >
> > > Terry Therneau
> > >
> > > ______________________________________________
> > > R-help@r-project.org mailing list
> > > https://stat.ethz.ch/mailman/listinfo/r-help
> > > PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> > > and provide commented, minimal, self-contained, reproducible code.
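
To make the two points in the quoted exchange concrete, here is a small sketch using only base R functions (the data and object names are made up for illustration). It assumes the predictor is stored as a factor before fitting, which sidesteps the character-vector problem described at the top of this message. It shows (a) that a model frame is just a data frame carrying a terms attribute, and (b) that passing xlev to model.frame remaps the levels in newdata onto those used in the fit.

```r
# Made-up data; 'x' is converted to a factor up front.
d <- data.frame(y = 1:10, x = factor(rep(LETTERS[1:3], c(3, 3, 4))))
fit <- lm(y ~ x, data = d)

# (a) A model frame is a data frame with a "terms" attribute;
#     a plain data frame has none.
mf <- model.frame(fit)
has_terms_mf <- !is.null(attr(mf, "terms"))
has_terms_d  <- !is.null(attr(d, "terms"))

# (b) newdata only contains two of the three fitted levels.
newdata <- data.frame(x = factor(c("A", "C")))
levels_before <- levels(newdata$x)

# Same pattern as the predict.lm call quoted above: xlev carries the
# levels from the fit, so the new factor is rebuilt on all three.
Terms <- delete.response(terms(fit))
m <- model.frame(Terms, newdata, xlev = fit$xlevels)
levels_after <- levels(m$x)

# With matching levels, prediction gives the group means for A and C.
p <- predict(fit, newdata = newdata)

print(has_terms_mf)
print(has_terms_d)
print(levels_before)
print(levels_after)
print(p)
```

Storing the predictor as a factor before fitting is one possible way to get the levels recorded with the fit; it is a workaround sketch, not a change to how model.frame or model.matrix behave.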