David, thanks. Your explanation does not quite fit, though, as it refers to using function data.frame, while I assigned the new column with $<-. poly() does return an object of classes poly and matrix, not model.matrix, and handing a poly object to function data.frame does behave like I would expect it to:
dat <- data.frame(X1=1:10, X2=LETTERS[1:10])
dat <- data.frame(dat, X1poly = poly(dat$X1,3))
dat            ## five columns displayed
ncol(dat)      ## returns 5
colnames(dat)  ## returns a vector of 5 names

It is just the assignment with "$" that behaves differently, and not only for poly objects but for any matrix object. After I eventually remembered how to get to the documentation of extractors (?"$<-.data.frame"), I found this behavior documented there in the section on Coercion.

Nevertheless, this does seem to contradict the understanding of what a data frame is. I am aware that data frames are lists, but they are of course special lists, requiring that all list elements have the same number of rows. So far I thought that all list elements also have the same number of columns, namely just one. In fact, the documentation of function data.frame states that "A data frame is a list of variables of the same length with unique row names, given class "data.frame".", which would imply such a rule. The possibility of a matrix with more than one column being a column of the data frame contradicts this piece of documentation, since the length of the matrix is not the same as the length of the other columns (e.g. length(poly(dat$X1,3)) is 30, not 10 as for the other variables). Or would one consider the columns of the matrix X1poly the variables, but X1poly a column?

I'm not trying to be difficult, I just find this quite confusing and wonder about the consequences when using such a data frame in analyses.

Regards, Ulrike


David Winsemius wrote:
>
> Dataframes are lists. Look at dat with str and you will see that the
> third column (actually the third list element) is a matrix. It's not
> hard to find the documentation. If you read the documentation on the
> help page for data.frame you should see this:
>
> "If a list or data frame or matrix is passed to data.frame it is as if
> each component or column had been passed as a separate argument
> (except for matrices of class "model.matrix" and those protected by I)."
>
> It seems reasonable that poly() returns an object that is considered a
> model.matrix.
>
> On Jul 17, 2009, at 12:54 PM, Ulrike Grömping wrote:
>
>>
>> Dear UseRs,
>>
>> I just learnt that the number of columns of a data frame is not
>> always what I thought it to be, and I wonder where I should have
>> learnt about this. Consider the following example:
>>
>> dat <- data.frame(X1=1:10, X2=LETTERS[1:10])
>> ncol(dat)      ## evaluates to 2 (of course)
>> dat$X1poly <- poly(dat$X1,3)
>> dat            ## five columns displayed
>> ncol(dat)      ## evaluates to 3
>> colnames(dat)  ## three names (third is X1poly)
>> colnames(dat)[3] <- "newname"
>> dat            ## all three previous X1poly columns renamed
>>
>> This appears intentional, as it treats the column names reasonably.
>> Where is it documented? Are there any other scenarios for which the
>> number of columns displayed when printing a data frame does not
>> coincide with ncol?
>>
>> Regards, Ulrike
>
> David Winsemius, MD
> Heritage Laboratories
> West Hartford, CT
>
> ______________________________________________
> R-help@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide
> http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
>
>

--
View this message in context: http://www.nabble.com/poly-objects-as-data-frame-columns-tp24538067p24540280.html
Sent from the R help mailing list archive at Nabble.com.
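For readers following the thread, here is a minimal sketch contrasting the two ways of adding a poly() result that are discussed in this message. The object names base, dat and dat2 are chosen for illustration only. The comments restate the counts reported in the thread, and the final check shows one way to read the "same length" requirement, namely as a shared number of rows (this reading is an assumption, not a quotation from the documentation).

```r
# Illustrative sketch; 'base', 'dat' and 'dat2' are hypothetical names.
base <- data.frame(X1 = 1:10, X2 = LETTERS[1:10])

# Assignment with $<- stores the whole matrix as ONE list element.
dat <- base
dat$X1poly <- poly(dat$X1, 3)
ncol(dat)            # 3, although printing shows five columns
is.matrix(dat$X1poly)
length(dat$X1poly)   # 30, i.e. 10 rows times 3 matrix columns
NROW(dat$X1poly)     # 10, the same row count as the other columns

# data.frame() splits the matrix into separate one-column variables.
dat2 <- data.frame(base, X1poly = poly(base$X1, 3))
ncol(dat2)           # 5

# Every list element shares the data frame's row count, matrix or not.
all(vapply(dat, NROW, integer(1)) == nrow(dat))
```

Running str(dat), as suggested in the quoted reply, shows the third list element to be a matrix.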