<slaps self in forehead/> I appear to have misinterpreted the help: considering that it explicitly makes note of factors, I wrongly assumed that it would use the levels of a factor automatically. My bad.
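To illustrate the misreading (a minimal sketch with made-up data, not taken from the thread): as far as I can tell, expand.grid treats a factor argument like any other vector and takes one entry per element, not one per level, so the levels have to be extracted explicitly.

```r
# Illustrative data only: a factor with three elements but two levels.
f <- factor(c("a", "b", "a"))

# Passing the factor itself: one row per element, duplicates included.
by_values <- expand.grid(x = f)

# Passing its levels: one row per distinct level.
by_levels <- expand.grid(x = levels(f))

nrow(by_values)  # 3
nrow(by_levels)  # 2
```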
For completeness' sake, my final solution:

getLevels<-function(vec, includeNA=FALSE, onlyOccurring=FALSE)
{
    # Either only the levels that actually occur in vec, or all declared
    # levels. The second branch assumes vec is a factor: levels() returns
    # NULL for a plain character vector.
    if(onlyOccurring) {
        rv<-levels(factor(vec))
    } else {
        rv<-levels(vec)
    }
    #cat("levels so far: ", rv, "\n")
    # Optionally add NA as an extra "level" when vec contains missing values.
    if(includeNA && any(is.na(vec))) {
        rv<-c(rv,NA)
    }
    #cat("levels with na: ", rv, "\n")
    return(rv)
}

# All combinations of the (selected) levels of every column of dfr.
expand.combs<-function(dfr, includeNA=FALSE, onlyOccurring=FALSE)
{
    expand.grid(lapply(dfr, getLevels, includeNA, onlyOccurring))
}

Thx.

Nick Sabbe
--
ping: nick.sa...@ugent.be
link: http://biomath.ugent.be
wink: A1.056, Coupure Links 653, 9000 Gent
ring: 09/264.59.36

-- Do Not Disapprove

-----Original Message-----
From: Berwin A Turlach [mailto:ber...@maths.uwa.edu.au]
Sent: Wednesday 19 January 2011 11:04
To: Nick Sabbe
Cc: r-help@r-project.org
Subject: Re: [R] expand.grid

G'day Nick,

On Wed, 19 Jan 2011 09:43:56 +0100
"Nick Sabbe" <nick.sa...@ugent.be> wrote:

> Given a dataframe
>
> dfr<-data.frame(c1=c("a", "b", NA, "a", "a"), c2=c("d", NA, "d", "e",
> "e"), c3=c("g", "h", "i", "j", "k"))
>
> I would like to have a dataframe with all (unique) combinations of
> all the factors present.

Easy:

R> expand.grid(lapply(dfr, levels))
   c1 c2 c3
1   a  d  g
2   b  d  g
3   a  e  g
4   b  e  g
5   a  d  h
6   b  d  h
7   a  e  h
8   b  e  h
9   a  d  i
10  b  d  i
11  a  e  i
12  b  e  i
13  a  d  j
14  b  d  j
15  a  e  j
16  b  e  j
17  a  d  k
18  b  d  k
19  a  e  k
20  b  e  k

> In fact, I would like a simple solution for these two cases: given
> the three factor columns above, I would like both all _possible_
> combinations of the factor levels, and all _present_ combinations of
> the factor levels (e.g. if I would do this for the first 4 rows of
> dfr, it would contain no combinations with c3="k").

R> dfrpart <- lapply(dfr[1:4,], factor)
R> expand.grid(lapply(dfrpart, levels))
   c1 c2 c3
1   a  d  g
2   b  d  g
3   a  e  g
4   b  e  g
5   a  d  h
6   b  d  h
7   a  e  h
8   b  e  h
9   a  d  i
10  b  d  i
11  a  e  i
12  b  e  i
13  a  d  j
14  b  d  j
15  a  e  j
16  b  e  j

> It would also be nice to be able to choose whether or not NA's are
> included.
R> expand.grid(lapply(dfrpart, function(x) c(levels(x),
+                     if(any(is.na(x))) NA else NULL)))
     c1   c2 c3
1     a    d  g
2     b    d  g
3  <NA>    d  g
4     a    e  g
5     b    e  g
6  <NA>    e  g
7     a <NA>  g
8     b <NA>  g
9  <NA> <NA>  g
10    a    d  h
11    b    d  h
....

HTH.

Cheers,

        Berwin

========================== Full address ============================
Berwin A Turlach                      Tel.: +61 (8) 6488 3338 (secr)
School of Maths and Stats (M019)            +61 (8) 6488 3383 (self)
The University of Western Australia   FAX : +61 (8) 6488 1028
35 Stirling Highway
Crawley WA 6009                e-mail: ber...@maths.uwa.edu.au
Australia                        http://www.maths.uwa.edu.au/~berwin