On Thu, Apr 1, 2010 at 3:05 AM, Peter Dalgaard <pda...@gmail.com> wrote:
> Jeff Brown wrote:
>> Sorry for spamming. I swear I had worked on that problem a long time before
>> posting.
>>
>> But I just figured it out: I have to change the values, which are
>> represented as integers, not strings. So the following code will do it:
>>
>> df <- data.frame (
>>   a = factor( c( "bob", "alice", "bob" ) ),
>>   b = factor( c( "kenny", "alice", "alice" ) )
>> );
>> allLevels <- unique( c( levels( df$a ), levels( df$b ) ) )
>> for (c in colnames(df)) {
>>   df[,c] <- match( df[,c], allLevels);
>>   levels( df[,c] ) <- 1:(length(allLevels))
>> };
>>
>
> Hmm, I think I'd go for something like
>
> allLevels <- unique(unlist(lapply(df,levels)))
> df[] <- lapply(df, factor,
>     levels=allLevels, labels=seq_along(allLevels))

This behaviour always catches me out:

  levels(f) <- l

is very different to

  f <- factor(f, levels = l)

Hadley

-- 
Assistant Professor / Dobelman Family Junior Chair
Department of Statistics / Rice University
http://had.co.nz/
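
To make the difference concrete, here is a minimal sketch. The vectors f and l are illustrative values, not the data frame from earlier in the thread. Assigning to levels() relabels the existing levels by position, so the data values change; factor(f, levels = l) keeps the values and only changes the order (and therefore the integer codes) of the levels.

```r
# Illustrative factor; by default the levels are sorted: "alice", "bob"
f <- factor(c("bob", "alice", "bob"))

# Hypothetical new level vector, in the opposite order
l <- c("bob", "alice")

# 1) Replacing the levels attribute renames level 1 to "bob" and
#    level 2 to "alice"; the integer codes stay put, so the labels swap.
g <- f
levels(g) <- l
print(as.character(g))

# 2) Re-creating the factor matches the existing labels against l;
#    the labels stay the same and only the underlying codes change.
h <- factor(f, levels = l)
print(as.character(h))
print(as.integer(h))
```

With these inputs the first form turns every "bob" into "alice" and vice versa, while the second leaves the labels alone and recodes "bob" as 1 and "alice" as 2. The second form is the one to use when the goal is a common set of levels across columns.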