On Feb 9, 2011, at 3:44 PM, Tim Howard wrote:
All,
Given a data frame and a list containing factor definitions for
certain columns, how can I apply those definitions from the list
rather than doing it the standard way, as shown below? I'm lost in
the world of do.call, assign, and paste, and can't find my way through.
For example:
#set up df
y <- data.frame(colOne = c(1,2,3),
                colTwo = c("apple","pear","orange"))
factor.defs <- list(colOne = list(name = "colOne",
                                  lvl = c(1,2,3,4,5,6)),
                    colTwo = list(name = "colTwo",
                                  lvl = c("apple","pear","orange","fig","banana")))
#A standard way to define levels
y$colTwo <- factor(y$colTwo,
                   levels = c("apple","pear","orange","fig","banana"))
Here's a one-item way of using factor.defs. I thought it would be
pretty easy to loop through it with lapply or do.call, but it's not
immediately obvious once I get down to the nitty-gritty.
> y[factor.defs[[1]]$name] <- factor(y[[factor.defs[[1]]$name]],
+                                    levels = factor.defs[[1]]$lvl)
> y
  colOne colTwo
1      1  apple
2      2   pear
3      3 orange
> levels(y$colOne)
#[1] "1" "2" "3" "4" "5" "6"
Note the different uses of "[" and "[[" on each side of the assignment.
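To make that distinction concrete, here is a minimal sketch using the same example data. The variable names nm and f are only illustrative and do not appear in the original post.

```r
y <- data.frame(colOne = c(1, 2, 3),
                colTwo = c("apple", "pear", "orange"))
nm <- "colOne"   # column name held in a variable (illustrative)

class(y[nm])     # "data.frame": single bracket keeps a one-column data frame
class(y[[nm]])   # "numeric": double bracket extracts the column vector

# factor() needs the vector itself, hence [[ on the right-hand side
f <- factor(y[[nm]], levels = c(1, 2, 3, 4, 5, 6))

# Single-bracket assignment replaces that one column of the data frame
y[nm] <- f
levels(y$colOne)
```

Either y[nm] <- f or y[[nm]] <- f should work for the assignment; the important part is that factor() receives a vector rather than a data frame.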
The following works on your example, but I don't think it would leave
the non-targeted columns in place:
y <- as.data.frame( lapply(factor.defs, function(x) {
         y[[x$name]] <- factor(y[[x$name]], levels = x$lvl) } ) )
y
  colOne colTwo
1      1  apple
2      2   pear
3      3 orange
I wonder if I could leave out the as.data.frame part and make an
assignment in the parent.frame instead?
y <- within(y, lapply(factor.defs, function(x) {
         y[[x$name]] <- factor(y[[x$name]], levels = x$lvl) } ) )
y
  colOne colTwo
1      1  apple
2      2   pear
3      3 orange
Looks promising. You should construct a more complex test set and
report back.
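Printing a data frame does not show unused factor levels, so a fuller test could inspect levels() directly and include a column that has no entry in factor.defs. The sketch below assumes a hypothetical extra column colThree and a hypothetical helper apply_factor_defs, neither of which is from the original post. It uses subset assignment on the data frame instead of as.data.frame or within, so it does not rely on an assignment made inside the anonymous function, which is local to that function.

```r
# Test data; colThree is a hypothetical column with no factor definition
y <- data.frame(colOne = c(1, 2, 3),
                colTwo = c("apple", "pear", "orange"),
                colThree = c(10.5, 20.5, 30.5))

factor.defs <- list(
  colOne = list(name = "colOne", lvl = c(1, 2, 3, 4, 5, 6)),
  colTwo = list(name = "colTwo",
                lvl = c("apple", "pear", "orange", "fig", "banana"))
)

# Hypothetical helper: apply each definition in `defs` to the matching
# column of `df`, leaving every other column untouched.
apply_factor_defs <- function(df, defs) {
  cols <- vapply(defs, function(d) d$name, character(1), USE.NAMES = FALSE)
  df[cols] <- lapply(defs, function(d) factor(df[[d$name]], levels = d$lvl))
  df
}

y <- apply_factor_defs(y, factor.defs)

str(y)             # colThree should still be numeric
levels(y$colOne)
levels(y$colTwo)
```

Because the helper takes the definitions as an argument, the same factor.defs list can be passed to other functions without passing the data along with it.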
--
David.
# I'd like to use the definitions locally but also pass them (but not the
# data) to a function, so, rather than defining each manually each time,
# I'd like to loop through the columns, call them by name, find the
# definitions in the list and use them from there. Before I try to loop
# or use some form of apply, I'd like to get a single factor definition
# working.
# this doesn't seem to see the dataframe properly
do.call(factor, list(paste("y$", factor.defs[2][[1]]$name, sep = ""),
                     levels = factor.defs[2][[1]]$lvl))
#adding "as.name" doesn't help
do.call(factor, list(as.name(paste("y$", factor.defs[2][[1]]$name, sep = "")),
                     levels = factor.defs[2][[1]]$lvl))
#Here's my attempt to mimic the standard way, using assign. Ha! what a joke.
assign(as.name(paste("y$", factor.defs[2][[1]]$name, sep = "")),
       do.call(factor,
               list(as.name(paste("y$", factor.defs[2][[1]]$name, sep = "")),
                    levels = factor.defs[2][[1]]$lvl)))
##Error in function (x = character(), levels, labels = levels, exclude = NA, :
##  object 'y$colTwo' not found
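The attempts above fail because paste() only builds the character string "y$colTwo", and as.name() turns that string into a single symbol literally named `y$colTwo` rather than an expression that extracts a column. A minimal sketch of the difference, using the same example data; the names def, bad, res, and good are illustrative only:

```r
y <- data.frame(colOne = c(1, 2, 3),
                colTwo = c("apple", "pear", "orange"))
factor.defs <- list(
  colOne = list(name = "colOne", lvl = c(1, 2, 3, 4, 5, 6)),
  colTwo = list(name = "colTwo",
                lvl = c("apple", "pear", "orange", "fig", "banana"))
)
def <- factor.defs[["colTwo"]]

# The pasted string is treated as the data itself: a length-one vector
# whose only value is not among the levels, so the result is NA
bad <- do.call(factor, list(paste("y$", def$name, sep = ""),
                            levels = def$lvl))
is.na(bad)

# as.name() yields a symbol called `y$colTwo`; no such object exists,
# so evaluation fails. The error message is captured here as a string.
res <- tryCatch(
  do.call(factor, list(as.name(paste("y$", def$name, sep = "")),
                       levels = def$lvl)),
  error = function(e) conditionMessage(e)
)
res

# Indexing with [[ ]] looks the column up by the name stored in def$name
good <- do.call(factor, list(y[[def$name]], levels = def$lvl))
y[[def$name]] <- good
levels(y$colTwo)
```

With [[ ]] indexing, do.call is not strictly needed; factor(y[[def$name]], levels = def$lvl) does the same thing.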
Any help or perspective (or better way from the beginning!) would be
greatly appreciated.
Thanks in advance!
Tim
______________________________________________
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
David Winsemius, MD
West Hartford, CT