Sören;

You need to somehow add back to the information in "l" the fact that it was sampled from a set with 4 elements. Since you didn't sample from a factor, the level information was lost. Otherwise, you could create that list with unique(l), which in this case only returns 3 elements. So define the set explicitly:

set.l <- c("locA", "locB", "locC", "locD")

sapply(set.l,  function(x) l == x)
       locA  locB  locC  locD
 [1,] FALSE FALSE FALSE  TRUE
 [2,] FALSE FALSE FALSE  TRUE
 [3,] FALSE FALSE FALSE  TRUE
 [4,] FALSE FALSE FALSE  TRUE
 [5,] FALSE  TRUE FALSE FALSE
 [6,]  TRUE FALSE FALSE FALSE
 [7,]  TRUE FALSE FALSE FALSE
 [8,]  TRUE FALSE FALSE FALSE
 [9,] FALSE FALSE FALSE  TRUE
[10,]  TRUE FALSE FALSE FALSE
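
To illustrate the point about the lost level information, here is a minimal sketch. It assumes "l" holds the ten values printed in the original question (typed in by hand here, since sample() output can differ between R versions); the name "f" is introduced only for illustration.

```r
# Values copied from the output printed in the original question
l <- c("locD", "locD", "locD", "locD", "locB",
       "locA", "locA", "locA", "locD", "locA")
set.l <- c("locA", "locB", "locC", "locD")

# unique() only sees values that were actually drawn, so locC is missing
unique(l)

# A factor built with explicit levels keeps all four, including the unused one
f <- factor(l, levels = set.l)
levels(f)
table(f)
```

Had "l" been stored as such a factor from the start, levels(l) could have been used in place of the hand-written set.l.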

It's in the wrong orientation compared with the layout you showed, because sapply() returns one column per element of set.l; t() fixes that, and adding 0 to TRUE/FALSE returns 0/1:
t(sapply(set.l,  function(x) x == l))+0
     [,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8] [,9] [,10]
locA    0    0    0    0    0    1    1    1    0     1
locB    0    0    0    0    1    0    0    0    0     0
locC    0    0    0    0    0    0    0    0    0     0
locD    1    1    1    1    0    0    0    0    1     0

m <- as.data.frame(t(sapply(set.l,  function(x) l == x))+0)
m

The one-liner would be:
m <- as.data.frame(t(sapply(c("locA", "locB", "locC", "locD"), function(x) l == x))+0)
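
If column access such as m$locA is what is wanted in the end, a variant without t() may be closer, since as.data.frame() turns matrix columns into data frame columns. This is only a sketch: "l" is again typed in from the output printed in the original question, and the name "m2" is hypothetical, chosen so as not to overwrite "m".

```r
# Values copied from the output printed in the original question
l <- c("locD", "locD", "locD", "locD", "locB",
       "locA", "locA", "locA", "locD", "locA")
set.l <- c("locA", "locB", "locC", "locD")

# sapply() returns one column per element of set.l, named after it,
# so the untransposed matrix already has locA..locD as column names
m2 <- as.data.frame(sapply(set.l, function(x) l == x) + 0)

names(m2)
m2$locA
```

With the transposed version the locations end up as row names instead, so they would be reached with m["locA", ] rather than m$locA.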

You can also use mapply, but the result does not have the desired row names, and the column names are the result of the sampling, which seems to me potentially confusing:

>  mapply(function(x) x==set.l, l)+0
     locD locD locD locD locB locA locA locA locD locA
[1,]    0    0    0    0    0    1    1    1    0    1
[2,]    0    0    0    0    1    0    0    0    0    0
[3,]    0    0    0    0    0    0    0    0    0    0
[4,]    1    1    1    1    0    0    0    0    1    0
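
The labelling of the mapply result can be adjusted afterwards. A minimal sketch, assuming "l" as printed in the original question; the name "res" is introduced only for illustration.

```r
# Values copied from the output printed in the original question
l <- c("locD", "locD", "locD", "locD", "locB",
       "locA", "locA", "locA", "locD", "locA")
set.l <- c("locA", "locB", "locC", "locD")

res <- mapply(function(x) x == set.l, l) + 0
rownames(res) <- set.l   # label rows with the full set of locations
colnames(res) <- NULL    # drop the sampled values used as column names
res
```

After that the matrix has the same row labels as the t(sapply(...)) result.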

I see that Dimitris has already given you a perfectly workable solution, but these seem to tackle the problem from a different angle.

--
David Winsemius, MD
Heritage Laboratories
West Hartford, CT

On Mar 7, 2009, at 8:39 AM, soeren.vo...@eawag.ch wrote:

How do I "recode" a factor into a binary data frame according to the factor levels:

### example:start
set.seed(20)
l <- sample(rep.int(c("locA", "locB", "locC", "locD"), 100), 10, replace=T)
# [1] "locD" "locD" "locD" "locD" "locB" "locA" "locA" "locA" "locD" "locA"
### example:end

What I want in the end is the following:

m$locA: 0, 0, 0, 0, 0, 1, 1, 1, 0, 1
m$locB: 0, 0, 0, 0, 1, 0, 0, 0, 0, 0
m$locC: 0, 0, 0, 0, 0, 0, 0, 0, 0, 0
m$locD: 1, 1, 1, 1, 0, 0, 0, 0, 1, 0

Instead of 0, NA's would also be fine.

Thanks, Sören

--
Sören Vogel, PhD-Student, Eawag, Dept. SIAM
http://www.eawag.ch, http://sozmod.eawag.ch

______________________________________________
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
