Hello,

First of all, you don't need as.data.frame(cbind(...)); it's much better to simply call data.frame(...). As for the recoding, the following function doesn't assign the median ties at random, but it does split each variable into two groups of equal size:



df <- data.frame(snr=c(1,2,3,4,5,6,7,8,9,10),
        k1=c(1,1,4,2,3,2,2,5,2,2),
        k2=c(1,2,3,2,1,2,1,3,3,2),
        result=c(4,3,5,4,2,6,4,4,2,3))
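
A side note on why data.frame(...) is the safer habit: cbind() builds a matrix first, and a matrix can hold only one type. With all-numeric columns, as in this example, both forms give the same values, but as soon as one column is character the numeric columns are converted as well. A small sketch with made-up columns (id and grp are not from the original data):

```r
# cbind() creates a matrix, so mixing numbers and strings converts
# everything to character before as.data.frame() ever sees it.
d1 <- as.data.frame(cbind(id = 1:3, grp = c("a", "b", "c")))

# data.frame() keeps each column's own type.
d2 <- data.frame(id = 1:3, grp = c("a", "b", "c"))

class(d1$id)  # no longer a numeric type
class(d2$id)  # "integer"
```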

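fun <- function(x){
        n <- length(x)
        med <- median(x)                # compute the median once
        y <- rep(NA, n)
        y[x < med] <- 0                 # strictly below the median
        y[x > med] <- 1                 # strictly above the median
        w <- which(x == med)            # positions tied at the median
        # send the first ties to 0 until the 0 group holds half of the values;
        # floor() keeps the count a whole number when n is odd
        y[w[seq_len(floor(n/2) - sum(x < med))]] <- 0
        y[is.na(y)] <- 1                # the remaining ties go to 1
        y
}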

fun(df$k1)
fun(df$k2)
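
If the ties at the median really should be assigned at random, as the question asks, one possible variant is to shuffle the tied positions before filling up the 0 group. This is only a sketch: the name fun_random is made up for illustration, and it assumes x has no missing values.

```r
# Hypothetical variant of the median split: values tied at the median
# are sent to 0 or 1 at random instead of in order of appearance.
fun_random <- function(x) {
  n <- length(x)
  med <- median(x)
  y <- rep(NA_real_, n)
  y[x < med] <- 0
  y[x > med] <- 1
  w <- which(x == med)
  # number of ties that must become 0 so that group has floor(n/2) members
  n0 <- floor(n / 2) - sum(x < med)
  # shuffle the tied positions; sample.int() on the length avoids the
  # surprise sample() gives when w has a single element
  w <- w[sample.int(length(w))]
  y[w[seq_len(n0)]] <- 0
  y[is.na(y)] <- 1
  y
}

set.seed(1)  # only to make the random assignment reproducible
k1 <- c(1, 1, 4, 2, 3, 2, 2, 5, 2, 2)
res <- fun_random(k1)
table(res)   # five 0s and five 1s
```

Values strictly below or above the median always keep their code; only the assignment of the values equal to the median changes from run to run.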



Hope this helps,

Rui Barradas

On 07-05-2013 17:20, D. Alain wrote:
Dear R-List,

I would like to recode categorical variables into binary data, so that all 
values above the median are coded 1 and all values below it 0, separating each 
variable into two equally large groups (e.g. good performers = 0 vs. bad performers = 1).

I have not succeeded so far in finding a nice solution to do that in R. I 
thought there might be a better way than ordering each column and recoding the 
first 50% into 0 and the second 50% into 1. If I use ifelse I have a problem 
with tied cases that all equal the median.

e.g.
df<-as.data.frame(cbind(snr=c(1,2,3,4,5,6,7,8,9,10),k1=c(1,1,4,2,3,2,2,5,2,2),k2=c(1,2,3,2,1,2,1,3,3,2),result=c(4,3,5,4,2,6,4,4,2,3)))

Now I want to recode k1 and k2 so that half of the values are recoded 0 and 
half are recoded 1, split around the median. The median of k1 is 2, which 
would lead to unequal group sizes if 2 were used as the cutoff, so the values 
k1=2 should be recoded 1 or 0 randomly until both categories have the same size.

something like

df.rec<-as.data.frame(cbind(snr=c(1,2,3,4,5,6,7,8,9,10),k1=c(0,0,1,0,1,1,0,1,0,1),k2=c(0,1,1,0,0,1,0,1,1,0),result=c(4,3,5,4,2,6,4,4,2,3)))

Can anyone help?

Thank you in advance.

Best wishes.
Alain



______________________________________________
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


