Hello,
First of all, you don't need as.data.frame(cbind(...)). cbind() builds a
matrix first, which forces all columns to a single type, so it is simpler
and safer to call data.frame(...) directly.
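To illustrate the difference, here is a minimal sketch with hypothetical mixed-type columns (id and grp are made up for the example, they are not from your data). In your own data all columns are numeric, so the two forms happen to give the same result there.

```r
# Hypothetical example data: one numeric and one character column
via_cbind <- as.data.frame(cbind(id = 1:3, grp = c("a", "b", "c")))
direct    <- data.frame(id = 1:3, grp = c("a", "b", "c"))

# cbind() coerced everything to character before the data frame was built,
# so 'id' is no longer numeric in via_cbind; in 'direct' it still is.
is.numeric(via_cbind$id)
is.numeric(direct$id)
```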
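As for the conversion, the following function does not break ties at
random, but it does split each variable into two equally sized groups.
Values below the median are coded 0 and values above it are coded 1.
Among the values equal to the median, the first ones (in data order) are
coded 0 until half of the values are 0, and the remaining ones are coded 1.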
df <- data.frame(snr = c(1,2,3,4,5,6,7,8,9,10),
                 k1 = c(1,1,4,2,3,2,2,5,2,2),
                 k2 = c(1,2,3,2,1,2,1,3,3,2),
                 result = c(4,3,5,4,2,6,4,4,2,3))
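fun <- function(x){
    n <- length(x)
    med <- median(x)
    y <- rep(NA, n)
    y[x < med] <- 0         # strictly below the median
    y[x > med] <- 1         # strictly above the median
    w <- which(x == med)    # positions of the ties at the median
    # code the first ties as 0 until half of the values are 0
    y[w[seq_len(n/2 - sum(x < med))]] <- 0
    y[is.na(y)] <- 1        # the remaining ties
    y
}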
fun(df$k1)    # 0 0 1 0 1 0 0 1 1 1
fun(df$k2)    # 0 0 1 0 0 1 0 1 1 1
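If you do want the ties at the median assigned at random, as in your original description, a variant along these lines might work. This is only a sketch: the name median_split_random is hypothetical, and it assumes the vector has no missing values.

```r
# Hypothetical helper: same split as above, but the ties at the median
# are assigned to 0 or 1 at random. Assumes x contains no NA values.
median_split_random <- function(x) {
  n <- length(x)
  med <- median(x)
  y <- rep(NA_integer_, n)
  y[x < med] <- 0L
  y[x > med] <- 1L
  ties <- which(x == med)
  # number of zeros still needed so that half of the values are 0
  n_zero <- floor(n / 2) - sum(x < med)
  # sample.int() on positions avoids the sample(x) surprise with one tie
  to_zero <- ties[sample.int(length(ties), n_zero)]
  y[to_zero] <- 0L
  y[is.na(y)] <- 1L
  y
}

df <- data.frame(snr = 1:10,
                 k1 = c(1,1,4,2,3,2,2,5,2,2),
                 k2 = c(1,2,3,2,1,2,1,3,3,2),
                 result = c(4,3,5,4,2,6,4,4,2,3))

set.seed(1)  # only to make the random assignment reproducible
df.rec <- df
df.rec[c("k1", "k2")] <- lapply(df[c("k1", "k2")], median_split_random)
colSums(df.rec[c("k1", "k2")])  # number of 1s in each recoded column
```

Which of the tied values end up as 0 depends on the seed, but for these two columns each group always has five members.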
Hope this helps,
Rui Barradas
On 07-05-2013 17:20, D. Alain wrote:
Dear R-List,
I would like to recode categorical variables into binary data, so that all
values above the median are coded 1 and all values below it are coded 0,
separating each variable into two equally large groups (e.g. good performers = 0
vs. bad performers = 1).
I have not succeeded so far in finding a nice solution to do that in R. I
thought there might be a better way than ordering each column and recoding the
first 50% into 0 and the second into 1. If I use ifelse I have a problem with
cases that share the same rank being all median.
e.g.
df <- as.data.frame(cbind(snr = c(1,2,3,4,5,6,7,8,9,10),
                          k1 = c(1,1,4,2,3,2,2,5,2,2),
                          k2 = c(1,2,3,2,1,2,1,3,3,2),
                          result = c(4,3,5,4,2,6,4,4,2,3)))
Now I want to recode k1 and k2 so that half of the values are recoded 0 and
half are recoded 1, split around the median. The median of k1 is 2, which
would lead to unequal group sizes if 2 were used as the cutoff, so the values
k1 = 2 should be recoded 1 or 0 at random until both categories have the same
size.
something like
df.rec <- as.data.frame(cbind(snr = c(1,2,3,4,5,6,7,8,9,10),
                              k1 = c(0,0,1,0,1,1,0,1,0,1),
                              k2 = c(0,1,1,0,0,1,0,1,1,0),
                              result = c(4,3,5,4,2,6,4,4,2,3)))
Can anyone help?
Thank you in advance.
Best wishes.
Alain
______________________________________________
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.