subject: "[R] r-data partitioning considering two variables (character and numeric)"

Re: [R] r-data partitioning considering two variables (character and numeric)

2018-08-27 Thread Ahmed Attia

Thanks Bert, worked nicely. Yes, genotypes with only one ID will be eliminated before partitioning the data.

Best regards,
Ahmed Attia

On Mon, Aug 27, 2018 at 8:09 PM, Bert Gunter wrote:
> Just partition the unique stand_ID's and select on them using %in%, say:
>
> id <- unique(dataGenoty

Re: [R] r-data partitioning considering two variables (character and numeric)

2018-08-27 Thread Bert Gunter

Sorry, my bad -- careless reading: you need to do the partitioning within genotype. Something like:

    by(dataGenotype, dataGenotype$Genotype, function(x) {
        ## stands belonging to this genotype
        u <- unique(x$stand_ID)
        ## send about half of them to the test set
        tst <- x$stand_ID %in% sample(u, floor(length(u)/2))
        list(test = x[tst, ], train = x[!tst, ])
    })

This will give a
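For readers who want to try the within-genotype idea end to end, here is a minimal sketch. It is not the code from the thread: it swaps by() for lapply() over split(), and the toy data frame is invented for illustration (only the column names Genotype and stand_ID come from the original post).

```r
# Hypothetical toy data; values are invented, column names follow the thread.
dataGenotype <- data.frame(
  Genotype = rep(c("H13", "H15"), each = 6),
  stand_ID = rep(c(7, 18, 21, 3, 9, 20), each = 2),
  stemC    = seq(10, 120, by = 10)
)

set.seed(42)  # make the random draw reproducible

# Partition stands separately inside each genotype.
parts <- lapply(split(dataGenotype, dataGenotype$Genotype), function(x) {
  u    <- unique(x$stand_ID)
  # Sample positions rather than values, then look the stands up.
  pick <- u[sample.int(length(u), floor(length(u) / 2))]
  tst  <- x$stand_ID %in% pick
  list(test = x[tst, , drop = FALSE], train = x[!tst, , drop = FALSE])
})

# Stack the per-genotype pieces back into two data frames.
test  <- do.call(rbind, lapply(parts, `[[`, "test"))
train <- do.call(rbind, lapply(parts, `[[`, "train"))
```

Sampling positions with sample.int() sidesteps the special case in sample(), which treats a single number x as the range 1:x. A genotype with only one stand ends up entirely in train, which matches the remark elsewhere in the thread that such genotypes are dropped before partitioning.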

Re: [R] r-data partitioning considering two variables (character and numeric)

2018-08-27 Thread MacQueen, Don via R-help

And yes, I ignored Genotype, but for the example data none of the stand_ID values are present in more than one Genotype, so it doesn't matter. If that's not true in general, then constructing the grp variable is a little more complex, but the principle is the same.

--
Don MacQueen
Lawrence Live
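One way the "little more complex" case might look, as a sketch only: if the same stand_ID can occur under several genotypes, a combined Genotype/stand_ID key can drive the grouping. The data, the key format, and the training assignment below are all hypothetical.

```r
# Hypothetical data in which stand 7 appears under two genotypes.
mydata <- data.frame(
  Genotype = c("H13", "H13", "H13", "H15", "H15", "H15"),
  stand_ID = c(7, 7, 18, 7, 9, 9),
  stemC    = c(10, 12, 20, 30, 41, 43)
)

# Combine both variables into one key so each pair is identified uniquely.
key <- paste(mydata$Genotype, mydata$stand_ID, sep = "_")

# Hypothetical assignment: these genotype/stand pairs go to training.
training_keys <- c("H13_7", "H15_9")

grp   <- ifelse(key %in% training_keys, "A-training", "B-testing")
parts <- split(mydata, grp)
```

Here parts is a list with one data frame per group, and stand 7 lands in training for H13 but in testing for H15.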

Re: [R] r-data partitioning considering two variables (character and numeric)

2018-08-27 Thread MacQueen, Don via R-help

You could start with split()

    grp <- rep('', nrow(mydata))
    grp[mydata$stand_ID %in% c(7,9,67)] <- 'A-training'
    grp[mydata$stand_ID %in% c(3,18,20,21,32)] <- 'B-testing'
    split(mydata, grp)

or perhaps

    grp <- ifelse(mydata$stand_ID %in% c(7,9,67), 'A-training', 'B-testing')
    split(mydata, grp)

Re: [R] r-data partitioning considering two variables (character and numeric)

2018-08-27 Thread Bert Gunter

Just partition the unique stand_ID's and select on them using %in%, say:

    id <- unique(dataGenotype$stand_ID)
    tst <- sample(id, floor(length(id)/2))
    wh <- dataGenotype$stand_ID %in% tst  ## logical vector
    test <- dataGenotype[wh,]
    train <- dataGenotype[!wh,]

There are a million variations on this t
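A self-contained version of this stand-level split, as a sketch under stated assumptions: the toy data frame is invented (only the column names come from the original post), and set.seed() is added so the draw can be repeated.

```r
# Hypothetical toy data; values are invented, column names follow the thread.
dataGenotype <- data.frame(
  Genotype = rep(c("H13", "H15"), each = 6),
  stand_ID = rep(c(7, 18, 21, 3, 9, 20), each = 2),
  stemC    = seq(10, 120, by = 10)
)

set.seed(1)  # make the random draw reproducible

id  <- unique(dataGenotype$stand_ID)
# Draw half of the stands (rounded down) for the test set.
tst <- id[sample.int(length(id), floor(length(id) / 2))]
wh  <- dataGenotype$stand_ID %in% tst   # TRUE for rows in a test stand

test  <- dataGenotype[wh, ]
train <- dataGenotype[!wh, ]
```

Because whole stands are assigned, all rows of a given stand_ID stay together on one side of the split. This version ignores Genotype, so a genotype can end up entirely in test or entirely in train.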

[R] r-data partitioning considering two variables (character and numeric)

2018-08-27 Thread Ahmed Attia

I would like to partition the following dataset (dataGenotype) based on two variables, Genotype and stand_ID. For example, for Genotype H13, stand_ID number 7 may go to training and stand_ID numbers 18 and 21 may go to testing.

Genotype stand_ID Inventory_date stemC mheight
H13

