If no one has a better solution: split the data frame by class, take a sample of size X from each part, and put the parts back together.
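A minimal sketch of that split / sample / recombine idea in base R, in case it helps. The data frame `df` below is made up to match the described 1200 rows with low=0.3 and high=0.7 (the `age` and `sex` values are random placeholders), and choosing X as the size of the smaller class is my assumption, not something the original question fixes.

```r
# Hypothetical data: 1200 rows, 360 'low' (0.3) and 840 'high' (0.7).
# The age and sex values are random placeholders, not the poster's data.
set.seed(1)  # only so the sketch is reproducible
df <- data.frame(
  age   = sample(10:20, 1200, replace = TRUE),
  sex   = sample(c("m", "f"), 1200, replace = TRUE),
  class = rep(c("low", "high"), times = c(360, 840))
)

# Split the rows by class
parts <- split(df, df$class)

# Assumption: X is the size of the smaller class, so 'low' is kept whole
# (just reordered) and 'high' is subsampled without replacement
X <- min(sapply(parts, nrow))
sampled <- lapply(parts, function(d) d[sample(nrow(d), X), , drop = FALSE])

# Put the pieces back together
balanced <- do.call(rbind, sampled)
rownames(balanced) <- NULL

# Check the new class distribution
prop.table(table(balanced$class))
```

With these made-up numbers `balanced` has 720 rows, 360 per class. To oversample the 'low' class instead, one option would be to pick a larger X and pass `replace = TRUE` to `sample()` for the smaller part.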
hgwelec wrote:
> Dear members,
>
> Consider the following data frame (first 4 rows shown):
>
> age  sex  class
> 15   m    low
> 20   f    high
> 15   f    low
> 10   m    low
>
> In my original data set I have 1200 rows and a class distribution of
> low=0.3 and high=0.7.
>
> My question: how can I create a new data frame like the one shown above, but
> with the 'high' class subsampled so that in the new data frame the class
> distribution is low=0.5 and high=0.5?
>
> I tried looking at the sample function and its prob option, but none of the
> examples I have seen deal with an imbalanced class problem like the one
> shown above.
>
> Thank you in advance

--
View this message in context: http://r.789695.n4.nabble.com/Subsampling-oversampling-from-a-data-frame-tp3965771p3965827.html
Sent from the R help mailing list archive at Nabble.com.