Re: [R] oversampling code

Weidong Gu Mon, 31 Oct 2011 17:44:44 -0700

You should figure out how many samples you want for Y==1 and Y==0, then
sample from the relevant subset, e.g. dfrm[dfrm$Y==1, ], by sampling
row.names(dfrm[dfrm$Y==1, ]) with replace=FALSE. See
?sample
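
A minimal sketch of that, assuming a data frame dfrm with a 0/1 column Y.
The toy data and the helper names (ones, zeros, n0, keep0, sub) are made up
for illustration. It keeps every 1 and draws nine times as many 0s, so the
result is roughly 90% 0s and 10% 1s, provided there are enough 0s:

```r
# Toy data (hypothetical): 1000 rows, roughly 5% ones
set.seed(42)
dfrm <- data.frame(x = rnorm(1000), Y = rbinom(1000, 1, 0.05))

# Row positions of each class
ones  <- which(dfrm$Y == 1)
zeros <- which(dfrm$Y == 0)

# Keep every 1; take nine 0s per 1, capped at the number of 0s available
n0 <- min(length(zeros), 9 * length(ones))

# Draw the 0 rows without replacement (the default for sample.int)
keep0 <- zeros[sample.int(length(zeros), n0)]

# Undersampled data frame: all 1s plus the sampled 0s
sub <- dfrm[c(ones, keep0), ]
prop.table(table(sub$Y))
```

Indexing through sample.int(length(zeros), n0) rather than sample(zeros, n0)
avoids the surprise that sample(x, ...) treats a single number x as 1:x.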


On Mon, Oct 31, 2011 at 8:18 PM, Comcast <dwinsem...@comcast.net> wrote:
>
>
> On Oct 31, 2011, at 1:54 PM, loubna ibn majdoub hassani <loubn...@gmail.com> 
> wrote:
>
>> Hi,
>> I have an unbalanced data set where I want to predict a binary variable Y.
>> I want to undersample by keeping all the 1s and taking just some of
>> the 0s, so that I'll have 90% 0s and 10% 1s.
>
> You haven't given much detail, but something like this will take all of the 
> 1's and 10% of the 0's:
>
> dfrm[c(rownames(dfrm[dfrm$Y==1, ]),
>        sample(rownames(dfrm[dfrm$Y==0, ]), round(0.10 * sum(dfrm$Y==0)))), ]
>> Can you help me do that?
>> Thank you.
>>
>> ______________________________________________
>> R-help@r-project.org mailing list
>> https://stat.ethz.ch/mailman/listinfo/r-help
>> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
>> and provide commented, minimal, self-contained, reproducible code.
>
