Just curious: once you have a model that works well, does it make sense to then tune it against 100% of the dataset (with known outcomes) before applying it to the data you actually want to predict for, or is that a bad approach?
I have done what is explained in this thread many times: take a sample, learn against it, and then test on the remainder. But that is all done with data for which we already know the predicted variable and can compare to validate. So after you're done, should you re-tune with the entire training set? As for method, I am mostly using SVM.

Brian

On Nov 19, 2012, at 2:07 PM, Eddie Smith <eddie...@gmail.com> wrote:

> Thanks a lot! I got some ideas from all the replies and here is the final one.
>
> newdata
>
> select <- sample(nrow(newdata), nrow(newdata) * .7)
> data70 <- newdata[select,]   # training (70%)
> write.csv(data70, "data70.csv", row.names=FALSE)
>
> data30 <- newdata[-select,]  # testing (30%)
> write.csv(data30, "data30.csv", row.names=FALSE)
>
> Cheers
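For what it's worth, the usual recipe is: tune on the 70% split, estimate performance on the 30% hold-out, and only then refit the chosen parameters on all of the labelled data before predicting the genuinely unlabelled cases. A minimal sketch of that workflow, assuming the e1071 package and a data frame `newdata` with a factor response `y` (both placeholder names, not taken from the posts above):

## Sketch only: tune on 70%, validate on 30%, refit final model on 100%
library(e1071)

## 70/30 split (floor() keeps the sample size an integer)
select  <- sample(nrow(newdata), floor(0.7 * nrow(newdata)))
train70 <- newdata[select, ]
test30  <- newdata[-select, ]

## Grid-search the SVM parameters on the training portion only
tuned <- tune(svm, y ~ ., data = train70,
              ranges = list(cost = 2^(-2:4), gamma = 2^(-4:0)))

## Check generalisation on the held-out 30%
pred <- predict(tuned$best.model, newdata = test30)
mean(pred == test30$y)   # hold-out accuracy

## Once satisfied, refit with the chosen parameters on ALL labelled rows
final <- svm(y ~ ., data = newdata,
             cost  = tuned$best.parameters$cost,
             gamma = tuned$best.parameters$gamma)
## `final` is then the model used to predict the truly unlabelled data

The honest performance estimate still comes from the 30% hold-out (or from cross-validation); refitting on the full labelled set afterwards just lets the final model benefit from every known outcome.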