Basically, createDataPartition is used when you need to make one or more simple two-way splits of your data. For example, if you want to make a training and test set and keep your classes balanced, this is what you could use. It can also make multiple splits of this kind (also known as leave-group-out CV, Monte Carlo CV, or repeated training/test splits).
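As a rough sketch of that kind of split (the toy data frame `dat`, its column names, the seed, and the 75% proportion are made up for illustration; only createDataPartition itself comes from caret):

```r
library(caret)

set.seed(1)
# Hypothetical toy data: 100 rows, two classes in a 60/40 ratio
dat <- data.frame(x = rnorm(100),
                  y = factor(rep(c("a", "b"), times = c(60, 40))))

# One stratified two-way split; pass the outcome vector, not the data frame
idx <- createDataPartition(dat$y, p = 0.75, list = FALSE, times = 1)
train <- dat[idx, ]
test  <- dat[-idx, ]

# The class proportions should be roughly the same in both pieces
print(prop.table(table(train$y)))
print(prop.table(table(test$y)))

# Several splits of the same kind (leave-group-out / Monte Carlo CV):
# with the default list = TRUE, each list element is one set of training rows
splits <- createDataPartition(dat$y, p = 0.75, times = 5)
print(length(splits))
```

The sampling is done within each class, which is why the balance carries over to the training and test sets.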
createFolds is exclusively for k-fold CV. Their usage is similar when you use the returnTrain = TRUE option in createFolds.

Max

On Sun, Oct 2, 2011 at 4:00 PM, Steve Lianoglou <mailinglist.honey...@gmail.com> wrote:
> Hi,
>
> On Sun, Oct 2, 2011 at 3:54 PM, <bby2...@columbia.edu> wrote:
>> Hi Steve,
>>
>> Thanks for the note. I did try the example and the result didn't make sense
>> to me. For splitting a vector, what you describe is a big difference btw
>> them. For splitting a dataframe, I now wonder if these 2 functions are the
>> wrong choices. They seem to split the columns, at least in the few things I
>> tried.
>
> Sorry, I'm a bit confused now as to what you are after.
>
> You don't pass in a data.frame into any of the
> createFolds/DataPartition functions from the caret package.
>
> You pass in a *vector* of labels, and these functions tells you which
> indices into the vector to use as examples to hold out (or keep
> (depending on the value you pass in for the `returnTrain` argument))
> between each fold/partition of your learning scenario (eg. cross
> validation with createFolds).
>
> You would then use these indices to keep (remove) the rows of a
> data.frame, if that is how you are storing your examples.
>
> Does that make sense?
>
> -steve
>
> --
> Steve Lianoglou
> Graduate Student: Computational Systems Biology
>  | Memorial Sloan-Kettering Cancer Center
>  | Weill Medical College of Cornell University
> Contact Info: http://cbio.mskcc.org/~lianos/contact
>
> ______________________________________________
> R-help@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
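To make the point about passing a vector of labels and then indexing the rows concrete, here is a minimal sketch with createFolds. The data frame `dat`, its columns, the seed, and k = 5 are assumptions chosen for illustration:

```r
library(caret)

set.seed(1)
# Hypothetical toy data: 100 rows, two classes in a 60/40 ratio
dat <- data.frame(x = rnorm(100),
                  y = factor(rep(c("a", "b"), times = c(60, 40))))

# Default (returnTrain = FALSE): each list element holds the row
# indices to hold out for that fold
folds <- createFolds(dat$y, k = 5)

# returnTrain = TRUE: each element holds the rows to train on instead
train_folds <- createFolds(dat$y, k = 5, returnTrain = TRUE)

for (i in seq_along(train_folds)) {
  keep <- train_folds[[i]]
  train <- dat[keep, ]    # rows to fit on
  test  <- dat[-keep, ]   # held-out rows for this fold
  # fit a model on `train` and evaluate it on `test` here
}
```

Note that the indices select rows of the data frame; passing the data frame itself to either function is not the intended usage.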