How do I load a dataset in Apache Spark?
Where can I find sources of massive datasets?
On Wed, Jul 22, 2015 at 4:50 AM, Ron Gonzalez wrote:
I'd use Random Forest. It will give you better generalizability. There
are also a number of things you can do with RF that allow you to train on
samples of the massive data set and then just average over the resulting
models...
Thanks,
Ron
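
A minimal sketch of the "train on samples, then average the resulting models" idea from the message above. It is plain Python rather than Spark MLlib, and the one-feature threshold classifier is only a stand-in for a real decision tree. The function names (fit_stump, train_on_samples, predict) and the toy data are hypothetical, chosen for illustration.

```python
import random
from statistics import mean


def fit_stump(rows):
    """Fit a one-feature threshold classifier (a stand-in for a real tree).

    rows is a list of (x, label) pairs with label in {0, 1}.
    The stump predicts 1 when x is greater than the chosen threshold.
    """
    best_threshold, best_errors = None, None
    for threshold, _ in rows:
        # Count how many rows this candidate threshold misclassifies.
        errors = sum((x > threshold) != bool(label) for x, label in rows)
        if best_errors is None or errors < best_errors:
            best_threshold, best_errors = threshold, errors
    return best_threshold


def train_on_samples(rows, n_models, sample_size, seed=0):
    """Train one small model per random sample of the full data set."""
    rng = random.Random(seed)
    return [fit_stump(rng.sample(rows, sample_size)) for _ in range(n_models)]


def predict(thresholds, x):
    """Average the votes of the per-sample models and round to a label."""
    score = mean(1.0 if x > t else 0.0 for t in thresholds)
    return 1 if score >= 0.5 else 0


# Toy data (an assumption for this sketch): the label is 1 when x > 50.
data = [(x, int(x > 50)) for x in range(101)]
models = train_on_samples(data, n_models=25, sample_size=20)

print(predict(models, 90), predict(models, 10))
```

Each model only ever sees 20 of the 101 rows, which is the point: on a genuinely massive data set every sample can fit in memory on one worker, and only the small fitted models need to be combined.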
On 07/21/2015 02:17 PM, Olivier Girardot wrote:
It depends on your data and, I guess, on the time/performance goals you have
for both training and prediction, but for a quick answer: yes :)
2015-07-21 11:22 GMT+02:00 Chintan Bhatt:
> Which classifier can be useful for mining massive datasets in Spark?
> Decision Tree can be a good choice as per scalabilit