The Kaggle data is not in LIBSVM format, so you'd have to do some transformation.
The Criteo and KDD Cup datasets are, if I recall, fairly large. The Criteo ad
prediction data is around 2-3 GB compressed, I think.
To my knowledge, these are the largest binary classification datasets I've come
across.
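
As a rough illustration of the kind of transformation meant above, here is a minimal sketch that turns a CSV with a header row into LIBSVM lines. It assumes every feature column is already numeric and that the label lives in a column named `label`; both the function name `csv_to_libsvm` and that column name are hypothetical, not taken from any particular Kaggle dataset. Categorical fields would need to be encoded as numbers first.

```python
import csv
import io


def csv_to_libsvm(csv_text, label_column="label"):
    """Convert CSV text (with a header row) into a list of LIBSVM lines.

    Assumption: all non-label columns hold numeric values.
    """
    reader = csv.DictReader(io.StringIO(csv_text))
    # Every column except the label is treated as a feature, in file order.
    feature_names = [name for name in reader.fieldnames if name != label_column]

    lines = []
    for row in reader:
        parts = [row[label_column]]
        # LIBSVM feature indices are 1-based and listed in ascending order.
        for index, name in enumerate(feature_names, start=1):
            value = float(row[name])
            # The format is sparse, so zero-valued features are omitted.
            if value != 0.0:
                parts.append(f"{index}:{value:g}")
        lines.append(" ".join(parts))
    return lines


sample = "label,a,b,c\n1,0.5,0,2\n0,0,0,3.25\n"
for line in csv_to_libsvm(sample):
    print(line)
```

For files in the multi-GB range you would probably want to stream rows to an output file rather than collect them in a list, but the per-row logic would stay the same.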
Nick Pentreath wrote
> Take a look at Kaggle competition datasets
> - https://www.kaggle.com/competitions
I was looking for files in LIBSVM format and never found anything of a larger
size on Kaggle. Most competitions I've seen need data processing and feature
generation, but maybe I've to take a s
Take a look at Kaggle competition datasets - https://www.kaggle.com/competitions
For SVM, there are a couple of ad click prediction datasets of pretty large size.
For graph stuff, SNAP has large network data: https://snap.stanford.edu/data/
On Thu, Jul 3, 2014 at 3