Re: Get data from CSV files to feed SparkML library methods

Yanbo Liang Wed, 10 Aug 2016 06:39:59 -0700

You can load the dataset from a CSV file into a DataFrame and use
VectorAssembler to assemble the necessary columns into a single column of
vector type. The output column of VectorAssembler will be the features
column, which should be fed into the ML estimator for model training. You
can refer to the VectorAssembler documentation:
http://spark.apache.org/docs/latest/ml-features.html#vectorassembler


Thanks
Yanbo

2016-08-10 4:16 GMT-07:00 Minudika Malshan <minudika...@gmail.com>:

> Hi all,
>
> I'm using spark ml library and need to train a model using data extracted
> from a CSV file.
> I found that we can load datasets from LibSVM files to spark ML methods.
> As far as I understood, the data should be represented as labeled points
> in order to feed the ML methods.
> Is there a way to load a dataset from a CSV file instead of a LibSVM file?
> Or do I need to convert the CSV file to LibSVM format? If so, could you
> please let me know a way to do that?
> Your help would be much appreciated.
>
> Thank you!
> Minudika
>
