You can load the dataset from a CSV file and use VectorAssembler to assemble the necessary columns into a single column of vector type. The output column of VectorAssembler is the features column, which should be fed into the ML estimator for model training. See the VectorAssembler documentation: http://spark.apache.org/docs/latest/ml-features.html#vectorassembler

Thanks
Yanbo

2016-08-10 4:16 GMT-07:00 Minudika Malshan <minudika...@gmail.com>:

> Hi all,
>
> I'm using spark ml library and need to train a model using data extracted
> from a CSV file.
> I found that we can load datasets from LibSVM files to spark ML methods.
> As far as i understood, the data should be represented as labeled points
> in-order to feed the ml methods.
> Is there a way to load dataset from a CSV file instead of a LibSVM file?
> Or do I need to convert the CSV file to LibSVM format? If so, could you
> please let me know a way to do that.?
> Your help would be much appreciated.
>
> Thank you!
> Minudika
>