Hi Deb,

There is a saveAsLibSVMFile in MLUtils now. Also, I submitted a PR for standardizing the text format of vectors and labeled points:
https://github.com/apache/spark/pull/685
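
For reference, here is a minimal plain-Scala sketch of the LIBSVM line layout, i.e. a label followed by space-separated index:value pairs. It is not the MLUtils implementation. SparsePoint and LibSvmFormatSketch are hypothetical names used only for illustration, so the snippet runs without Spark on the classpath. It assumes the usual LIBSVM convention of 1-based feature indices, so the 0-based indices of a sparse vector are shifted by one. It also formats each pair explicitly rather than rewriting the tuple's toString output.

```scala
object LibSvmFormatSketch {
  // Hypothetical stand-in for a sparse labeled point (not the MLlib class).
  final case class SparsePoint(label: Double, indices: Array[Int], values: Array[Double])

  // Formats 0-based sparse entries as space-separated "index:value" pairs,
  // shifting indices by one (assumption: the reader expects 1-based LIBSVM indices).
  def toLibSvm(indices: Array[Int], values: Array[Double]): String = {
    require(indices.length == values.length, "indices and values must have the same length")
    indices.zip(values).map { case (i, v) => s"${i + 1}:$v" }.mkString(" ")
  }

  // One LIBSVM line: the label, a space, then the formatted features.
  def toLibSvm(point: SparsePoint): String =
    s"${point.label} ${toLibSvm(point.indices, point.values)}"

  def main(args: Array[String]): Unit = {
    val point = SparsePoint(1.0, Array(0, 3), Array(0.5, 2.0))
    println(toLibSvm(point)) // prints "1.0 1:0.5 4:2.0"
  }
}
```

Building each pair with string interpolation does not depend on how a tuple happens to print, which the chained replace calls in the helpers quoted below rely on.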
Best,
Xiangrui

On Sun, May 11, 2014 at 9:40 AM, Debasish Das <debasish.da...@gmail.com> wrote:
> Hi,
>
> I need to change the toString on LabeledPoint to libsvm format so that I
> can dump RDD[LabeledPoint] as a format that could be read by sparse
> glmnet-R and other packages to benchmark mllib classification accuracy...
>
> Basically I have to change the toString of LabeledPoint and toString of
> SparseVector....
>
> Should I add it as a PR or is it already being added?
>
> I added these functions toLibSvm in my internal util class for now...
>
> def toLibSvm(labelPoint: LabeledPoint): String = {
>   labelPoint.label.toString + " " +
>     toLibSvm(labelPoint.features.asInstanceOf[SparseVector])
> }
>
> def toLibSvm(features: SparseVector): String = {
>   val indices = features.indices
>   val values = features.values
>   indices.zip(values).mkString(" ").replace(',', ':').replace("(", "").replace(")","")
> }
>
> Thanks.
> Deb