subject:"Filtering RDD Using Spark.mllib's ChiSqSelector"

Re: Filtering RDD Using Spark.mllib's ChiSqSelector

2016-07-19 Thread Tobi Bosede

Thanks Yanbo, will try that!

On Sun, Jul 17, 2016 at 10:26 PM, Yanbo Liang wrote:
> Hi Tobi,
>
> Thanks for clarifying the question. It's very straightforward to convert
> the filtered RDD to a DataFrame; you can refer to the following code snippet:
>
> from pyspark.sql import Row
>
> rdd2 = filter

Re: Filtering RDD Using Spark.mllib's ChiSqSelector

2016-07-17 Thread Yanbo Liang

Hi Tobi,

Thanks for clarifying the question. It's very straightforward to convert the filtered RDD to a DataFrame; you can refer to the following code snippet:

    from pyspark.sql import Row

    rdd2 = filteredRDD.map(lambda v: Row(features=v))
    df = rdd2.toDF()

Thanks
Yanbo

2016-07-16 14:51 GMT-07:00

Re: Filtering RDD Using Spark.mllib's ChiSqSelector

2016-07-16 Thread Tobi Bosede

Hi Yanbo,

Appreciate the response. I might not have phrased this correctly, but I really wanted to know how to convert the pipeline RDD into a DataFrame. I have seen the example you posted. However, I need to transform all my data, not just 1 line. So I did successfully use map to use the chisq sel

Re: Filtering RDD Using Spark.mllib's ChiSqSelector

2016-07-16 Thread Yanbo Liang

Hi Tobi,

The MLlib RDD-based API does support applying a transformation to both a Vector and an RDD, but you did not use it in the appropriate way. Suppose you have an RDD with a LabeledPoint in each line; you can refer to the following code snippet to train a ChiSqSelectorModel and do the transformation:

Filtering RDD Using Spark.mllib's ChiSqSelector

2016-07-14 Thread Tobi Bosede

Hi everyone,

I am trying to filter my features based on the spark.mllib ChiSqSelector.

    filteredData = vectorizedTestPar.map(lambda lp: LabeledPoint(lp.label, model.transform(lp.features)))

However when I do the following I get the error below. Is there any other way to filter my data to avoid th

5 matches
