Dear Spark Users,

I was wondering about the following.

I have a trained cross-validator model, model: CrossValidatorModel,
and I want to predict a score for features: RDD[Features].

Right now I have to convert features to a DataFrame and then perform
the predictions as follows:
"""
val sqlContext = new SQLContext(fea
.setInputCol(*"category"*)
.setOutputCol(*"label"*)
*val *pipeline = *new *Pipeline().setStages(*Array*(indexer, rf))
*val *model: PipelineModel = pipeline.fit(trainingData)
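The scoring step then looks roughly like this (a sketch; it assumes Features is a case class, so createDataFrame can infer the schema, and cvModel is an illustrative name for the trained CrossValidatorModel above):

// CrossValidatorModel only exposes transform on a DataFrame,
// so the RDD[Features] has to be converted first.
val featuresDF = sqlContext.createDataFrame(features)
val scored = cvModel.transform(featuresDF)  // adds a "prediction" column
scored.select("prediction").show()

Is there a way to score the RDD directly, without this round trip?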
Thanks,
Pengcheng
Looking into the work folder of the problematic application, it seems that
the application keeps creating executors, and the error log of the worker is
as below:

Exception in thread "main" java.lang.reflect.UndeclaredThrowableException:
Unknown exception in doAs
        at org.apache.hadoop.security.UserGroupInformation.doAs(...)

This happens when I submit two applications at the same time. One of the two
applications will run to end, but the other will always stay in the running
state and never exit or release resources.

Does anyone meet the same issue?

The Spark version I am using is Spark 1.1.1.
Best Regards,
Pengcheng
1. But this could take effect globally, and my pipeline involves lots of
other operations which I do not want to set a limit on. Is there a better
way to achieve this?

Thanks!
Pengcheng
persisted `newRdd`. My concern is that, if the RDD is not evaluated and
persisted at the moment persist() is called, I need to move the places where
persist()/unpersist() are called to make the job more efficient.

Thanks,
Pengcheng
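A minimal sketch of the behavior in question, with illustrative names: persist() is lazy and only records the storage level; the cache is actually populated by the first action that evaluates the RDD.

import org.apache.spark.{SparkConf, SparkContext}
import org.apache.spark.storage.StorageLevel

val sc = new SparkContext(new SparkConf().setAppName("persist-demo").setMaster("local[*]"))

val newRdd = sc.parallelize(1 to 1000000).map(_ * 2)
newRdd.persist(StorageLevel.MEMORY_ONLY)  // lazy: only marks newRdd for caching

val n = newRdd.count()  // first action: evaluates the lineage and fills the cache
val s = newRdd.sum()    // this job reuses the cached partitions

newRdd.unpersist()      // drop the cache only after the last job that reuses newRdd

So moving the persist() call earlier or later changes nothing by itself; what matters is where it sits relative to the first action, and that unpersist() runs only after the last reuse.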
For example, suppose RDD[Seq[Int]] = [[1,2,3], [2,4,5], [1,2], [7,8,9]]; the
result should be [[1,2,3,4,5], [7,8,9]], i.e. Seqs that share an element are
merged into one.

Since the RDD[Seq[Int]] is very large, I cannot do it in the driver program.
Is it possible to get it done using distributed groupBy/map/reduce, etc.?

Thanks in advance,
Pengcheng
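One way this could be done distributedly (a sketch, not from the original thread): treat it as a connected-components problem, where every Int is a vertex, the elements of each Seq are chained by edges, and GraphX merges the overlapping Seqs. All names are illustrative:

import org.apache.spark.{SparkConf, SparkContext}
import org.apache.spark.graphx.{Edge, Graph}
import org.apache.spark.rdd.RDD

val sc = new SparkContext(new SparkConf().setAppName("merge-sets").setMaster("local[*]"))

val data: RDD[Seq[Int]] = sc.parallelize(Seq(Seq(1,2,3), Seq(2,4,5), Seq(1,2), Seq(7,8,9)))

// Chain the elements of each Seq so each Seq forms one connected path;
// a singleton Seq gets a self-loop so its element is not dropped.
val edges: RDD[Edge[Int]] = data.flatMap { s =>
  val pairs = if (s.size == 1) Seq((s.head, s.head)) else s.zip(s.tail)
  pairs.map { case (a, b) => Edge(a.toLong, b.toLong, 0) }
}

// connectedComponents labels every vertex with the smallest vertex id in its component.
val components = Graph.fromEdges(edges, defaultValue = 0).connectedComponents().vertices

val merged: RDD[Seq[Int]] = components
  .map { case (vertex, component) => (component, vertex.toInt) }
  .groupByKey()
  .map { case (_, vs) => vs.toSeq.distinct.sorted }

merged.collect().foreach(println)  // prints List(1, 2, 3, 4, 5) and List(7, 8, 9)

Note that the final groupByKey assumes each merged component fits in one executor's memory; for very large components the grouping step would need a different representation.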