Thanks Ted, but I'm still confused: as the spark-shell transcript below shows (line 4), myrdd is already a MyRDD at runtime, so shouldn't that be enough for customMethod to be found?
1 scala> val part = new org.apache.spark.HashPartitioner(10)
2 scala> val baseRDD = sc.parallelize(1 to 100000).map(x => (x, "hello")).partitionBy(part).cache()
3 scala> val myrdd = baseRDD.customable(part) // here customable is a method added to the abstract RDD to create MyRDD
*4 myrdd: org.apache.spark.rdd.RDD[(Int, String)] = MyRDD[3]* at customable at
5 <console>:28
6 scala> *myrdd.customMethod(bulk)*
*7 error: value customMethod is not a member of org.apache.spark.rdd.RDD[(Int, String)]*

On Mon, Mar 28, 2016 at 12:50 AM, Ted Yu <yuzhih...@gmail.com> wrote:

> bq. def customable(partitioner: Partitioner): RDD[(K, V)] =
> self.withScope {
>
> In the above, you declare the return type as RDD, while you actually intended to
> declare MyRDD as the return type.
> Or, you can cast myrdd as MyRDD in spark-shell.
>
> BTW I don't think it is good practice to add a custom method to the base RDD.
>
> On Sun, Mar 27, 2016 at 9:44 AM, Tenghuan He <tenghua...@gmail.com> wrote:
>
>> Hi Ted,
>>
>> The code is running in spark-shell:
>>
>> scala> val part = new org.apache.spark.HashPartitioner(10)
>> scala> val baseRDD = sc.parallelize(1 to 100000).map(x => (x,
>> "hello")).partitionBy(part).cache()
>> scala> val myrdd = baseRDD.customable(part) // here customable is a
>> method added to the abstract RDD to create MyRDD
>> myrdd: org.apache.spark.rdd.RDD[(Int, String)] = MyRDD[3] at customable at
>> <console>:28
>> scala> *myrdd.customMethod(bulk)*
>> *error: value customMethod is not a member of
>> org.apache.spark.rdd.RDD[(Int, String)]*
>>
>> and the customable method in PairRDDFunctions.scala is
>>
>> def customable(partitioner: Partitioner): RDD[(K, V)] = self.withScope {
>>   new MyRDD[K, V](self, partitioner)
>> }
>>
>> Thanks :)
>>
>> On Mon, Mar 28, 2016 at 12:28 AM, Ted Yu <yuzhih...@gmail.com> wrote:
>>
>>> Can you show the full stack trace (or top 10 lines) and the snippet
>>> using your MyRDD?
>>>
>>> Thanks
>>>
>>> On Sun, Mar 27, 2016 at 9:22 AM, Tenghuan He <tenghua...@gmail.com> wrote:
>>>
>>>> Hi everyone,
>>>>
>>>> I am creating a custom RDD which extends RDD and adds a custom
>>>> method; however, the custom method cannot be found.
>>>> The custom RDD looks like the following:
>>>>
>>>> class MyRDD[K, V](
>>>>     var base: RDD[(K, V)],
>>>>     part: Partitioner
>>>>   ) extends RDD[(K, V)](base.context, Nil) {
>>>>
>>>>   def *customMethod*(bulk: ArrayBuffer[(K, (V, Int))]): MyRDD[K, V] = {
>>>>     // ... custom code here
>>>>   }
>>>>
>>>>   override def compute(split: Partition, context: TaskContext): Iterator[(K, V)] = {
>>>>     // ... custom code here
>>>>   }
>>>>
>>>>   override protected def getPartitions: Array[Partition] = {
>>>>     // ... custom code here
>>>>   }
>>>>
>>>>   override protected def getDependencies: Seq[Dependency[_]] = {
>>>>     // ... custom code here
>>>>   }
>>>> }
>>>>
>>>> In spark-shell, it turns out that the overridden methods work well, but
>>>> calling myrdd.customMethod(bulk) throws:
>>>> <console>:33: error: value customMethod is not a member of
>>>> org.apache.spark.rdd.RDD[(Int, String)]
>>>>
>>>> Can anyone tell me why the custom method cannot be found?
>>>> Or do I have to add customMethod to the abstract RDD and then
>>>> override it in the custom RDD?
>>>>
>>>> PS: spark version: 1.5.1
>>>>
>>>> Thanks & Best regards
>>>>
>>>> Tenghuan
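
In case it helps pin down what I'm asking, here is my reading of the two fixes Ted suggested, written against the MyRDD / customable code quoted above. This is only a sketch of my understanding (the asInstanceOf cast is my guess at what "cast myrdd as MyRDD" means), not something I have verified yet:

// Option 1: declare the precise return type in PairRDDFunctions, so the
// compiler statically knows the result is a MyRDD and can see customMethod.
def customable(partitioner: Partitioner): MyRDD[K, V] = self.withScope {
  new MyRDD[K, V](self, partitioner)
}

// Option 2: leave customable returning RDD[(K, V)] and cast in spark-shell.
// The object is already a MyRDD at runtime; the cast only changes the static
// type the compiler uses to resolve customMethod.
// (Assumes MyRDD is visible/imported in the shell, and bulk is an
// ArrayBuffer[(Int, (String, Int))] matching the customMethod signature.)
scala> val myrdd = baseRDD.customable(part).asInstanceOf[MyRDD[Int, String]]
scala> myrdd.customMethod(bulk)

Either way the runtime object is the same; the difference is only what the compiler knows about it at the call site, which would explain why the error message mentions RDD[(Int, String)] rather than MyRDD.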