Thanks Ted,

but I still have a doubt: as line 4 of the spark-shell session below shows,
myrdd is already a MyRDD, so shouldn't the call work?

1 scala> val part = new org.apache.spark.HashPartitioner(10)
2 scala> val baseRDD = sc.parallelize(1 to 100000).map(x => (x, "hello")).partitionBy(part).cache()
3 scala> val myrdd = baseRDD.customable(part)  // here customable is a method added to the abstract RDD to create MyRDD
4 myrdd: org.apache.spark.rdd.RDD[(Int, String)] = MyRDD[3] at customable at
5 <console>:28
6 scala> myrdd.customMethod(bulk)
7 error: value customMethod is not a member of org.apache.spark.rdd.RDD[(Int, String)]
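
For reference, I guess the cast you mentioned would look roughly like the sketch
below (untested; it assumes MyRDD can be imported in the shell, and bulk here is
only a placeholder argument):

scala> import scala.collection.mutable.ArrayBuffer
scala> val bulk = new ArrayBuffer[(Int, (String, Int))]()   // placeholder argument for the example
scala> val typed = myrdd.asInstanceOf[MyRDD[Int, String]]   // recover the subtype so the compiler can see customMethod
scala> typed.customMethod(bulk)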

On Mon, Mar 28, 2016 at 12:50 AM, Ted Yu <yuzhih...@gmail.com> wrote:

> bq.   def customable(partitioner: Partitioner): RDD[(K, V)] =
> self.withScope {
>
> In the above, you declare the return type as RDD, while you actually intended
> to declare MyRDD as the return type.
> Alternatively, you can cast myrdd to MyRDD in spark-shell.
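>
> To make that concrete, a minimal sketch based on your snippet, with only the
> declared return type changed (not compiled here):
>
>   def customable(partitioner: Partitioner): MyRDD[K, V] = self.withScope {
>     new MyRDD[K, V](self, partitioner)
>   }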
>
> BTW I don't think it is a good practice to add a custom method to the base RDD.
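>
> If you want to avoid patching Spark's sources at all, one common alternative is
> an implicit class in your own code, roughly like the sketch below (names are
> illustrative only; depending on how MyRDD is declared you may also need
> ClassTag context bounds):
>
> import org.apache.spark.Partitioner
> import org.apache.spark.rdd.RDD
>
> object MyRDDImplicits {
>   // Enriches any RDD[(K, V)] with a customable method that wraps it in MyRDD.
>   implicit class CustomableOps[K, V](self: RDD[(K, V)]) {
>     def customable(partitioner: Partitioner): MyRDD[K, V] =
>       new MyRDD[K, V](self, partitioner)
>   }
> }
>
> After import MyRDDImplicits._, baseRDD.customable(part) resolves without
> touching RDD or PairRDDFunctions.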
>
> On Sun, Mar 27, 2016 at 9:44 AM, Tenghuan He <tenghua...@gmail.com> wrote:
>
>> Hi Ted,
>>
>> The code below is run in spark-shell:
>>
>> scala> val part = new org.apache.spark.HashPartitioner(10)
>> scala> val baseRDD = sc.parallelize(1 to 100000).map(x => (x, "hello")).partitionBy(part).cache()
>> scala> val myrdd = baseRDD.customable(part)  // here customable is a method added to the abstract RDD to create MyRDD
>> myrdd: org.apache.spark.rdd.RDD[(Int, String)] = MyRDD[3] at customable at <console>:28
>> scala> myrdd.customMethod(bulk)
>> error: value customMethod is not a member of org.apache.spark.rdd.RDD[(Int, String)]
>>
>> and the customable method in PairRDDFunctions.scala is
>>
>>   def customable(partitioner: Partitioner): RDD[(K, V)] = self.withScope {
>>     new MyRDD[K, V](self, partitioner)
>>   }
>>
>> Thanks:)
>>
>> On Mon, Mar 28, 2016 at 12:28 AM, Ted Yu <yuzhih...@gmail.com> wrote:
>>
>>> Can you show the full stack trace (or top 10 lines) and the snippet
>>> using your MyRDD?
>>>
>>> Thanks
>>>
>>> On Sun, Mar 27, 2016 at 9:22 AM, Tenghuan He <tenghua...@gmail.com>
>>> wrote:
>>>
>>>> Hi everyone,
>>>>
>>>>     I am creating a custom RDD which extends RDD and adds a custom
>>>> method; however, the custom method cannot be found.
>>>>     The custom RDD looks like the following:
>>>>
>>>> import scala.collection.mutable.ArrayBuffer
>>>>
>>>> import org.apache.spark.{Dependency, Partition, Partitioner, TaskContext}
>>>> import org.apache.spark.rdd.RDD
>>>>
>>>> class MyRDD[K, V](
>>>>     var base: RDD[(K, V)],
>>>>     part: Partitioner
>>>>   ) extends RDD[(K, V)](base.context, Nil) {
>>>>
>>>>   def customMethod(bulk: ArrayBuffer[(K, (V, Int))]): MyRDD[K, V] = {
>>>>   // ... custom code here
>>>>   }
>>>>
>>>>   override def compute(split: Partition, context: TaskContext): Iterator[(K, V)] = {
>>>>   // ... custom code here
>>>>   }
>>>>
>>>>   override protected def getPartitions: Array[Partition] = {
>>>>   // ... custom code here
>>>>   }
>>>>
>>>>   override protected def getDependencies: Seq[Dependency[_]] = {
>>>>   // ... custom code here
>>>>   }
>>>> }
>>>>
>>>> In spark-shell, it turns out that the overridden methods work well, but
>>>> when calling myrdd.customMethod(bulk), it fails with:
>>>> <console>:33: error: value customMethod is not a member of
>>>> org.apache.spark.rdd.RDD[(Int, String)]
>>>>
>>>> Can anyone tell me why the custom method cannot be found?
>>>> Or do I have to add customMethod to the abstract RDD and then
>>>> override it in the custom RDD?
>>>>
>>>> PS: spark-version: 1.5.1
>>>>
>>>> Thanks & Best regards
>>>>
>>>> Tenghuan
>>>>
>>>>
>>>>
>>>
>>
>
