Casting to Tuple2 is easy, but reduceByKey presumably outputs new Tuple2
instances, so I'd need to map those back to new instances of my class. I'm
not sure how much overhead creating those new instances would add.
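
For concreteness, here is a rough, untested sketch of the cast-then-map-back
approach. WordCount and sumCounts are made-up names standing in for my real
code. It assumes Spark 1.x on Scala 2.10 or 2.11, where Tuple2 can still be
subclassed (as far as I know it is final from Scala 2.12 on).

```scala
import org.apache.spark.SparkContext._ // pair-RDD implicits (needed on Spark 1.x before 1.3)
import org.apache.spark.rdd.RDD

// Hypothetical Tuple2 subclass standing in for my real class.
class WordCount(word: String, count: Int) extends Tuple2[String, Int](word, count)

def sumCounts(counts: RDD[WordCount]): RDD[WordCount] = {
  // RDD is invariant in T, so RDD[WordCount] is not an RDD[(String, Int)].
  // The cast is unchecked, but every element really is a Tuple2 at runtime.
  val pairs = counts.asInstanceOf[RDD[(String, Int)]]

  // reduceByKey yields plain Tuple2 values, so rebuild my class from each one.
  pairs.reduceByKey(_ + _).map { case (word, count) => new WordCount(word, count) }
}
```

The extra cost here is one WordCount allocation per output key, on top of
whatever reduceByKey itself allocates.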

Doing that everywhere would make the code really messy, though. That is why
I was thinking of creating a wrapper that looks like PairRDDFunctions: it
would cast to a pair RDD, delegate to PairRDDFunctions, and then convert the
result back to my class.
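
Roughly what I have in mind for the wrapper, again as an untested sketch
under the same assumptions (Spark 1.x, Scala 2.10 or 2.11). The names
SubclassPairFunctions, rebuild, WordCount and sumCounts are all invented for
illustration, and only reduceByKey is shown; the other PairRDDFunctions
methods would follow the same pattern.

```scala
import scala.reflect.ClassTag
import org.apache.spark.SparkContext._ // pair-RDD implicits (needed on Spark 1.x before 1.3)
import org.apache.spark.rdd.RDD

// Hypothetical wrapper: casts to a pair RDD, delegates to PairRDDFunctions,
// then converts each result back to the subclass T via `rebuild`.
class SubclassPairFunctions[K: ClassTag, V: ClassTag, T <: (K, V) : ClassTag](
    self: RDD[T], rebuild: (K, V) => T) {

  // Unchecked cast; safe at runtime because every T is a Tuple2[K, V].
  private def pairs: RDD[(K, V)] = self.asInstanceOf[RDD[(K, V)]]

  def reduceByKey(f: (V, V) => V): RDD[T] = {
    val mk = rebuild // local copy so the closure does not capture the wrapper
    pairs.reduceByKey(f).map { case (k, v) => mk(k, v) }
  }
}

// Usage with a hypothetical Tuple2 subclass.
class WordCount(word: String, count: Int) extends Tuple2[String, Int](word, count)

def sumCounts(counts: RDD[WordCount]): RDD[WordCount] =
  new SubclassPairFunctions[String, Int, WordCount](counts, new WordCount(_, _))
    .reduceByKey(_ + _)
```

This keeps the casts in one place, but it still allocates a new instance of
my class for every output record.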

I was kinda hoping a Scala wizard would come along with some black magic
though.

On Wed, Nov 19, 2014 at 7:45 PM, Michael Armbrust <mich...@databricks.com>
wrote:

> I think you should also be able to get away with casting it back and forth
> in this case using .asInstanceOf.
>
> On Wed, Nov 19, 2014 at 4:39 PM, Daniel Siegmann <daniel.siegm...@velos.io>
> wrote:
>
>> I have a class which is a subclass of Tuple2, and I want to use it with
>> PairRDDFunctions. However, I seem to be limited by the invariance of T
>> in RDD[T] (see SPARK-1296
>> <https://issues.apache.org/jira/browse/SPARK-1296>).
>>
>> My Scala-fu is weak: the only way I could think to make this work would
>> be to define my own equivalent of PairRDDFunctions which works with my
>> class, does type conversions to Tuple2, and delegates to
>> PairRDDFunctions.
>>
>> Does anyone know a better way? Anyone know if there will be a significant
>> performance penalty with that approach?
>>
>> --
>> Daniel Siegmann, Software Developer
>> Velos
>> Accelerating Machine Learning
>>
>> 54 W 40th St, New York, NY 10018
>> E: daniel.siegm...@velos.io W: www.velos.io
>>
>
>


-- 
Daniel Siegmann, Software Developer
Velos
Accelerating Machine Learning

54 W 40th St, New York, NY 10018
E: daniel.siegm...@velos.io W: www.velos.io
