That worked, thank you both! Thanks also Aaron for the list of things I need to read up on - I hadn't heard of ClassTag before.
On Tue, Apr 1, 2014 at 5:10 PM, Aaron Davidson <[email protected]> wrote: > Koert's answer is very likely correct. This implicit definition which > converts an RDD[(K, V)] to provide PairRDDFunctions requires a ClassTag is > available for K: > https://github.com/apache/spark/blob/master/core/src/main/scala/org/apache/spark/SparkContext.scala#L1124 > > To fully understand what's going on from a Scala beginner's point of view, > you'll have to look up ClassTags, context bounds (the "K : ClassTag" > syntax), and implicit functions. Fortunately, you don't have to understand > monads... > > > On Tue, Apr 1, 2014 at 2:06 PM, Koert Kuipers <[email protected]> wrote: > >> import org.apache.spark.SparkContext._ >> import org.apache.spark.rdd.RDD >> import scala.reflect.ClassTag >> >> def joinTest[K: ClassTag](rddA: RDD[(K, Int)], rddB: RDD[(K, Int)]) : >> RDD[(K, Int)] = { >> >> rddA.join(rddB).map { case (k, (a, b)) => (k, a+b) } >> } >> >> >> On Tue, Apr 1, 2014 at 4:55 PM, Daniel Siegmann <[email protected] >> > wrote: >> >>> When my tuple type includes a generic type parameter, the pair RDD >>> functions aren't available. Take for example the following (a join on two >>> RDDs, taking the sum of the values): >>> >>> def joinTest(rddA: RDD[(String, Int)], rddB: RDD[(String, Int)]) : >>> RDD[(String, Int)] = { >>> rddA.join(rddB).map { case (k, (a, b)) => (k, a+b) } >>> } >>> >>> That works fine, but lets say I replace the type of the key with a >>> generic type: >>> >>> def joinTest[K](rddA: RDD[(K, Int)], rddB: RDD[(K, Int)]) : RDD[(K, >>> Int)] = { >>> rddA.join(rddB).map { case (k, (a, b)) => (k, a+b) } >>> } >>> >>> This latter function gets the compiler error "value join is not a member >>> of org.apache.spark.rdd.RDD[(K, Int)]". >>> >>> The reason is probably obvious, but I don't have much Scala experience. >>> Can anyone explain what I'm doing wrong? >>> >>> -- >>> Daniel Siegmann, Software Developer >>> Velos >>> Accelerating Machine Learning >>> >>> 440 NINTH AVENUE, 11TH FLOOR, NEW YORK, NY 10001 >>> E: [email protected] W: www.velos.io >>> >> >> > -- Daniel Siegmann, Software Developer Velos Accelerating Machine Learning 440 NINTH AVENUE, 11TH FLOOR, NEW YORK, NY 10001 E: [email protected] W: www.velos.io
