Your key needs to implement hashCode in addition to equals.
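For example, a minimal sketch of the quoted C class with hashCode added (plain-Scala groupBy is used below as a stand-in for Spark's hash-based groupByKey, since it relies on the same equals/hashCode contract):

```scala
// Sketch of the quoted C class, now overriding hashCode so that
// equal instances also hash alike (required for hash-based grouping).
class C(val s: String) extends Serializable {
  override def equals(o: Any): Boolean = o match {
    case that: C => that.s == s
    case _       => false
  }
  // Delegate to the wrapped String: equal instances share a hash code.
  override def hashCode: Int = s.hashCode
  override def toString = s
}

object HashCheck {
  def main(args: Array[String]): Unit = {
    val pairs = Seq((new C("a"), 11), (new C("a"), 12))
    // Plain-Scala stand-in for groupByKey; groupBy also buckets by
    // hashCode and then checks equals, so it shows the same fix.
    val grouped = pairs.groupBy(_._1).map { case (k, vs) => (k, vs.map(_._2)) }
    println(grouped.mkString(";")) // both values land under a single "a" key
  }
}
```

Without the hashCode override, the two C("a") instances fall back to identity hash codes, land in different buckets, and are never compared with equals, which is why r1 produced two separate groups.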

Matei

On May 13, 2014, at 3:30 PM, Michael Malak <michaelma...@yahoo.com> wrote:

> Is it permissible to use a custom class (as opposed to e.g. the built-in 
> String or Int) for the key in groupByKey? It doesn't seem to be working for 
> me on Spark 0.9.0/Scala 2.10.3:
> 
> import org.apache.spark.SparkContext
> import org.apache.spark.SparkContext._
> 
> class C(val s:String) extends Serializable {
>   override def equals(o: Any) = o.isInstanceOf[C] && o.asInstanceOf[C].s == s
>   override def toString = s
> }
> 
> object SimpleApp {
>   def main(args: Array[String]) {
>     val sc = new SparkContext("local", "Simple App", null, null)
>     val r1 = sc.parallelize(Array((new C("a"),11),(new C("a"),12)))
>     println("r1=" + r1.groupByKey.collect.mkString(";"))
>     val r2 = sc.parallelize(Array(("a",11),("a",12)))
>     println("r2=" + r2.groupByKey.collect.mkString(";"))
>   }
> }
> 
> 
> Output
> ======
> r1=(a,ArrayBuffer(11));(a,ArrayBuffer(12))
> r2=(a,ArrayBuffer(11, 12))
