You asked off-list, and provided a more detailed example there:

  val random = new Random()
  val testdata = (1 to 10000).map(_ => (random.nextInt(), random.nextInt()))
  sc.parallelize(testdata).combineByKey[ArrayBuffer[Int]](
    (instant: Int) => { new ArrayBuffer[Int]() },
    (bucket: ArrayBuffer[Int], instant: Int) => { bucket += instant },
    (bucket1: ArrayBuffer[Int], bucket2: ArrayBuffer[Int]) => { bucket1 ++= bucket2 }
  ).collect()
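(Aside: as noted below, the createCombiner above starts from an empty
buffer and so drops the first value for each key. A minimal corrected
sketch, assuming the same sc and the Random / ArrayBuffer imports, just
seeds the combiner with the first value:)

  import scala.util.Random
  import scala.collection.mutable.ArrayBuffer

  val random = new Random()
  val testdata = (1 to 10000).map(_ => (random.nextInt(), random.nextInt()))
  sc.parallelize(testdata).combineByKey[ArrayBuffer[Int]](
    (instant: Int) => ArrayBuffer(instant),  // seed with the first value instead of dropping it
    (bucket: ArrayBuffer[Int], instant: Int) => bucket += instant,
    (bucket1: ArrayBuffer[Int], bucket2: ArrayBuffer[Int]) => bucket1 ++= bucket2
  ).collect()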
I can't reproduce this with Spark 0.9.0 / CDH5 or Spark 1.0.0 RC9. Your
definition looks fine too. (Except that you are dropping the first value,
but that's a different problem.)

On Tue, May 20, 2014 at 2:05 AM, xiemeilong <xiemeilong...@gmail.com> wrote:
> I am using CDH5 on a three-machine cluster. I map data from HBase as
> (String, V) pairs, then call combineByKey like this:
>
>   .combineByKey[C](
>     (v: V) => new C(v),   // this line throws java.lang.ClassCastException:
>                           // C cannot be cast to V
>     (c: C, v: V) => c,
>     (c1: C, c2: C) => c1)
>
> I am very confused by this; there is no C-to-V cast here at all. What's
> wrong?
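For reference, here is a self-contained sketch of that same shape with a
concrete combiner class, which runs cleanly on the versions above. The
class name Acc and the toy data are invented for illustration; they are
not from the original code.

  import org.apache.spark.SparkContext._  // pair-RDD implicits (already in scope in spark-shell)

  // Acc stands in for the combiner type C: it carries a running sum and a count.
  case class Acc(sum: Int, count: Int)

  val pairs = sc.parallelize(Seq(("a", 1), ("a", 2), ("b", 3)))
  pairs.combineByKey[Acc](
    (v: Int) => Acc(v, 1),                            // createCombiner: V => C
    (c: Acc, v: Int) => Acc(c.sum + v, c.count + 1),  // mergeValue: (C, V) => C
    (c1: Acc, c2: Acc) => Acc(c1.sum + c2.sum, c1.count + c2.count)  // mergeCombiners: (C, C) => C
  ).collect()  // e.g. Array(("a", Acc(3, 2)), ("b", Acc(3, 1)))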