You asked off-list, and provided a more detailed example there:

    val random = new Random()
    val testdata = (1 to 10000).map(_=>(random.nextInt(),random.nextInt()))
    sc.parallelize(testdata).combineByKey[ArrayBuffer[Int]](
      (instant:Int)=>{new ArrayBuffer[Int]()},
      (bucket:ArrayBuffer[Int],instant:Int)=>{bucket+=instant},
      (bucket1:ArrayBuffer[Int],bucket2:ArrayBuffer[Int])=>{bucket1++=bucket2}
    ).collect()

https://www.quora.com/Why-is-my-combinebykey-throw-classcastexception

I can't reproduce this with Spark 0.9.0 / CDH5 or Spark 1.0.0 RC9.
Your definition looks fine too, except that your createCombiner
discards the first value for each key — but that's a different problem.
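For reference, a minimal sketch of the fix for that side issue: createCombiner should seed the buffer with the incoming value rather than return an empty one. (Plain Scala, no SparkContext needed to sanity-check the three functions; names are my own.)

```scala
import scala.collection.mutable.ArrayBuffer

object CombinerSketch {
  // createCombiner must keep the first value, not discard it:
  // was (instant: Int) => new ArrayBuffer[Int]()
  val createCombiner = (v: Int) => ArrayBuffer(v)
  val mergeValue     = (buf: ArrayBuffer[Int], v: Int) => buf += v
  val mergeCombiners = (b1: ArrayBuffer[Int], b2: ArrayBuffer[Int]) => b1 ++= b2

  def main(args: Array[String]): Unit = {
    // Simulate what combineByKey does for one key's values:
    val buf = mergeValue(createCombiner(1), 2)
    println(buf)  // ArrayBuffer(1, 2) — both values retained
  }
}
```

With the original empty-buffer version, the same sequence would yield ArrayBuffer(2), silently losing the first value per key per partition.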

On Tue, May 20, 2014 at 2:05 AM, xiemeilong <xiemeilong...@gmail.com> wrote:
> I am using CDH5 on a three machines cluster. map data from hbase as (string,
> V) pair , then call combineByKey like this:
>
> .combineByKey[C](
>       (v:V)=>new C(v),   //this line throw java.lang.ClassCastException: C
> cannot be cast to V
>       (v:C,v:V)=>C,
>       (c1:C,c2:C)=>C)
>
>
> I am very confused of this, there isn't C to V casting at all.  What's
> wrong?
>
>
>
> --
> View this message in context: 
> http://apache-spark-user-list.1001560.n3.nabble.com/combinebykey-throw-classcastexception-tp6059.html
> Sent from the Apache Spark User List mailing list archive at Nabble.com.