Kryo doesn't support Guava's collections by default.
I remember coming across a project on GitHub that fixes this (not sure though).
I've ended up avoiding Guava collections entirely wherever Spark RDDs are
concerned. Two possible workarounds are sketched below.
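
If you do want to keep using ImmutableList with Kryo, something along these
lines should work. This is an untested sketch: it assumes the kryo-serializers
library (de.javakaffee:kryo-serializers, which ships Guava serializers) is on
the classpath, and the class name GuavaKryoRegistrator is my own:

  import com.esotericsoftware.kryo.Kryo
  import org.apache.spark.serializer.KryoRegistrator
  import de.javakaffee.kryoserializers.guava.ImmutableListSerializer

  // Registers a serializer that knows how to rebuild ImmutableList,
  // instead of Kryo's generic CollectionSerializer, which calls add()
  // on the immutable instance (the failure in the stack trace below).
  class GuavaKryoRegistrator extends KryoRegistrator {
    override def registerClasses(kryo: Kryo): Unit = {
      ImmutableListSerializer.registerSerializers(kryo)
    }
  }

and then point Spark at it, e.g.:

  --conf spark.serializer=org.apache.spark.serializer.KryoSerializer
  --conf spark.kryo.registrator=GuavaKryoRegistrator

Alternatively, with no extra dependency, you can ship plain Scala collections
and rebuild the ImmutableLists on the executor side (again just a sketch,
using the arr from the mail below):

  import scala.collection.JavaConverters._
  import com.google.common.collect.ImmutableList

  // Serialize ordinary Scala Lists, which Kryo handles fine...
  val rdd = sc.parallelize(arr.map(_.asScala.toList))
    // ...and reconstruct the ImmutableLists on the workers.
    .map(xs => ImmutableList.copyOf(xs.asJava))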

On 5 October 2015 at 21:04, Jakub Dubovsky <spark.dubovsky.ja...@seznam.cz>
wrote:

> Hi all,
>
>   I would like to have an advice on how to use ImmutableList with RDD. Small
> presentation of an essence of my problem in spark-shell with guava jar
> added:
>
> scala> import com.google.common.collect.ImmutableList
> import com.google.common.collect.ImmutableList
>
> scala> val arr = Array(ImmutableList.of(1,2), ImmutableList.of(2,4),
> ImmutableList.of(3,6))
> arr: Array[com.google.common.collect.ImmutableList[Int]] = Array([1, 2],
> [2, 4], [3, 6])
>
> scala> val rdd = sc.parallelize(arr)
> rdd:
> org.apache.spark.rdd.RDD[com.google.common.collect.ImmutableList[Int]] =
> ParallelCollectionRDD[0] at parallelize at <console>:24
>
> scala> rdd.count
>
>  This results in a Kryo exception saying that it cannot add a new element
> to the list instance during deserialization:
>
> java.io.IOException: java.lang.UnsupportedOperationException
>         at org.apache.spark.util.Utils$.tryOrIOException(Utils.scala:1163)
>         at
> org.apache.spark.rdd.ParallelCollectionPartition.readObject(ParallelCollectionRDD.scala:70)
>         ...
>         at java.lang.Thread.run(Thread.java:745)
> Caused by: java.lang.UnsupportedOperationException
>         at
> com.google.common.collect.ImmutableCollection.add(ImmutableCollection.java:91)
>         at
> com.esotericsoftware.kryo.serializers.CollectionSerializer.read(CollectionSerializer.java:109)
>         at
> com.esotericsoftware.kryo.serializers.CollectionSerializer.read(CollectionSerializer.java:18)
>         ...
>
>   It somehow makes sense. But I cannot think of a workaround, and I do not
> believe that using ImmutableList with RDDs is impossible. How is this
> solved?
>
>   Thank you in advance!
>
>    Jakub Dubovsky
>
>
