Thanks guys.

This is not a big issue in general. It is more of an annoyance, and it
can be rather confusing when encountered for the first time.
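
For anyone who runs into this, the simplest client-side workaround is to
avoid relying on the default value crossing the serialization boundary
at all, for example (untested sketches, using the same sc and aMap as
below):

    // Look keys up with an explicit fallback instead of apply:
    sc.parallelize(Seq(aMap)).map(_.getOrElse("a", 0L)).first

    // Or re-apply the default to the deserialized map on the executor:
    sc.parallelize(Seq(aMap)).map(_.withDefaultValue(0L)).map(_("a")).first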


On 09/29/2016 02:05 AM, Jakob Odersky wrote:
> I agree with Sean's answer, you can check out the relevant serializer
> here 
> https://github.com/twitter/chill/blob/develop/chill-scala/src/main/scala/com/twitter/chill/Traversable.scala
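>
> In essence, that serializer rebuilds the collection through its
> builder, so the withDefaultValue wrapper is dropped along the way.
> Roughly (a simplified sketch, not the actual chill code; "entries"
> stands in for the deserialized key/value pairs):
>
>     val entries = Seq("a" -> 1L)
>     val builder = scala.collection.immutable.Map.newBuilder[String, Long]
>     entries.foreach(builder += _)
>     val rebuilt = builder.result() // a plain Map; any default is gone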
>
> On Wed, Sep 28, 2016 at 3:11 AM, Sean Owen <so...@cloudera.com> wrote:
>> My guess is that Kryo handles Maps generically, or relies on some
>> mechanism that does, and as part of that it iterates over all
>> key/value pairs; of course there aren't actually any key/value pairs
>> in this map, so the default value is never captured. Java
>> serialization is a much more literal (and expensive) field-by-field
>> serialization, which works here because there is no special
>> treatment. I think you could register a custom serializer that
>> handles this case, or work around it in your client code. I know
>> there have been other issues with Kryo and Map because, for example,
>> sometimes a Map in an application is actually some non-serializable
>> wrapper view.
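>>
>> For instance, a custom registrator along these lines might do it (an
>> untested sketch; it simply falls back to Java serialization for the
>> default-value wrapper class, and the registrator class name below is
>> illustrative):
>>
>>     import com.esotericsoftware.kryo.Kryo
>>     import com.esotericsoftware.kryo.serializers.JavaSerializer
>>     import org.apache.spark.serializer.KryoRegistrator
>>
>>     class MapWithDefaultRegistrator extends KryoRegistrator {
>>       override def registerClasses(kryo: Kryo): Unit = {
>>         // Map.WithDefault is the wrapper created by withDefaultValue;
>>         // route it through Kryo's Java-serialization fallback so the
>>         // default survives instead of being rebuilt as a plain Map.
>>         kryo.register(
>>           classOf[scala.collection.immutable.Map.WithDefault[_, _]],
>>           new JavaSerializer())
>>       }
>>     }
>>
>> and then point spark.kryo.registrator at that class in the SparkConf.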
>>
>> On Wed, Sep 28, 2016 at 3:18 AM, Maciej Szymkiewicz
>> <mszymkiew...@gmail.com> wrote:
>>> Hi everyone,
>>>
>>> I suspect there is no point in submitting a JIRA to fix this (it's
>>> not a Spark issue?), but I would like to know if this problem is
>>> documented anywhere. Somehow Kryo is losing the default value during
>>> serialization:
>>>
>>> scala> import org.apache.spark.{SparkContext, SparkConf}
>>> import org.apache.spark.{SparkContext, SparkConf}
>>>
>>> scala> val aMap = Map[String, Long]().withDefaultValue(0L)
>>> aMap: scala.collection.immutable.Map[String,Long] = Map()
>>>
>>> scala> aMap("a")
>>> res6: Long = 0
>>>
>>> scala> val sc = new SparkContext(new
>>> SparkConf().setAppName("bar").set("spark.serializer",
>>> "org.apache.spark.serializer.KryoSerializer"))
>>>
>>> scala> sc.parallelize(Seq(aMap)).map(_("a")).first
>>> 16/09/28 09:13:47 ERROR Executor: Exception in task 2.0 in stage 2.0 (TID 7)
>>> java.util.NoSuchElementException: key not found: a
>>>
>>> while the Java serializer works just fine:
>>>
>>> scala> val sc = new SparkContext(new
>>> SparkConf().setAppName("bar").set("spark.serializer",
>>> "org.apache.spark.serializer.JavaSerializer"))
>>>
>>> scala> sc.parallelize(Seq(aMap)).map(_("a")).first
>>> res9: Long = 0
>>>
>>> --
>>> Best regards,
>>> Maciej

-- 
Best regards,
Maciej


