I tried wrapping my Tuple class (which is generated by Avro) in a
class that implements Serializable, but now I'm getting a
ClassNotFoundException in my Spark application. The exception is
thrown while trying to deserialize checkpoint state:

https://gist.github.com/joey/7b374a2d483e25f15e20c0c4cb81b5cf
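
For context, the wrapper follows the usual pattern of marking the Tuple
field transient and serializing it with Avro's binary encoding in
writeObject()/readObject(). A simplified sketch (not my exact code):

import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;

import org.apache.avro.io.BinaryDecoder;
import org.apache.avro.io.BinaryEncoder;
import org.apache.avro.io.DecoderFactory;
import org.apache.avro.io.EncoderFactory;
import org.apache.avro.specific.SpecificDatumReader;
import org.apache.avro.specific.SpecificDatumWriter;

import com.rocana.data.Tuple;

// Serializable wrapper around the Avro-generated Tuple. The Tuple itself is
// written and read with Avro's binary encoding instead of Java serialization.
public class SerializableTuple implements Serializable {

  private transient Tuple tuple;

  public SerializableTuple(Tuple tuple) {
    this.tuple = tuple;
  }

  public Tuple get() {
    return tuple;
  }

  private void writeObject(ObjectOutputStream out) throws IOException {
    out.defaultWriteObject();
    SpecificDatumWriter<Tuple> writer = new SpecificDatumWriter<>(Tuple.class);
    // Direct encoder/decoder so Avro doesn't buffer past this object's bytes.
    BinaryEncoder encoder = EncoderFactory.get().directBinaryEncoder(out, null);
    writer.write(tuple, encoder);
    encoder.flush();
  }

  private void readObject(ObjectInputStream in)
      throws IOException, ClassNotFoundException {
    in.defaultReadObject();
    SpecificDatumReader<Tuple> reader = new SpecificDatumReader<>(Tuple.class);
    BinaryDecoder decoder = DecoderFactory.get().directBinaryDecoder(in, null);
    tuple = reader.read(null, decoder);
  }
}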

I set some flags[1] on the JVM and I can see the class get loaded in the logs.

Does anyone have suggestions for debugging class loading issues during
checkpoint deserialization?

I also looked into switching to byte[] for the state keys, but byte[]
doesn't implement value-based equals() or hashCode(). I can't use
ByteBuffer because it doesn't implement Serializable. Spark has a
SerializableBuffer class that wraps ByteBuffer, but it also doesn't
have value-based equals() or hashCode().
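
If I did go that route, I'd probably end up writing a small Serializable
key class that wraps byte[] and delegates to Arrays.equals() and
Arrays.hashCode(). A rough sketch (BytesKey is just a hypothetical name):

import java.io.Serializable;
import java.util.Arrays;

// Serializable state key backed by byte[], with value-based equality; a raw
// byte[] only gets Object's identity-based equals() and hashCode().
public class BytesKey implements Serializable {

  private static final long serialVersionUID = 1L;

  private final byte[] bytes;

  public BytesKey(byte[] bytes) {
    this.bytes = bytes.clone();
  }

  public byte[] get() {
    return bytes.clone();
  }

  @Override
  public boolean equals(Object other) {
    return other instanceof BytesKey
        && Arrays.equals(bytes, ((BytesKey) other).bytes);
  }

  @Override
  public int hashCode() {
    return Arrays.hashCode(bytes);
  }
}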

-Joey

[1] -verbose:class -Dsun.misc.URLClassPath.debug

On Mon, Oct 10, 2016 at 11:28 AM, Joey Echeverria <j...@rocana.com> wrote:
> I do, I get the stack trace in this gist:
>
> https://gist.github.com/joey/d3bf040af31e854b3be374e2c016d7e1
>
> The class it references, com.rocana.data.Tuple, is registered with
> Kryo. Also, this is with Spark 1.6.0, so if this behavior changed or
> was fixed in a later release, let me know.
>
> -Joey
>
> On Mon, Oct 10, 2016 at 9:54 AM, Shixiong(Ryan) Zhu
> <shixi...@databricks.com> wrote:
>> That's enough. Did you see any error?
>>
>> On Mon, Oct 10, 2016 at 5:08 AM, Joey Echeverria <j...@rocana.com> wrote:
>>>
>>> Hi Ryan!
>>>
>>> Do you know where I need to configure Kryo for this? I already have
>>> spark.serializer=org.apache.spark.serializer.KryoSerializer in my
>>> SparkConf and I registered the class. Is there a different
>>> configuration setting for the state map keys?
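>>>
>>> For reference, the relevant part of my setup looks roughly like this
>>> (simplified sketch; other settings omitted):
>>>
>>> import org.apache.spark.SparkConf;
>>>
>>> // Kryo as the Spark serializer, with the Avro-generated Tuple registered.
>>> SparkConf conf = new SparkConf()
>>>     .set("spark.serializer", "org.apache.spark.serializer.KryoSerializer");
>>> conf.registerKryoClasses(new Class<?>[] { com.rocana.data.Tuple.class });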
>>>
>>> Thanks!
>>>
>>> -Joey
>>>
>>> On Sun, Oct 9, 2016 at 10:58 PM, Shixiong(Ryan) Zhu
>>> <shixi...@databricks.com> wrote:
>>> > You can use Kryo. It also implements KryoSerializable, which is
>>> > supported by Kryo.
>>> >
>>> > On Fri, Oct 7, 2016 at 11:39 AM, Joey Echeverria <j...@rocana.com>
>>> > wrote:
>>> >>
>>> >> Looking at the source code for StateMap[1], which is used by
>>> >> JavaPairDStream#mapWithState(), it looks like state keys are
>>> >> serialized using an ObjectOutputStream. I couldn't find a reference to
>>> >> this restriction in the documentation. Did I miss that?
>>> >>
>>> >> Unless I'm mistaken, I'm guessing there isn't a way to use Kryo for
>>> >> this serialization?
>>> >>
>>> >> Thanks!
>>> >>
>>> >> -Joey
>>> >>
>>> >> [1] https://github.com/apache/spark/blob/master/streaming/src/main/scala/org/apache/spark/streaming/util/StateMap.scala#L251
>>> >>
>>> >
>>>
>>>
>>>
>>> --
>>> -Joey
>>
>>
>
>
>
> --
> -Joey



-- 
-Joey

---------------------------------------------------------------------
To unsubscribe e-mail: user-unsubscr...@spark.apache.org
