I can work around the problem by mapping each EventWritable to the Event
it holds before I call collect. I think I'll do that for now.
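
In case it helps anyone else, here is a rough sketch of the idea. It does
not use the real classes: the Event and EventWritable below are simplified
stand-ins I made up for illustration, a plain List stands in for the RDD,
and a Java-serialization round trip stands in for shipping results back to
the driver. In the actual job the unwrapping would go in the function
passed to map, just before collect.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

class UnwrapBeforeCollect {

    // Hypothetical stand-in for the Event business object: an id plus its tuples.
    static class Event implements Serializable {
        private static final long serialVersionUID = 1L;
        final String id;
        final List<String> tuples;

        Event(String id, List<String> tuples) {
            this.id = id;
            this.tuples = new ArrayList<>(tuples);
        }
    }

    // Hypothetical stand-in for EventWritable: a holder around one Event.
    static class EventWritable {
        private final Event event;

        EventWritable(Event event) {
            this.event = event;
        }

        Event get() {
            return event;
        }
    }

    // The "map" step: pull the Event out of every wrapper.
    static List<Event> unwrapAll(List<EventWritable> records) {
        List<Event> events = new ArrayList<>();
        for (EventWritable record : records) {
            events.add(record.get());
        }
        return events;
    }

    // Stand-in for sending results to the driver: write the plain Events with
    // Java serialization and read them back.
    @SuppressWarnings("unchecked")
    static List<Event> roundTrip(List<Event> events) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
                out.writeObject(new ArrayList<>(events));
            }
            try (ObjectInputStream in = new ObjectInputStream(
                    new ByteArrayInputStream(bytes.toByteArray()))) {
                return (List<Event>) in.readObject();
            }
        } catch (IOException | ClassNotFoundException e) {
            throw new IllegalStateException("serialization round trip failed", e);
        }
    }

    public static void main(String[] args) {
        List<EventWritable> records = new ArrayList<>();
        records.add(new EventWritable(
                new Event("e1", Arrays.asList("type=login", "user=alice"))));

        // Unwrap first, then serialize only the plain Event objects.
        for (Event event : roundTrip(unwrapAll(records))) {
            System.out.println(event.id + " -> " + event.tuples);
        }
    }
}
```

The point is only that what gets serialized is the plain Event rather than
the wrapper. It does not explain why the tuples were lost in the first
place.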

On Tue, Feb 10, 2015 at 6:04 PM, Corey Nolet <[email protected]> wrote:

> I am using an input format to load data from Accumulo [1] into a Spark
> RDD. It looks like something is happening in the serialization of my output
> writable between the time it is emitted from the InputFormat and the time
> it reaches its destination on the driver.
>
> What's happening is that the resulting Event object [2] inside the
> EventWritable [3] appears to have lost its Tuples [4].
>
>
> [1]
> https://github.com/calrissian/accumulo-recipes/blob/master/store/event-store/src/main/java/org/calrissian/accumulorecipes/eventstore/hadoop/EventInputFormat.java
> [2]
> https://github.com/calrissian/mango/blob/master/mango-core/src/main/java/org/calrissian/mango/domain/event/Event.java
> [3]
> https://github.com/calrissian/accumulo-recipes/blob/master/commons/src/main/java/org/calrissian/accumulorecipes/commons/hadoop/EventWritable.java
> [4]
> https://github.com/calrissian/mango/blob/master/mango-core/src/main/java/org/calrissian/mango/domain/Tuple.java
>
> I'm at a loss. I've tested using the SerializableWritable and serializing
> an EventWritable to an ObjectOutputStream in a unit test and it serialized
> fine without loss of data. I also verified that the Event object itself
> serializes and deserializes fine with an ObjectOutputStream. I'm trying to
> follow breakpoints through the code to figure out where exactly this may be
> happening but the objects all seem to be bytes already when passed into the
> JavaSerializerInstance (if I'm properly following what's going on, that
> is).
>
> Any ideas on what this may be? I'm using Spark 1.2.0 and Scala 2.10 but
> the business objects I'm using are from Java 1.7.
>
>
>
