Your discovery seems to have helped me, too. I'm not sure yet exactly what
my problem was, but adding the chill-storm dependency and registering the
BlizzardKryoFactory seems to have made it work. Now to go back and see if
there's anything in my configuration that I added in desperation that's no
longer necessary.

Thanks!

Mark

On Tue, Mar 3, 2015 at 6:03 PM, Matthew Waymost <[email protected]>
wrote:

> I was able to solve my issue.
>
> Once I verified that all my simulation logic was valid, I started looking
> for reasons why the registrations in my decorator weren't being picked up.
> Knowing that this was a consistent behavior, as opposed to what I
> originally thought, helped greatly (thanks Bill).
>
> Ultimately, I found a component in chill that contains an implementation
> of KryoFactory. Substituting this for the default one storm provides solved
> my problem.
>
> In case someone else happens upon this with the same issue, I first had to
> add chill-storm as a dependency in sbt. Then I added the following to my
> topology's configuration:
>
> conf.put("com.twitter.chill.config.configuredinstantiator",
> "com.twitter.chill.ScalaKryoInstantiator")
> conf.setKryoFactory(classOf[com.twitter.chill.storm.BlizzardKryoFactory])
>
> The first line tells chill which KryoInstantiator to use (it has many).
> Also, with this in place, the KryoDecorator I have, which I still needed
> for my custom serialization, worked fine as well.
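>
> For anyone who wants a bit more context, here is a rough sketch of how
> these settings might sit alongside a decorator registration. MyKryoDecorator
> and TopologyConf are placeholder names rather than my actual code, and the
> sketch assumes storm 0.9.x and chill-storm are on the classpath:
>
> ```scala
> import backtype.storm.Config
> import backtype.storm.serialization.IKryoDecorator
> import com.esotericsoftware.kryo.Kryo
>
> // Placeholder decorator; custom serializers would be registered in here.
> class MyKryoDecorator extends IKryoDecorator {
>   override def decorate(kryo: Kryo): Unit = {
>     // e.g. kryo.register(classOf[SomeClass], new SomeSerializer)
>     // (SomeClass and SomeSerializer are hypothetical)
>   }
> }
>
> object TopologyConf {
>   def build(numWorkers: Int): Config = {
>     val conf = new Config()
>     // Tell chill which KryoInstantiator to use.
>     conf.put("com.twitter.chill.config.configuredinstantiator",
>       "com.twitter.chill.ScalaKryoInstantiator")
>     // Swap storm's default KryoFactory for the one from chill-storm.
>     conf.setKryoFactory(classOf[com.twitter.chill.storm.BlizzardKryoFactory])
>     // Keep the custom decorator registered as before.
>     conf.registerDecorator(classOf[MyKryoDecorator])
>     conf.setNumWorkers(numWorkers)
>     conf
>   }
> }
> ```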
>
> Matthew
>
> On Mon, Mar 2, 2015 at 12:22 PM, Brunner, Bill <[email protected]>
> wrote:
>
>> Yeah, I use the storm serializer out of the box… I had a chill
>> implementation a while back but didn’t notice much of an improvement in my
>> case. But my particular use case is not designed to be super fast, so I
>> can’t really answer that with regard to a high-performance system. I’ve only
>> ever run into serialization problems with scala maps and the filterKeys
>> method, which is documented as unserializable anyway (and simple enough to
>> work around).
>>
>>
>>
>> *From:* Matthew Waymost [mailto:[email protected]]
>> *Sent:* Monday, March 02, 2015 2:50 PM
>> *To:* [email protected]
>> *Subject:* Re: KryoDecorator not working when setNumWorkers > 1
>>
>>
>>
>> I didn't realize that locally storm would optimize to not serialize, but
>> that makes total sense and is extremely helpful to know.
>>
>>
>>
>> I've had issues in the past with kryo not properly serializing scala case
>> classes, and I've solved them by adding twitter/chill's scala registrations.
>> So I assumed I would need the same thing here, as I didn't see any
>> documentation indicating that they were already included.
>>
>>
>>
>> The custom serializer is for a class that uses MapProxy (which, admittedly,
>> I need to get away from using). Neither kryo nor chill has handled
>> MapProxy properly in the past, so that's what the custom serializer is for.
>>
>>
>>
>> I'll definitely take a much closer look at my serialization logic and see
>> if I can isolate the problem there.
>>
>>
>>
>> Out of curiosity, do you typically use java's built-in serialization
>> instead of kryo? I've read and heard that it's very slow and inefficient,
>> so I'd be interested in hearing your experience.
>>
>>
>>
>> On Mon, Mar 2, 2015 at 6:49 AM, Brunner, Bill <[email protected]>
>> wrote:
>>
>> The reason your code is working locally or with a single worker is
>> because there is no reason for serialization to happen when everything is
>> contained in the same JVM.  Once you add a worker, your parallelism hint
>> now has the opportunity to ship the tuples to another JVM, thus
>> serialization has to occur.  So the issue is not with an increasing number
>> of workers, it’s with your serialization.  I am using scala as well and
>> have yet to uncover an instance where I needed custom serialization… the
>> out of the box java serialization seems to work well.
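>>
>> One way to surface this kind of problem without adding a second worker is
>> to force a serialization round trip yourself. Below is a minimal sketch
>> using plain java serialization. RoundTrip and Reading are made-up names for
>> illustration, and the sketch does not exercise kryo or storm's own
>> serialization path:
>>
>> ```scala
>> import java.io.{ByteArrayInputStream, ByteArrayOutputStream}
>> import java.io.{ObjectInputStream, ObjectOutputStream}
>>
>> object RoundTrip {
>>   // Serialize a value to bytes and read it back, as would happen when a
>>   // tuple is shipped to another JVM. Throws if the value (or anything it
>>   // references) cannot be serialized or deserialized.
>>   def javaRoundTrip[T <: AnyRef](value: T): T = {
>>     val bytes = new ByteArrayOutputStream()
>>     val out = new ObjectOutputStream(bytes)
>>     try out.writeObject(value) finally out.close()
>>     val in = new ObjectInputStream(new ByteArrayInputStream(bytes.toByteArray))
>>     try in.readObject().asInstanceOf[T] finally in.close()
>>   }
>> }
>>
>> // Hypothetical payload; case classes are Serializable by default.
>> case class Reading(sensor: String, value: Double)
>>
>> object Demo {
>>   def main(args: Array[String]): Unit = {
>>     val original = Reading("a", 1.5)
>>     val copy = RoundTrip.javaRoundTrip(original)
>>     // The copy should be equal to, but not the same instance as, the original.
>>     assert(copy == original)
>>     assert(!(copy eq original))
>>   }
>> }
>> ```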
>>
>>
>>
>> *From:* Matthew Waymost [mailto:[email protected]]
>> *Sent:* Friday, February 27, 2015 4:14 PM
>> *To:* [email protected]
>> *Subject:* KryoDecorator not working when setNumWorkers > 1
>>
>>
>>
>> Hi everybody,
>>
>>
>>
>> I'm a new user to storm and have hit a roadblock in getting my topology
>> to run over multiple workers.
>>
>>
>>
>> Our codebase is in scala and we send scala classes to storm, so I'm using
>> a kryo decorator to call chill's scala registrar, which adds all the
>> serialization logic for scala classes to kryo. In addition, I have a custom
>> serializer that I'm adding in the same decorator.
>>
>>
>>
>> This has worked perfectly fine for me so far locally and on our cluster
>> until I tried turning up the number of workers on which the topology runs.
>> When I use conf.setNumWorkers to set the number of workers greater than 1,
>> the topology gives me InvalidClassExceptions when attempting to deserialize
>> our classes. Removing the setNumWorkers call such that the number of
>> workers stays at the default of 1 resolves the problem and everything runs
>> fine.
>>
>>
>>
>> I'm completely stumped as to why this is happening, and I'm not sure how
>> to diagnose the issue. I've tried the following:
>>
>>
>>
>> * Configure the decorator through storm.yaml instead of in source code on
>> all worker nodes and nimbus.
>>
>> * Kill the topology, shut down all worker nodes, nimbus, and zookeeper,
>> clear all temporary data, and bring it all back up.
>>
>> * Verify that everything is using the same version of storm
>>
>> * Search google and stare at code
>>
>>
>>
>> Looking at what's going on in the UI, it doesn't fail at the very first
>> chance either. It appears only to fail around the part of the topology
>> where I have a parallelismHint set, which is a few steps in. So I'm
>> guessing it's directly a result of trying to run it over multiple workers,
>> but I don't know what to do with that info.
>>
>>
>>
>> We're running openjdk 7, zk 3.4.6, and storm 0.9.3 on gce. We've got 1 zk
>> server, 1 nimbus server, and 3 worker servers. The call to the topology is
>> made over drpc, and drpc is hosted on the nimbus server. The topology is
>> implemented using trident.
>>
>>
>>
>> Thanks for any help you can provide.
>>
>>
>>
>> Matthew
>>    ------------------------------
>>
>> This message, and any attachments, is for the intended recipient(s) only,
>> may contain information that is privileged, confidential and/or proprietary
>> and subject to important terms and conditions available at
>> http://www.bankofamerica.com/emaildisclaimer. If you are not the
>> intended recipient, please delete this message.
>>
>>
>
>
