This looks like stateful processing. Have you looked at using Trident?

-Rajiv
> On Jan 20, 2015, at 12:26 PM, Kushan Maskey <[email protected]> wrote:
>
> Thanks Keith and Itai,
>
> We are using fieldsGrouping. Initially we were using shuffleGrouping; we saw
> this problem and then moved to fieldsGrouping, with better results, until now.
> I think the bolt's parallelism, which we have set to 4, is the culprit here.
> My understanding is that parallelism means threading; correct me if I am
> wrong.
>
> --
> Kushan Maskey
>
>> On Tue, Jan 20, 2015 at 1:03 PM, Itai Frenkel <[email protected]> wrote:
>> Hello,
>>
>> Are you familiar with fields grouping? The idea is that the same bolt
>> instance would always update the value of a specific key (similar to web
>> load balancer cookie stickiness).
>> https://storm.apache.org/documentation/Concepts.html
>> "Fields grouping: The stream is partitioned by the fields specified in the
>> grouping. For example, if the stream is grouped by the "user-id" field,
>> tuples with the same "user-id" will always go to the same task, but tuples
>> with different "user-id"'s may go to different tasks."
>>
>> Itai
>>
>> From: Kushan Maskey <[email protected]>
>> Sent: Tuesday, January 20, 2015 8:55 PM
>> To: [email protected]
>> Subject: URGENT!! Race condition
>>
>> We are having a major issue updating a Cassandra database: we see a race
>> condition in a bolt.
>>
>> Here is an example.
>>
>> I have a column family with two partitioning columns, say X and Y. There is
>> another column, Z, which is basically an aggregated number. We are supposed
>> to update Z based on X and Y. Storm is reading a huge volume of data from
>> Kafka. When the spout receives a message, the first bolt reads the database
>> for that combination of X and Y and gets the value of Z. Then it updates the
>> value of Z and stores it back into the database. Bolt parallelism is set to
>> 4, which means 4 instances of the bolt are trying to update the database.
>> So when the first bolt (B1) reads the value of Z to be, say, 100, the second
>> bolt (B2) reads it to be 100 at the same time. Once B1 completes execution
>> the value of Z is 150, but B2 still has 100, so the value of Z is out of
>> sync.
>>
>> How can we prevent a race condition like this? This is causing a major
>> nuisance to us.
>>
>> Any help is highly appreciated. Thanks.
>>
>> --
>> Kushan Maskey
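
To make the fields grouping idea from the thread concrete, here is a minimal
sketch in plain Java (no Storm or Cassandra dependency). It assumes the
read-modify-write on Z is safe as long as every update for a given (X, Y) pair
is handled by the same single-threaded task. All names here
(FieldsGroupingSketch, Update, process) are hypothetical, and the in-memory
map is only a stand-in for the Cassandra table.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.Objects;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

public class FieldsGroupingSketch {

    /** One incoming message: add delta to Z for the row keyed by (x, y). */
    static final class Update {
        final String x;
        final String y;
        final long delta;

        Update(String x, String y, long delta) {
            this.x = x;
            this.y = y;
            this.delta = delta;
        }
    }

    /**
     * Routes each update to one of `parallelism` single-threaded tasks,
     * chosen by hashing (x, y), and returns the final (x|y) -> Z table.
     */
    static Map<String, Long> process(List<Update> updates, int parallelism)
            throws InterruptedException {
        // Stand-in for the Cassandra table (an assumption of this sketch).
        Map<String, Long> table = new ConcurrentHashMap<>();

        // Each executor plays the role of one bolt task.
        List<ExecutorService> tasks = new ArrayList<>();
        for (int i = 0; i < parallelism; i++) {
            tasks.add(Executors.newSingleThreadExecutor());
        }

        for (Update u : updates) {
            String key = u.x + "|" + u.y;
            // Same (x, y) always maps to the same task index.
            int task = Math.floorMod(Objects.hash(u.x, u.y), parallelism);
            tasks.get(task).execute(() -> {
                long z = table.getOrDefault(key, 0L); // read Z
                table.put(key, z + u.delta);          // modify and write back
            });
        }

        // Wait for all queued updates to finish before reading results.
        for (ExecutorService t : tasks) {
            t.shutdown();
            t.awaitTermination(10, TimeUnit.SECONDS);
        }
        return table;
    }

    public static void main(String[] args) throws InterruptedException {
        List<Update> updates = new ArrayList<>();
        for (int i = 0; i < 1000; i++) {
            updates.add(new Update("x1", "y1", 1));
        }
        // Prints 1000: no update for ("x1", "y1") is lost.
        System.out.println(process(updates, 4).get("x1|y1"));
    }
}
```

In a real topology the routing step would correspond to grouping on both
partitioning fields, for example fieldsGrouping(componentId, new Fields("x",
"y")), where the field names are placeholders. If the grouping uses only one
of the two fields, or if anything outside this bolt also writes Z, updates for
the same row can still interleave.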
