Hi guys,


We're looking at Storm to solve a message-processing scenario that needs to
scale horizontally to handle a high projected volume. The use case goes like
this:



1.- Receive messages from an external source.

2.- Generate a set of messages from this external input, based on rules.

3.- Persist these message sets into a DB. There are no updates; the use case
is insert-only.



Currently we have implemented this as a PoC using a spout for step 1, a
bolt for step 2 and a bolt for step 3. There is no aggregation or
partitioning between steps; we just use shuffle grouping between bolts and
are looking to scale by throwing nodes at it. We need to guarantee
exactly-once processing, but bolt #3 writes to a database. How does one
guard against the scenario where the DB transaction succeeds but, for some
reason, the spout retries the message and we get duplicate entries?
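
To make the failure case concrete, here is a minimal sketch in plain Java
with no Storm dependency. Everything in it is a stand-in and not our real
code: the class and method names are hypothetical, the in-memory list
plays the role of our insert-only table, and the rule in step 2 is a
placeholder. It only shows that a replay after a successful commit doubles
the rows.

```java
import java.util.ArrayList;
import java.util.List;

class ReplayDuplicateDemo {

    /** Hypothetical in-memory stand-in for the insert-only table behind bolt #3. */
    static final class InsertOnlyTable {
        private final List<String> rows = new ArrayList<>();

        void insert(String row) {
            rows.add(row);
        }

        int size() {
            return rows.size();
        }
    }

    /** Step 2: expand one external message into a set of messages (placeholder rule). */
    static List<String> applyRules(String input) {
        List<String> generated = new ArrayList<>();
        generated.add(input + ":a");
        generated.add(input + ":b");
        return generated;
    }

    /** Step 3: persist the generated set; in this sketch the commit always succeeds. */
    static void persist(InsertOnlyTable table, List<String> messages) {
        for (String message : messages) {
            table.insert(message);
        }
    }

    /** Processes one message twice, as if the spout replayed it after a successful commit. */
    static int rowsAfterReplay() {
        InsertOnlyTable table = new InsertOnlyTable();
        String message = "msg-1";
        persist(table, applyRules(message)); // first attempt: commit succeeds, ack never arrives
        persist(table, applyRules(message)); // replay of the same message
        return table.size();
    }

    public static void main(String[] args) {
        // One input message should yield 2 rows; the replay leaves 4.
        System.out.println("rows written: " + rowsAfterReplay());
    }
}
```

With a single input message we would want two rows in the table, but the
replay leaves four, which is exactly the duplication we are trying to avoid.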



We don't see the need for batching, and we don't quite see how Trident
would help us in this case. If you could suggest alternatives (or point
out what we are missing about Trident), we would be very grateful.



Thanks in advance,

JG
