Hi,
I know Storm is designed to run forever. I also know about Trident's
technique of aggregation. But shouldn't Storm have a way to let bolts know
that a certain bunch of processing has been completed?
Consider this topology:
Spout------>Bolt-A------>Bolt-B
  |            |--->Bolt-B
  |            \--->Bolt-B
  |--->Bolt-A------>Bolt-B
  |            |--->Bolt-B
  |            \--->Bolt-B
  \--->Bolt-A------>Bolt-B
               |--->Bolt-B
               \--->Bolt-B
- The grouping from Bolt-A to Bolt-B is a FieldsGrouping.
- The Spout emits only a few tuples and then stops emitting.
- Bolt-A takes those tuples and generates millions of tuples from them.
*Bolt-B accumulates the tuples that Bolt-A sends and needs to know when the
Spout has finished emitting. Only then can Bolt-B start writing to SQL.*
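To make the behaviour I am after concrete, here is a minimal plain-Java sketch with no Storm dependency. It assumes the FieldsGrouping sends equal keys to the same Bolt-B task (modelled here as a hash of the key modulo the task count), and every name in it (FieldsGroupingSketch, BoltB, taskFor, onStreamFinished) is hypothetical, not a Storm API. The onStreamFinished call stands for the signal I do not know how to get in a real topology:

```java
import java.util.ArrayList;
import java.util.List;

public class FieldsGroupingSketch {
    // Hypothetical stand-in for one Bolt-B task: buffers tuples until told to flush.
    static class BoltB {
        final List<String> buffer = new ArrayList<>();
        final List<String> written = new ArrayList<>(); // stands in for the SQL table

        void execute(String key) {
            buffer.add(key);
        }

        // The "everything has been emitted" signal I am looking for (not a Storm API).
        void onStreamFinished() {
            written.addAll(buffer);
            buffer.clear();
        }
    }

    // Assumed routing rule: the same key always maps to the same task index.
    static int taskFor(String key, int numTasks) {
        return Math.floorMod(key.hashCode(), numTasks);
    }

    public static void main(String[] args) {
        int numTasks = 3;
        List<BoltB> tasks = new ArrayList<>();
        for (int i = 0; i < numTasks; i++) {
            tasks.add(new BoltB());
        }

        // Bolt-A turns a few spout tuples into many keyed tuples.
        for (int i = 0; i < 1000; i++) {
            String key = "key-" + (i % 10);
            tasks.get(taskFor(key, numTasks)).execute(key);
        }

        // Only once the stream is finished should each Bolt-B write out.
        for (BoltB b : tasks) {
            b.onStreamFinished();
        }

        int total = 0;
        for (BoltB b : tasks) {
            total += b.written.size();
        }
        System.out.println("rows written: " + total);
    }
}
```

Running main prints "rows written: 1000": nothing is written until onStreamFinished is called on every task, which is the ordering I need from the real Bolt-B instances.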
*Questions:*
1. How can all Bolt-B instances be notified that it is time to write to SQL?
2. After all Bolt-B instances have written to SQL, how can I know that they
have all completed writing?
3. How can I stop the topology? I know of
localCluster.killTopology("HelloStorm"), but shouldn't there be a way to do
it from within a bolt?
--
Regards,
Navin