You might want to check out Storm Signals.
https://github.com/ptgoetz/storm-signals

It might give you what you're looking for.

On Sat, May 7, 2016, 11:59 AM Matthias J. Sax <[email protected]> wrote:

> As you mentioned already: Storm is designed to run topologies forever ;)
> If you have finite data, why do you not use a batch processor???
>
> As a workaround, you can embed "control messages" in your stream (or use
> an additional stream for them).
>
> If you want a topology to shut down itself, you could use
> `NimbusClient.getConfiguredClient(conf).getClient().killTopology(name);`
> in your spout/bolt code.
>
> Something like:
>  - Spout emit all tuples
>  - Spout emit special "end of stream" control tuple
>  - Bolt1 processes everything
>  - Bolt1 forward "end of stream" control tuple (when it received it)
>  - Bolt2 processed everything
>  - Bolt2 receives "end of stream" control tuple => flush to DB => kill
> topology
>
> But I guess, this is kinda weird pattern.
>
> -Matthias
>
> On 05/05/2016 06:13 AM, Navin Ipe wrote:
> > Hi,
> >
> > I know Storm is designed to run forever. I also know about Trident's
> > technique of aggregation. But shouldn't Storm have a way to let bolts
> > know that a certain bunch of processing has been completed?
> >
> > Consider this topology:
> > Spout------>Bolt-A------>Bolt-B
> >             |                  |--->Bolt-B
> >             |                  \--->Bolt-B
> >             |--->Bolt-A------>Bolt-B
> >             |                  |--->Bolt-B
> >             |                  \--->Bolt-B
> >             \--->Bolt-A------>Bolt-B
> >                                |--->Bolt-B
> >                                \--->Bolt-B
> >
> >   * From Bolt-A to Bolt-B, it is a FieldsGrouping.
> >   * Spout emits only a few tuples and then stops emitting.
> >   * Bolt A takes those tuples and generates millions of tuples.
> >
> >
> > *Bolt-B accumulates tuples that Bolt A sends and needs to know when
> > Spout finished emitting. Only then can Bolt-B start writing to SQL.*
> >
> > *Questions:*
> > 1. How can all Bolts B be notified that it is time to write to SQL?
> > 2. After all Bolts B have written to SQL, how to know that all Bolts B
> > have completed writing?
> > 3. How to stop the topology? I know of
> > localCluster.killTopology("HelloStorm"), but shouldn't there be a way to
> > do it from the Bolt?
> >
> > --
> > Regards,
> > Navin
>
>

Reply via email to