Hi Jia,

Quoting xuji...@gmail.com:
Hi,


I am building a topology that first needs to read some persisted data (accounts, recovery point, etc.) before any of the bolts can start processing tuples.


Ideally the spout starts to emit tuples only after all the required data has been read into memory (maybe on spouts, maybe on bolts). What's the general approach to deal with such a use case?

I dunno about the "general approach", but I'd make the spout send an initial "special management tuple" (i.e. on a separate channel) to all bolts and wait until the ack comes back. Every bolt can initialize on that message (if it hasn't already done so in prepare()) and ack the tuple only once its init is done.
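
To make the idea concrete, here is a minimal sketch of that handshake in plain Java. It is not Storm code: SketchSpout, SketchBolt and their methods are hypothetical stand-ins that only model the control flow, and the acks are delivered synchronously for simplicity. In a real topology I'd assume the init tuple goes out on its own stream with an all-grouping so every bolt task sees it, and that the spout learns about completion through its ack callback.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.atomic.AtomicInteger;

class InitHandshakeSketch {

    /** Hypothetical stand-in for a bolt task: acks the init tuple only after loading its state. */
    static class SketchBolt {
        private boolean ready = false;
        final List<String> processed = new ArrayList<>();

        void onInitTuple(Runnable ack) {
            // Load accounts, recovery point, etc. here (omitted in this sketch).
            ready = true;
            ack.run();
        }

        void onDataTuple(String value) {
            if (!ready) {
                throw new IllegalStateException("data tuple arrived before init");
            }
            processed.add(value);
        }
    }

    /** Hypothetical stand-in for the spout: emits no data until every bolt has acked the init tuple. */
    static class SketchSpout {
        private final List<SketchBolt> bolts;
        private final AtomicInteger pendingAcks;
        private boolean initSent = false;
        private int next = 0;

        SketchSpout(List<SketchBolt> bolts) {
            this.bolts = bolts;
            this.pendingAcks = new AtomicInteger(bolts.size());
        }

        /** Non-blocking, like a spout's nextTuple(); returns true only if a data tuple was emitted. */
        boolean nextTuple() {
            if (!initSent) {
                initSent = true;
                for (SketchBolt bolt : bolts) {
                    bolt.onInitTuple(pendingAcks::decrementAndGet);
                }
                return false;
            }
            if (pendingAcks.get() > 0) {
                return false; // still waiting for init acks, emit nothing
            }
            String value = "tuple-" + next++;
            for (SketchBolt bolt : bolts) {
                bolt.onDataTuple(value);
            }
            return true;
        }
    }

    public static void main(String[] args) {
        List<SketchBolt> bolts = new ArrayList<>();
        bolts.add(new SketchBolt());
        bolts.add(new SketchBolt());
        SketchSpout spout = new SketchSpout(bolts);
        System.out.println("first call emitted data: " + spout.nextTuple());
        System.out.println("second call emitted data: " + spout.nextTuple());
    }
}
```

The point of the sketch is that the spout never blocks: it just returns without emitting while acks are outstanding, so nothing is lost and nothing has to be replayed.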

Alternatively, I can make each bolt ignore/fail tuples until it's ready to process, but that means either message loss or futile spout replays.

Doesn't sound production-level to me ;)

Regards,
Jens
