Hi Jia,
Quoting xuji...@gmail.com:
Hi,
I am building a topology that first needs to read some
persisted data (accounts, recovery point, etc.) before the bolts
can start processing tuples.
Ideally the spout starts to emit tuples only after all the required
data is read into memory (maybe on spouts, maybe on bolts). What’s
the general approach to deal with such a use case?
I dunno about the "general approach", but I'd make the spout send an
initial "special management tuple" (i.e. on a separate channel) to all
bolts and wait until the ACK comes back. Every bolt can initialize
on that message (if not already during prepare()) and only ack the tuple
once the init is done.
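
A minimal sketch of that handshake, written as a plain-Java model and not
against the real Storm API. All names here (InitHandshakeSketch, SketchBolt,
SketchSpout, onInit, onData) are hypothetical and only illustrate the gating
logic: the spout registers one pending ack per bolt and emits no data until
every ack has arrived.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Plain-Java model of the init handshake; none of these types are Storm classes.
final class InitHandshakeSketch {

    /** Hypothetical stand-in for a bolt that must load state before processing. */
    static final class SketchBolt {
        final String name;
        final List<String> processed = new ArrayList<>();
        private boolean ready = false;

        SketchBolt(String name) {
            this.name = name;
        }

        /** Handles the management tuple: load state, then report back (the "ack"). */
        void onInit(SketchSpout spout) {
            // Placeholder for reading accounts, recovery point, etc.
            ready = true;
            spout.ack(name);
        }

        /** Handles a data tuple; rejects it if the bolt was never initialized. */
        void onData(String tuple) {
            if (!ready) {
                throw new IllegalStateException(name + " received data before init");
            }
            processed.add(tuple);
        }
    }

    /** Hypothetical stand-in for a spout that holds data back until all bolts acked. */
    static final class SketchSpout {
        private final List<SketchBolt> bolts;
        private final Set<String> pendingAcks = new HashSet<>();
        private boolean initSent = false;

        SketchSpout(List<SketchBolt> bolts) {
            this.bolts = bolts;
        }

        /** Called when a bolt has finished initializing. */
        void ack(String boltName) {
            pendingAcks.remove(boltName);
        }

        /**
         * One polling step. The first call only registers the pending init acks;
         * later calls emit the data tuple only when no ack is outstanding.
         * Returns true if the data tuple was emitted.
         */
        boolean nextTuple(String data) {
            if (!initSent) {
                for (SketchBolt bolt : bolts) {
                    pendingAcks.add(bolt.name);
                }
                initSent = true;
                return false;
            }
            if (!pendingAcks.isEmpty()) {
                return false; // at least one bolt is still initializing
            }
            for (SketchBolt bolt : bolts) {
                bolt.onData(data); // delivered to every bolt to keep the model simple
            }
            return true;
        }
    }

    public static void main(String[] args) {
        SketchBolt accounts = new SketchBolt("accounts");
        SketchBolt recovery = new SketchBolt("recovery");
        SketchSpout spout = new SketchSpout(Arrays.asList(accounts, recovery));

        spout.nextTuple("t1");                     // init round only, no data yet
        accounts.onInit(spout);
        System.out.println(spout.nextTuple("t1")); // false: one ack still missing
        recovery.onInit(spout);
        System.out.println(spout.nextTuple("t1")); // true: all bolts are ready
    }
}
```

In a real topology the same roles would presumably map onto the spout's
nextTuple() and ack() callbacks and the bolts' prepare() and execute()
methods, with the management tuple on its own stream so that it reaches
every bolt task. That mapping is an assumption and is not shown here.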
Alternatively, I can make each bolt ignore/fail tuples until it’s
ready to process, but that means either message loss or futile
spout replays.
Doesn't sound production-level to me ;)
Regards,
Jens