We’re decompressing and deserializing several files, each hundreds of megabytes in size, containing data (mostly statistical classifier definitions) that the bolt needs in order to do its work. The bolt can’t process events until it has deserialized and indexed the data in those files, which can take up to several minutes. This can’t easily be farmed out to an external service, due to various processing and infrastructure limitations.
SimonC

From: Hart, James W. [mailto:[email protected]]
Sent: 23 August 2016 15:04
To: [email protected]
Subject: RE: Running a long task in bolt prepare() method

Can you elaborate on what kind of work is being done at startup? If you are building some kind of cacheable lookup data, I would build that elsewhere in a persistent cache, like Redis, and then fetch and access it through Redis.

From: Simon Cooper [mailto:[email protected]]
Sent: Tuesday, August 23, 2016 9:36 AM
To: [email protected]
Subject: RE: Running a long task in bolt prepare() method

We’ve got a similar issue, where prepare() takes a long time (could be up to several minutes), and the bolt can’t process tuples until it has completed. The topology seems to send in tuples before prepare() has completed, and things go wrong.

We’re having to implement our own mechanism for notification – an external way for the bolt to report to the spout that it is ready. This is also an issue on multi-worker topologies where one of the workers goes down and is recreated, and it’s several minutes before it can process tuples.

It would be good if there was a way for Storm to deal with this, so we don’t have to implement our own back-channel to the spout…

SimonC

From: Andrea Gazzarini [mailto:[email protected]]
Sent: 23 August 2016 13:08
To: [email protected]
Subject: Re: Running a long task in bolt prepare() method

Not sure if there's a "built-in" approach in Storm for doing that. After making sure there isn't, I'd do the following:

* I'd start the long task asynchronously in the prepare method and register a callback
* if the execute method logic depends on the completion of that task, I'd use a basic state pattern with two states, ON/OFF (where the OFF state is basically a NullObject).
The callback would be responsible for switching the bolt state from OFF (initial state) to ON (working state).

Best,
Andrea

On 23/08/16 09:12, Xiang Wang wrote:

Hi All,

I am trying to run a long initialisation task in the bolt prepare() method in local mode. I always get an error like this:

WARN o.a.s.s.o.a.z.s.p.FileTxnLog - fsync-ing the write ahead log in SyncThread:0 took 1197ms which will adversely effect operation latency. See the ZooKeeper troubleshooting guide

And then the task fails. Could anyone tell me how to fix this problem? Or is it good practice to run a long task in the prepare() method? If not, what is supposed to be the correct way to do it?

Many thanks for your kind help.

Best,
Xiang

-------------------------------
Xiang Wang, PhD Candidate
Database Research Group
School of Computer Science and Engineering
The University of New South Wales
SYDNEY, AUSTRALIA
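For reference, the ON/OFF approach Andrea describes in the thread might look roughly like the sketch below. It is deliberately free of Storm types so that it runs on its own: GatedBolt, slowLoader and isReady are hypothetical names, tuples are plain strings, and the loaded data is assumed to be a simple lookup map. In a real topology the two methods would be the bodies of the bolt's own prepare() and execute().

```java
import java.util.Map;
import java.util.Optional;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.atomic.AtomicReference;
import java.util.function.Function;
import java.util.function.Supplier;

// Hypothetical stand-in for a bolt; not a Storm class.
final class GatedBolt {
    // OFF state: a null object that accepts a tuple and produces nothing.
    private static final Function<String, Optional<String>> OFF =
            tuple -> Optional.empty();

    // Current state; starts OFF and is swapped to ON by the callback.
    private final AtomicReference<Function<String, Optional<String>>> state =
            new AtomicReference<>(OFF);

    // Mirrors prepare(): starts the slow load on another thread and
    // returns immediately instead of blocking for minutes.
    CompletableFuture<Void> prepare(Supplier<Map<String, String>> slowLoader) {
        return CompletableFuture.supplyAsync(slowLoader)
                // Callback: build the ON state from the loaded data and switch to it.
                .thenAccept(index ->
                        state.set(tuple -> Optional.ofNullable(index.get(tuple))));
    }

    // Mirrors execute(): delegates to whichever state is current.
    Optional<String> execute(String tuple) {
        return state.get().apply(tuple);
    }

    // True once the callback has switched the state to ON.
    boolean isReady() {
        return state.get() != OFF;
    }
}
```

Note that the OFF state here silently produces nothing. Whether tuples that arrive during loading should be dropped, failed so that they are replayed, or buffered is an application decision the sketch does not make, and it does not by itself stop the spout from sending tuples early.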
