Err...guys....I appreciate the ongoing discussion, but the original question remains unanswered. The one I've asked at the very beginning of this conversation. Some help would be appreciated. Referring to the code I posted and as per Nathan's answer, you say that int *BoltParallelism* actually represents the tasks which are the number of instances of Bolts/Spouts? And BoltTaskParallelism is the number of executors (OS threads)? If that's the case, then execute() will get called only after the previous execute() call of a Bolt has completed. And nextTuple() will get called only after the previous nextTuple() of a Spout has completed. That's a bit reassuring, since now one does not have to cater to multithreading within a Spout/Bolt.
On Mon, May 16, 2016 at 7:07 PM, Matthias J. Sax <[email protected]> wrote: > Hi, > > > So this is not correct: > > and > > the Bolt creates a task for processing each incoming Tuple. > > Storm create exactly *BoltTaskParallelism* tasks and assigns incoming > messages to tasks (according to the used connection pattern -- shuffle, > fieldsGrouping etc). > > Futhermore: > > > If there > > are not enough tasks, then the excess Tuples are made to wait in a > > queue of the executor. > > No. There is no thing as "not enough tasks". Each task has its own input > queue/buffer and tuple are stored there. > > The executor threads process one or multiple tasks. Thus, if a task is > currently "on hold", new tuples are just added to the task's input > queue. If an executor picks up on of its tasks for processing, the > buffered tuples of the task are processed. > > > -Matthias > > On 05/16/2016 09:07 AM, Adrien Carreira wrote: > > +1 > > > > 2016-05-16 6:40 GMT+02:00 Navin Ipe <[email protected] > > <mailto:[email protected]>>: > > > > Hi, > > > > I've seen the explanations > > < > http://www.michael-noll.com/blog/2012/10/16/understanding-the-parallelism-of-a-storm-topology/ > >, > > but none of them explain it in terms of what I see in the code. This > > is what I understood: > > > > int BoltParallelism = 3; > > int BoltTaskParallelism = 2; > > builder.setBolt("bolt1", new BoltA(), *BoltParallelism*) > > .setNumTasks(*BoltTaskParallelism*) > > > > BoltParallelism creates 3 instances of BoltA. These are the > executors. > > BoltTaskParallelism allows Tuples to come into BoltA very fast, and > > the Bolt creates a task for processing each incoming Tuple. If there > > are not enough tasks, then the excess Tuples are made to wait in a > > queue of the executor. > > > > Strange thing is that the explanation says the tasks are run in a > > single thread, so obviously I misunderstood something. Could you > > help me understand it? > > > > -- > > Regards, > > Navin > > > > > > -- Regards, Navin
