Err...guys....I appreciate the ongoing discussion, but the original
question remains unanswered. The one I've asked at the very beginning of
this conversation. Some help would be appreciated.
Referring to the code I posted and as per Nathan's answer, you say that int
*BoltParallelism* actually represents the tasks which are the number of
instances of Bolts/Spouts? And BoltTaskParallelism is the number of
executors (OS threads)?
If that's the case, then execute() will get called only after the previous
execute() call of a Bolt has completed. And nextTuple() will get called
only after the previous nextTuple() of a Spout has completed. That's a bit
reassuring, since now one does not have to cater to multithreading within a
Spout/Bolt.


On Mon, May 16, 2016 at 7:07 PM, Matthias J. Sax <[email protected]> wrote:

> Hi,
>
>
> So this is not correct:
> > and
> > the Bolt creates a task for processing each incoming Tuple.
>
> Storm create exactly *BoltTaskParallelism* tasks and assigns incoming
> messages to tasks (according to the used connection pattern -- shuffle,
> fieldsGrouping etc).
>
> Futhermore:
>
> > If there
> > are not enough tasks, then the excess Tuples are made to wait in a
> > queue of the executor.
>
> No. There is no thing as "not enough tasks". Each task has its own input
> queue/buffer and tuple are stored there.
>
> The executor threads process one or multiple tasks. Thus, if a task is
> currently "on hold", new tuples are just added to the task's input
> queue. If an executor picks up on of its tasks for processing, the
> buffered tuples of the task are processed.
>
>
> -Matthias
>
> On 05/16/2016 09:07 AM, Adrien Carreira wrote:
> > +1
> >
> > 2016-05-16 6:40 GMT+02:00 Navin Ipe <[email protected]
> > <mailto:[email protected]>>:
> >
> >     Hi,
> >
> >     I've seen the explanations
> >     <
> http://www.michael-noll.com/blog/2012/10/16/understanding-the-parallelism-of-a-storm-topology/
> >,
> >     but none of them explain it in terms of what I see in the code. This
> >     is what I understood:
> >
> >     int BoltParallelism = 3;
> >     int BoltTaskParallelism = 2;
> >     builder.setBolt("bolt1", new BoltA(), *BoltParallelism*)
> >                     .setNumTasks(*BoltTaskParallelism*)
> >
> >     BoltParallelism creates 3 instances of BoltA. These are the
> executors.
> >     BoltTaskParallelism allows Tuples to come into BoltA very fast, and
> >     the Bolt creates a task for processing each incoming Tuple. If there
> >     are not enough tasks, then the excess Tuples are made to wait in a
> >     queue of the executor.
> >
> >     Strange thing is that the explanation says the tasks are run in a
> >     single thread, so obviously I misunderstood something. Could you
> >     help me understand it?
> >
> >     --
> >     Regards,
> >     Navin
> >
> >
>
>


-- 
Regards,
Navin

Reply via email to