Hi,

According to the documentation:
*"**Each task is executed by one thread.** Chaining operators together into
tasks is a useful optimization: it reduces the overhead of thread-to-thread
handover and buffering, and increases overall throughput while decreasing
latency."*
So does this mean that the single box (see the mails below) represents a
*single task*, and that this task is executed by a single thread only?

I have an 8-node cluster (parallelism set to 56), so what is the correct
way to achieve maximum CPU utilization and parallelism? Does chaining the
complete stream into a single box achieve maximum parallelism?

We are processing a huge volume of data (60,000 records per second), so I
wanted to be sure what we should correct to achieve better results.
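
For a rough sense of scale, here is a minimal sketch in plain Java (no Flink
API involved) of what these numbers would imply. It assumes that records and
parallel subtasks are spread evenly across the cluster, which I have not
verified; the class and method names are hypothetical, made up for
illustration.

```java
// Back-of-the-envelope load estimate; plain Java, not Flink code.
class SubtaskLoad {

    // Records per second that each parallel subtask would see,
    // ASSUMING the input is distributed evenly across all subtasks.
    static double recordsPerSubtask(long recordsPerSecond, int parallelism) {
        if (parallelism <= 0) {
            throw new IllegalArgumentException("parallelism must be positive");
        }
        return (double) recordsPerSecond / parallelism;
    }

    // Parallel subtasks each node would host (rounded up),
    // ASSUMING subtasks are distributed evenly across the nodes.
    static int subtasksPerNode(int parallelism, int nodes) {
        if (nodes <= 0) {
            throw new IllegalArgumentException("nodes must be positive");
        }
        return (parallelism + nodes - 1) / nodes;
    }

    public static void main(String[] args) {
        long recordsPerSecond = 60_000; // volume mentioned above
        int parallelism = 56;           // parallelism set on our job
        int nodes = 8;                  // nodes in our cluster

        System.out.printf("records/s per subtask: %.1f%n",
                recordsPerSubtask(recordsPerSecond, parallelism));
        System.out.printf("subtasks per node: %d%n",
                subtasksPerNode(parallelism, nodes));
    }
}
```

Under these assumptions, each node would need enough cores for the
subtasks it hosts to keep the CPUs busy.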

Regards,
Vinay Patil


On Fri, Jul 1, 2016 at 9:23 PM, Aljoscha Krettek <aljos...@apache.org>
wrote:

> Hi,
> yes, the window operator is stateful, which means that it will pick up
> where it left off when it is restored after a failure.
>
> You're right about the graph, chained operators are shown as one box.
>
> Cheers,
> Aljoscha
>
> On Fri, 1 Jul 2016 at 04:52 Vinay Patil <vinay18.pa...@gmail.com> wrote:
>
> > Hi,
> >
> > I just watched the video on Robust Stream Processing.
> > When we say that Window is a stateful operator, does it mean that even if
> > the task manager doing the window operation fails, it will pick up from
> > the state it left earlier when it comes back up? (I have not read more on
> > state for now.)
> >
> >
> > Also, in one of our projects, when we deploy on the cluster and check the
> > Job Graph, everything is shown in one box. Why does this happen? Is it
> > because of chaining of streams?
> > So the box here represents the function flow, right?
> >
> >
> >
> > Regards,
> > Vinay Patil
> >
>
