Is there an easy way to tell if and where my data gets skewed in the
pipeline?
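
For what it's worth, one way I could check this myself (a rough sketch for a
DataSet job; the class name is made up) would be to count records per
parallel subtask, so a skewed partition shows up in the task manager logs:

    import org.apache.flink.api.common.functions.RichMapPartitionFunction;
    import org.apache.flink.util.Collector;

    // Passes records through unchanged, but logs how many each parallel
    // subtask saw -- a heavily loaded partition stands out immediately.
    public class PartitionSizeLogger<T> extends RichMapPartitionFunction<T, T> {
        @Override
        public void mapPartition(Iterable<T> values, Collector<T> out) {
            long count = 0;
            for (T value : values) {
                count++;
                out.collect(value);
            }
            System.out.println("Subtask "
                + getRuntimeContext().getIndexOfThisSubtask()
                + " processed " + count + " records");
        }
    }

It would be chained right after the suspicious partitioning step, e.g.
dataSet.mapPartition(new PartitionSizeLogger<MyType>()).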

On Fri, Feb 5, 2016 at 4:09 PM, Stephan Ewen <se...@apache.org> wrote:

> Yes, that is definitely one possible explanation.
>
> Another one could be data skew: increased parallelism does not take work
> off the most overloaded partition (but it does reduce the memory available
> to that partition).
> The web dashboard should actually help you check that.
>
>
> On Fri, Feb 5, 2016 at 3:34 PM, Flavio Pompermaier <pomperma...@okkam.it>
> wrote:
>
>> Sorry, I forgot to say that the numberOfTaskSlots is always 6.
>>
>> On Fri, Feb 5, 2016 at 3:32 PM, Flavio Pompermaier <pomperma...@okkam.it>
>> wrote:
>>
>>> Hi to all,
>>>
>>> I'm testing how to speed up my Flink job and I ran into the following
>>> situations in my *6-node* cluster (each node has 8 CPUs, and one node
>>> also runs the job manager):
>>>
>>> Scenario 1:
>>>
>>>    - # of network buffers 4096
>>>    - parallelism: 36
>>>    - *The job fails because I have not enough network buffers*
>>>
>>> Scenario 2:
>>>
>>>    - # of network buffers *8192*
>>>    - parallelism: 36
>>>    - *The job ends successfully in about 20 minutes*
>>>
>>> Scenario 3:
>>>
>>>    - # of network buffers *4096*
>>>    - 6 nodes
>>>    - parallelism: *6*
>>>    - *The job ends successfully in about 11 minutes*
>>>
>>> What can I infer from these results? That my job is I/O-bound, so
>>> having more threads on the same machine accessing the disk
>>> simultaneously degrades the performance of the pipeline?
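>>>
>>> For reference, the relevant flink-conf.yaml entries (values as in
>>> Scenario 2; the rule of thumb in the Flink docs for sizing the buffer
>>> pool is roughly #slots-per-TM^2 * #TMs * 4):
>>>
>>>     # flink-conf.yaml
>>>     taskmanager.numberOfTaskSlots: 6
>>>     taskmanager.network.numberOfBuffers: 8192
>>>
>>> Note that an all-to-all shuffle at parallelism 36 opens 36 x 36
>>> channels, against only 6 x 6 at parallelism 6, which is why 4096
>>> buffers are enough in Scenario 3 but not in Scenario 1.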
>>>
>>> Best,
>>> Flavio
>>>
>>
>>
>
