Hi Aljoscha,
The latency is measured with a Flink MetricGroup (specifically with a 
"DropwizardHistogram").
The latency is measured from message read time (i.e. from when the message is 
pulled from the Kafka source) until the last operator completes the 
processing (there is no Kafka sink).
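For reference, the registration and update pattern is roughly the following. This is a minimal sketch based on Flink's flink-metrics-dropwizard wrapper; the Event type, its readTimeMillis field, the metric name, and the reservoir size are illustrative assumptions, not our actual code:

```java
import com.codahale.metrics.SlidingWindowReservoir;
import org.apache.flink.api.common.functions.RichFlatMapFunction;
import org.apache.flink.configuration.Configuration;
import org.apache.flink.dropwizard.metrics.DropwizardHistogramWrapper;
import org.apache.flink.metrics.Histogram;
import org.apache.flink.util.Collector;

public class LatencyTracker extends RichFlatMapFunction<Event, Event> {

    private transient Histogram latencyHistogram;

    @Override
    public void open(Configuration parameters) {
        // Wrap a Dropwizard histogram so Flink's metric system reports its
        // percentiles; the reservoir size (500) is an arbitrary example.
        com.codahale.metrics.Histogram dropwizard =
                new com.codahale.metrics.Histogram(new SlidingWindowReservoir(500));
        latencyHistogram = getRuntimeContext()
                .getMetricGroup()
                .histogram("processingLatency", new DropwizardHistogramWrapper(dropwizard));
    }

    @Override
    public void flatMap(Event event, Collector<Event> out) {
        // readTimeMillis is assumed to be stamped when the record is pulled
        // from the Kafka source.
        latencyHistogram.update(System.currentTimeMillis() - event.readTimeMillis);
        out.collect(event);
    }
}
```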

Thanks,
Liron

From: Aljoscha Krettek [mailto:aljos...@apache.org]
Sent: Wednesday, January 03, 2018 3:03 PM
To: Netzer, Liron [ICG-IT]
Cc: user@flink.apache.org
Subject: Re: Lower Parallelism derives better latency

Hi,

How are you measuring latency? Is it latency within a Flink Job or from Kafka 
to Kafka? The first seems more likely but I'm still interested in the details.

Best,
Aljoscha


On 3. Jan 2018, at 08:13, Netzer, Liron <liron.net...@citi.com> wrote:

Hi group,

We have a standalone Flink cluster that is running on a UNIX host with 40 CPUs 
and 256GB RAM.
There is one task manager and 24 slots were defined.
When we decrease the parallelism of the stream graph operators (each operator 
has the same parallelism), we see a consistent improvement in latency:
Test run   Parallelism   99 percentile   95 percentile   75 percentile   Mean
#1         8             4.15 ms         2.02 ms         0.22 ms         0.42 ms
#2         7             3.6 ms          1.68 ms         0.14 ms         0.34 ms
#3         6             3 ms            1.4 ms          0.13 ms         0.3 ms
#4         5             2.1 ms          0.95 ms         0.1 ms          0.22 ms
#5         4             1.5 ms          0.64 ms         0.09 ms         0.16 ms
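As a side note on how the reported figures relate to the samples: the percentile columns are order statistics over the recorded per-record latencies. A stdlib-only sketch of the computation (nearest-rank method; the sample values are made up for illustration and are not our measurements):

```java
import java.util.Arrays;

public class LatencyStats {

    // Nearest-rank percentile over a sorted copy of the samples.
    static double percentile(double[] samples, double p) {
        double[] sorted = samples.clone();
        Arrays.sort(sorted);
        int rank = (int) Math.ceil(p / 100.0 * sorted.length);
        return sorted[Math.max(rank - 1, 0)];
    }

    static double mean(double[] samples) {
        double sum = 0;
        for (double s : samples) {
            sum += s;
        }
        return sum / samples.length;
    }

    public static void main(String[] args) {
        // Hypothetical per-record latencies in milliseconds.
        double[] latencies = {0.1, 0.2, 0.3, 0.4, 0.5, 1.0, 2.0, 3.0, 4.0, 5.0};
        System.out.printf("mean=%.2f p75=%.2f p95=%.2f p99=%.2f%n",
                mean(latencies),
                percentile(latencies, 75),
                percentile(latencies, 95),
                percentile(latencies, 99));
    }
}
```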


This was a surprise for us, as we expected that higher parallelism would yield 
better latency.
Could you help us understand this behavior?
I know that when more threads are involved there is probably more 
serialization/deserialization, but that can't be the only reason for this 
behavior.

We have two Kafka sources, and the rest of the operators are fixed windows, 
flatMaps, coMappers, and several keyBys.
Except for the Kafka sources and some internal logging, there is no other I/O 
(i.e. we do not connect to any external DB/queue).
We use Flink 1.3.


Thanks,
Liron
