on't think
> it's very probable for 2 different countries to have the same hash, but I
> know for a fact that the number of events is not evenly distributed between
> countries.
>
> But still, why does the impact on performance appear only at higher
> parallelism?
>
>
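The skew described in the quoted text can be illustrated with a small sketch. It assumes a keyed stream in which every event for a given key (here, a country) is routed to one subtask by a stable hash. The hash used below (`zlib.crc32`) is only a stand-in and is not Flink's actual key-group assignment, and the per-country event counts are made-up numbers. Under those assumptions, the busiest subtask always carries at least the share of the hottest key, no matter how far parallelism is increased.

```python
import zlib
from collections import Counter


def subtask_for_key(key, parallelism):
    """Stand-in partitioner: stable hash of the key modulo the parallelism.

    Assumption: this is NOT Flink's real key-group assignment, only an
    illustration of "same key always goes to the same subtask".
    """
    return zlib.crc32(key.encode("utf-8")) % parallelism


def load_per_subtask(events_per_key, parallelism):
    """Sum the events that land on each subtask index."""
    load = Counter({index: 0 for index in range(parallelism)})
    for key, count in events_per_key.items():
        load[subtask_for_key(key, parallelism)] += count
    return load


def busiest_share(events_per_key, parallelism):
    """Fraction of all events handled by the most loaded subtask."""
    load = load_per_subtask(events_per_key, parallelism)
    return max(load.values()) / sum(events_per_key.values())


# Hypothetical distribution: one country produces 70% of the events.
events_per_key = {"US": 7000, "DE": 1000, "FR": 800, "IL": 700, "BR": 500}

for parallelism in (1, 2, 4, 8):
    share = busiest_share(events_per_key, parallelism)
    print(f"parallelism={parallelism}: busiest subtask handles {share:.0%}")
```

This only shows why skew caps the benefit of extra parallelism. It does not by itself explain an outright slowdown.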
From: Arvid Heise
Sent: Tuesday, November 3, 2020 8:54 PM
To: Sidney Feiner
Cc: Yangze Guo ; user@flink.apache.org
Subject: Re: Increase in parallelism has very bad impact on performance
Hi Sidney,
you might recheck your first message. Either it's incorrectly written or you
are a victim of a fa
> *Sidney Feiner* */* Data Platform Developer
>
> ----------
> *From:* Arvid Heise
> *Sent:* Tuesday, November 3, 2020 12:09 PM
>
To: Yangze Guo
Cc: Sidney Feiner ; user@flink.apache.org
Subject: Re: Increase in parallelism has very bad impact on performance
Hi Sidney,
there could be a couple of reasons where scaling actually hurts. Let's
include them one by one.
First, you need to make sure that your source actually supports scaling.
Thus, your Kafka topic needs at least as many partitions as you want to
scale. So if you want to scale at some point
From: Yangze Guo
Sent: Tuesday, November 3, 2020 5:00 AM
To: Sidney Feiner
Cc: user@flink.apache.org
Subject: Re: Increase in parallelism has very bad impact on performance
Hi, Sidney,
What is the data generation rate of your Kafka topic? Is it a lot
bigger than 6000?
Best,
Yangze Guo
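The requirement mentioned earlier in the thread, that the Kafka topic needs at least as many partitions as the desired parallelism, can be sketched as below. The sketch assumes each partition is read by exactly one source subtask and uses a plain round-robin assignment. The real assignment in Flink's Kafka connector may differ in detail, and the function names are hypothetical. With more subtasks than partitions, the extra subtasks receive no partitions and stay idle.

```python
from typing import List


def partitions_per_subtask(num_partitions: int, parallelism: int) -> List[int]:
    """Count partitions per source subtask under a simple round-robin.

    Assumption: each partition is consumed by exactly one subtask; the
    round-robin order itself is only illustrative.
    """
    counts = [0] * parallelism
    for partition in range(num_partitions):
        counts[partition % parallelism] += 1
    return counts


def idle_subtasks(num_partitions: int, parallelism: int) -> int:
    """Number of source subtasks that end up with no partition to read."""
    counts = partitions_per_subtask(num_partitions, parallelism)
    return sum(1 for count in counts if count == 0)


# Hypothetical topic with 4 partitions, read at increasing parallelism.
for parallelism in (1, 2, 4, 8):
    print(
        f"parallelism={parallelism}: "
        f"assignment={partitions_per_subtask(4, parallelism)}, "
        f"idle={idle_subtasks(4, parallelism)}"
    )
```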
On Tue, Nov 3, 2020 at 8:45 AM Sidney Feiner wrote:
Hey,
I'm writing a Flink app that does some transformation on an event consumed from
Kafka and then creates time windows keyed by some field, and applies an
aggregation on all those events.
When I run it with parallelism 1, I get a throughput of around 1.6K events per
second (so also 1.6K events p