Hello Arvid,
thank you for your reply.
Actually, using a window to aggregate the events for a time period is not
applicable to my case, since I need the records to be processed immediately.
Even if I could, I still cannot understand how I could forward
the aggregated events to, let's say, 2 parallel operators.
You should create a histogram over the keys of the records. If you see a
skew, one way to go about it is to refine the key or to split aggregations.
For example, consider that you want to count events per user and 2 users are
actually bots spamming lots of events, accounting for 50% of all events.
Then, you could refine the key for those heavy users, e.g. split their
counts across several sub-keys and combine the partial results afterwards.
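
A rough sketch of such a histogram (Event and getAccountId() are just
placeholders for your POJO; each parallel instance logs its own tallies):

import org.apache.flink.api.common.functions.RichMapFunction;

import java.util.HashMap;
import java.util.Map;

// Counts records per key in each parallel instance and logs the tallies
// every 100,000 records so that dominant keys show up in the task logs.
public class KeyHistogram extends RichMapFunction<Event, Event> {

    private final Map<Long, Long> counts = new HashMap<>();
    private long seen;

    @Override
    public Event map(Event event) {
        counts.merge(event.getAccountId(), 1L, Long::sum);
        if (++seen % 100_000 == 0) {
            System.out.println("subtask "
                    + getRuntimeContext().getIndexOfThisSubtask()
                    + " key histogram: " + counts);
        }
        return event;
    }
}

Placing events.map(new KeyHistogram()) right before the keyBy shows the raw
distribution; if a couple of keys dominate the output, you have your skew.
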
Hello Qingsheng,
thank you very much for your answer.
I will try to modify the key as you mentioned in your first assumption. In
case the second assumption also holds, what would you propose to remedy the
situation? Should I experiment with different values of max parallelism?
On Sat, 2 Apr 2022
Hi Isidoros,

Two assumptions come to mind:

1. Records are not evenly distributed across the different keys, e.g. some
accountIds simply have more events than others. If the record distribution
is predictable, you can try to combine other fields or include more
information in the key field to help balance the load (see the sketch
below).

2. Records are evenly distributed across keys, but the keys themselves map
unevenly onto the parallel subtasks, because Flink assigns each key to a
key group by hashing it against the max parallelism.
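
A rough sketch of the composite key for the first case (the eventType field
is just an example of extra information you could fold into the key):

import org.apache.flink.api.common.typeinfo.Types;
import org.apache.flink.api.java.tuple.Tuple2;
import org.apache.flink.streaming.api.datastream.DataStream;
import org.apache.flink.streaming.api.datastream.KeyedStream;

public class KeyRefinement {

    // Key by (accountId, eventType) instead of accountId alone, so that one
    // heavy account is spread over several keys.
    public static KeyedStream<Event, Tuple2<Long, String>> keyByAccountAndType(
            DataStream<Event> events) {
        return events.keyBy(
                event -> Tuple2.of(event.getAccountId(), event.getEventType()),
                Types.TUPLE(Types.LONG, Types.STRING));
    }
}

Keep in mind that keyed state and CEP patterns are then scoped to the
(accountId, eventType) pair, not to the whole account.
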
Hello,

we are running a Flink application (version 1.13.2) that consists of a
Kafka source with one partition so far. We then filter the data based on
some conditions, map it to POJOs, and transform it into a KeyedStream based
on an accountId (long) property of the POJO. The downstream operators are
10 CEP operators.
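
Roughly, the job looks like the following (Event, Event.fromJson, and the
pattern condition are simplified placeholders for our actual code):

import org.apache.flink.api.common.serialization.SimpleStringSchema;
import org.apache.flink.cep.CEP;
import org.apache.flink.cep.PatternStream;
import org.apache.flink.cep.pattern.Pattern;
import org.apache.flink.cep.pattern.conditions.SimpleCondition;
import org.apache.flink.streaming.api.datastream.DataStream;
import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;
import org.apache.flink.streaming.connectors.kafka.FlinkKafkaConsumer;

import java.util.Properties;

public class CepJob {

    public static void main(String[] args) throws Exception {
        StreamExecutionEnvironment env =
                StreamExecutionEnvironment.getExecutionEnvironment();

        Properties props = new Properties();
        props.setProperty("bootstrap.servers", "localhost:9092"); // placeholder

        // Kafka source (single partition so far), filter, map to the POJO.
        DataStream<Event> events = env
                .addSource(new FlinkKafkaConsumer<>(
                        "events", new SimpleStringSchema(), props))
                .filter(line -> !line.isEmpty()) // stands in for the real conditions
                .map(Event::fromJson);           // hypothetical String -> Event mapper

        // One of the 10 CEP operators, applied to the stream keyed by accountId.
        Pattern<Event, ?> pattern = Pattern.<Event>begin("start")
                .where(new SimpleCondition<Event>() {
                    @Override
                    public boolean filter(Event event) {
                        return event.getAccountId() > 0; // placeholder condition
                    }
                });

        PatternStream<Event> matches =
                CEP.pattern(events.keyBy(Event::getAccountId), pattern);
        // ...handling of the matches omitted here.

        env.execute("cep job");
    }
}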