[ 
https://issues.apache.org/jira/browse/FLINK-34400?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17818280#comment-17818280
 ] 

Rui Fan commented on FLINK-34400:
---------------------------------

Hi [~asardaes] , did you try enable idleness?

I try watermark alignment on my Mac with kafka docker, it works well.
 * Don't enable idleness: The quick topic will be blocked when slow topic 
doesn't have data. (Because quick topic source is waiting for the slow source)
 * Set idleness = 5s for slow topic: The quick topic is only blocked for 5 
seconds. After 5 seconds, it start consume.

Here is my test code, you can change a little code to simulate your case.
 * Data generator to Kafka topic:
 ** I generate some data to quick topic and slow topic.
 ** After a while, I restart it and don't write any data to slow topic. (Only 
write data to quick topic to simulate your case.)
 ** 
https://github.com/1996fanrui/fanrui-learning/blob/1069662b5fb434928c4141bf2397149cd354489b/module-flink/src/main/java/com/dream/flink/kafka/alignment/KafkaDataGenerator.java
 * Consume data from kafka:
 ** 
https://github.com/1996fanrui/fanrui-learning/blob/1069662b5fb434928c4141bf2397149cd354489b/module-flink/src/main/java/com/dream/flink/kafka/alignment/KafkaAlignmentDemo.java

 

> Kafka sources with watermark alignment sporadically stop consuming
> ------------------------------------------------------------------
>
>                 Key: FLINK-34400
>                 URL: https://issues.apache.org/jira/browse/FLINK-34400
>             Project: Flink
>          Issue Type: Bug
>    Affects Versions: 1.18.1
>            Reporter: Alexis Sarda-Espinosa
>            Priority: Major
>         Attachments: alignment_lags.png, logs.txt
>
>
> I have 2 Kafka sources that read from different topics. I have assigned them 
> to the same watermark alignment group, and I have _not_ enabled idleness 
> explicitly in their watermark strategies. One topic remains pretty much empty 
> most of the time, while the other receives a few events per second all the 
> time. Parallelism of the active source is 2, for the other one it's 1, and 
> checkpoints are once every minute.
> This works correctly for some time (10 - 15 minutes in my case) but then 1 of 
> the active sources stops consuming, which causes lag to increase. Weirdly, 
> after another 15 minutes or so, all the backlog is consumed at once, and then 
> everything stops again.
> I'm attaching some logs from the Task Manager where the issue appears. You 
> will notice that the Kafka network client reports disconnections (a long time 
> after the deserializer stopped reporting that events were being consumed), 
> I'm not sure if this is related.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

Reply via email to