Re: KafkaSource builder and checkpointing with parallelism > kafka partitions

2021-09-16 Thread Lars Skjærven
Thanks for the feedback.
> May I ask why you have fewer partitions than the parallelism? I would be happy to learn more about your use-case to better understand the
> motivation.
The use case is that topic A contains just a few messages with product metadata that rarely gets updated, while topic

Re: KafkaSource builder and checkpointing with parallelism > kafka partitions

2021-09-16 Thread Fabian Paul
Hi all, The problem you are seeing, Lars, is somewhat intended behaviour, unfortunately. With the batch/stream unification, every Kafka partition is treated as a kind of workload assignment. If a subtask receives a signal that there is no more workload, it goes into the FINISHED state. As alread

Re: KafkaSource builder and checkpointing with parallelism > kafka partitions

2021-09-15 Thread David Morávek
If we are shutting down any sources of unbounded jobs that run on Flink versions without FLIP-147 (available in 1.14) [1], which Matthias has mentioned, then it's IMO a bug, because it effectively breaks checkpointing. Fabian, can you please verify whether this is intended behavior? In the meant

Re: KafkaSource builder and checkpointing with parallelism > kafka partitions

2021-09-15 Thread Lars Skjærven
Got it. So the workaround for now (1.13.2) is to fall back to FlinkKafkaConsumer, if I read you correctly.
Thanks
L
On Wed, Sep 15, 2021 at 2:58 PM Matthias Pohl wrote:
> Hi Lars,
> I guess you are looking for execution.checkpointing.checkpoints-after-tasks-finish.enabled [1].
> This configurat

Re: KafkaSource builder and checkpointing with parallelism > kafka partitions

2021-09-15 Thread Matthias Pohl
Hi Lars, I guess you are looking for execution.checkpointing.checkpoints-after-tasks-finish.enabled [1]. This configuration parameter is going to be introduced in the upcoming Flink 1.14 release. Best, Matthias [1] https://nightlies.apache.org/flink/flink-docs-master/docs/deployment/config/#execu

KafkaSource builder and checkpointing with parallelism > kafka partitions

2021-09-15 Thread Lars Skjærven
When using the KafkaSource builder with a job parallelism larger than the number of Kafka partitions, the job is unable to checkpoint. With a job parallelism of 4, 3 of the tasks are marked as FINISHED for the Kafka topic with one partition. For this reason checkpointing seems to be disabled. When using F
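
For readers following the thread, the situation described above (parallelism 4, one partition, three subtasks left without work) can be illustrated with a small standalone sketch. This is not Flink code and does not reproduce Flink's actual split assignment: the round-robin rule and the names assign_partitions and idle_subtasks are assumptions made up for illustration only.

```python
from typing import Dict, List


def assign_partitions(num_partitions: int, parallelism: int) -> Dict[int, List[int]]:
    """Hypothetical round-robin assignment of partition ids to subtask indices.

    Assumption for illustration: partition p goes to subtask p % parallelism.
    The real assignment in Flink may differ.
    """
    assignment: Dict[int, List[int]] = {subtask: [] for subtask in range(parallelism)}
    for partition in range(num_partitions):
        assignment[partition % parallelism].append(partition)
    return assignment


def idle_subtasks(assignment: Dict[int, List[int]]) -> List[int]:
    """Return the subtasks that received no partition at all."""
    return [subtask for subtask, partitions in assignment.items() if not partitions]


# The case from the original report: one partition, job parallelism of 4.
assignment = assign_partitions(num_partitions=1, parallelism=4)
idle = idle_subtasks(assignment)
print("assignment:", assignment)
print("subtasks without work:", idle)
```

In this sketch the three subtasks without a partition correspond to the tasks reported as FINISHED. With as many partitions as subtasks, idle_subtasks returns an empty list.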