>
> For the partition idleness problem, could you elaborate more on it? I
> assume both FlinkKafkaConsumer and KafkaSource need a WatermarkStrategy
> to decide whether to mark the partition as idle.


As a matter of fact, no, that's not the case -- which is why I mentioned it.

The FlinkKafkaConsumer automatically treats all initially empty (or
non-existent) partitions as idle, while the KafkaSource only does so if
the WatermarkStrategy explicitly requests idleness handling via
withIdleness. This can be a source of confusion for folks upgrading to
the new connector. It most often shows up when the number of Kafka
partitions is less than the parallelism of the connector, which is a
rather common occurrence in development and testing environments.
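
To make that concrete, here is a minimal sketch of opting in to idleness
handling with the new KafkaSource. The broker address, topic, group id,
and durations are just placeholders, and I'm assuming the builder APIs as
they exist in 1.15/1.16:

import java.time.Duration;
import org.apache.flink.api.common.eventtime.WatermarkStrategy;
import org.apache.flink.api.common.serialization.SimpleStringSchema;
import org.apache.flink.connector.kafka.source.KafkaSource;
import org.apache.flink.connector.kafka.source.enumerator.initializer.OffsetsInitializer;
import org.apache.flink.streaming.api.datastream.DataStream;
import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;

public class IdlenessExample {
    public static void main(String[] args) throws Exception {
        StreamExecutionEnvironment env =
                StreamExecutionEnvironment.getExecutionEnvironment();

        KafkaSource<String> source = KafkaSource.<String>builder()
                .setBootstrapServers("localhost:9092")   // placeholder broker
                .setTopics("input-topic")                // placeholder topic
                .setGroupId("example-group")             // placeholder group id
                .setStartingOffsets(OffsetsInitializer.earliest())
                .setValueOnlyDeserializer(new SimpleStringSchema())
                .build();

        DataStream<String> stream = env.fromSource(
                source,
                WatermarkStrategy
                        .<String>forBoundedOutOfOrderness(Duration.ofSeconds(5))
                        // Without withIdleness, partitions that never receive
                        // records hold back the watermark whenever the source
                        // parallelism exceeds the number of partitions.
                        .withIdleness(Duration.ofMinutes(1)),
                "Kafka Source");

        stream.print();
        env.execute("idleness example");
    }
}

With the old FlinkKafkaConsumer, the withIdleness call wasn't needed for
this particular case, since initially empty partitions were marked idle
automatically.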

I believe this change in behavior was made deliberately, so as to create a
more consistent experience across all FLIP-27 connectors. This isn't
something that needs to be fixed, but does need to be communicated more
clearly. Unfortunately, the whole idleness mechanism remained significantly
broken until 1.16 (considering the impact of [1] and [2]), further
complicating the situation. Because of FLINK-28975 [2], users with
initially empty partitions may run into problems on any version earlier
than 1.15.3 (still unreleased) or 1.16.0. See [3] for an example of this
confusion.

[1] https://issues.apache.org/jira/browse/FLINK-18934 (idleness didn't work
with connected streams)
[2] https://issues.apache.org/jira/browse/FLINK-28975 (idle streams could
never become active again)
[3]
https://stackoverflow.com/questions/70096166/parallelism-in-flink-kafka-source-causes-nothing-to-execute/70101290#70101290

Best,
David

On Wed, Nov 2, 2022 at 5:26 AM Qingsheng Ren <re...@apache.org> wrote:

> Thanks Jing for starting the discussion.
>
> +1 for removing FlinkKafkaConsumer, as KafkaSource has evolved for many
> release cycles and should be stable enough. I have some concerns about the
> new Kafka sink based on sink v2, as sink v2 still has some ongoing work in
> 1.17 (maybe Yun Gao could provide some input). Also, we found some issues
> with KafkaSink related to the internal mechanism of sink v2, like
> FLINK-29492.
>
> @David
> About the ability of DeserializationSchema#isEndOfStream, FLIP-208 is
> trying to complete this piece of the puzzle, and Hang Ruan (
> ruanhang1...@gmail.com) plans to work on it in 1.17. For the partition
> idleness problem, could you elaborate more on it? I assume both
> FlinkKafkaConsumer and KafkaSource need a WatermarkStrategy to decide
> whether to mark the partition as idle.
>
> Best,
> Qingsheng
> Ververica (Alibaba)
>
> On Thu, Oct 27, 2022 at 8:06 PM Jing Ge <j...@ververica.com> wrote:
>
> > Hi Dev,
> >
> > I'd like to start a discussion about removing FlinkKafkaConsumer and
> > FlinkKafkaProducer in 1.17.
> >
> > The removal was originally announced for Flink 1.15, after Flink 1.14
> > had been released[1]. It was then postponed by one release, which meant
> > removing it with Flink 1.16, but the doc was not updated accordingly[2].
> > I have created PRs to fix that. Since the 1.16 release branch is under
> > code freeze, it makes sense to, first, update the doc to say that
> > FlinkKafkaConsumer will be removed with Flink 1.17 [3][4] and, second,
> > start the discussion about removing them from the current master branch,
> > i.e. for the coming 1.17 release. I'm all ears and looking forward to
> > your feedback. Thanks!
> >
> > Best regards,
> > Jing
> >
> >
> > [1]
> > https://nightlies.apache.org/flink/flink-docs-release-1.14/docs/connectors/datastream/kafka/#kafka-sourcefunction
> > [2]
> > https://nightlies.apache.org/flink/flink-docs-release-1.15/docs/connectors/datastream/kafka/#kafka-sourcefunction
> > [3] https://github.com/apache/flink/pull/21172
> > [4] https://github.com/apache/flink/pull/21171
> >
>
