> For the partition idleness problem could you elaborate more about it? I assume both FlinkKafkaConsumer and KafkaSource need a WatermarkStrategy to decide whether to mark the partition as idle.
As a matter of fact, no, that's not the case -- which is why I mentioned it. The FlinkKafkaConsumer automatically treats all initially empty (or non-existent) partitions as idle, while the KafkaSource only does this if the WatermarkStrategy specifies that idleness handling is desired, by configuring withIdleness (see the sketch at the end of this message). This can be a source of confusion for folks upgrading to the new connector. It most often shows up when the number of Kafka partitions is less than the parallelism of the connector, which is a rather common occurrence in development and testing environments.

I believe this change in behavior was made deliberately, so as to create a more consistent experience across all FLIP-27 connectors. This isn't something that needs to be fixed, but it does need to be communicated more clearly.

Unfortunately, the whole idleness mechanism remained significantly broken until 1.16 (considering the impact of [1] and [2]), which further complicates the situation. Because of FLINK-28975 [2], users with initially empty partitions may have problems with versions earlier than 1.15.3 (still unreleased) and 1.16.0. See [3] for an example of this confusion.

[1] https://issues.apache.org/jira/browse/FLINK-18934 (idleness didn't work with connected streams)
[2] https://issues.apache.org/jira/browse/FLINK-28975 (idle streams could never become active again)
[3] https://stackoverflow.com/questions/70096166/parallelism-in-flink-kafka-source-causes-nothing-to-execute/70101290#70101290

Best,
David

On Wed, Nov 2, 2022 at 5:26 AM Qingsheng Ren <re...@apache.org> wrote:

> Thanks Jing for starting the discussion.
>
> +1 for removing FlinkKafkaConsumer, as KafkaSource has evolved over many release cycles and should be stable enough. I have some concerns about the new Kafka sink based on sink v2, as sink v2 still has some ongoing work in 1.17 (maybe Yun Gao could provide some input). We have also found some issues with KafkaSink related to the internal mechanism of sink v2, such as FLINK-29492.
>
> @David
> Regarding the ability of DeserializationSchema#isEndOfStream, FLIP-208 is trying to complete this piece of the puzzle, and Hang Ruan (ruanhang1...@gmail.com) plans to work on it in 1.17. As for the partition idleness problem, could you elaborate more about it? I assume both FlinkKafkaConsumer and KafkaSource need a WatermarkStrategy to decide whether to mark the partition as idle.
>
> Best,
> Qingsheng
> Ververica (Alibaba)
>
> On Thu, Oct 27, 2022 at 8:06 PM Jing Ge <j...@ververica.com> wrote:
>
> > Hi Dev,
> >
> > I'd like to start a discussion about removing FlinkKafkaConsumer and FlinkKafkaProducer in 1.17.
> >
> > It was originally announced that they would be removed with Flink 1.15, after Flink 1.14 had been released [1]. The removal was then postponed by one release, which meant removing them with Flink 1.16, but the doc was never updated [2]. I have created PRs to fix this. Since the 1.16 release branch is under code freeze, it makes sense to, first, update the doc to say that FlinkKafkaConsumer will be removed with Flink 1.17 [3][4], and second, to start the discussion about removing them from the current master branch, i.e. for the coming 1.17 release. I'm all ears and looking forward to your feedback. Thanks!
> > Best regards,
> > Jing
> >
> > [1] https://nightlies.apache.org/flink/flink-docs-release-1.14/docs/connectors/datastream/kafka/#kafka-sourcefunction
> > [2] https://nightlies.apache.org/flink/flink-docs-release-1.15/docs/connectors/datastream/kafka/#kafka-sourcefunction
> > [3] https://github.com/apache/flink/pull/21172
> > [4] https://github.com/apache/flink/pull/21171
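For reference, here is a minimal sketch of the explicit idleness configuration mentioned above. It assumes the DataStream API with the new KafkaSource (as shipped since Flink 1.14/1.15); the bootstrap servers, topic, group id, and timeout values are placeholders chosen purely for illustration.

import java.time.Duration;

import org.apache.flink.api.common.eventtime.WatermarkStrategy;
import org.apache.flink.api.common.serialization.SimpleStringSchema;
import org.apache.flink.connector.kafka.source.KafkaSource;
import org.apache.flink.connector.kafka.source.enumerator.initializer.OffsetsInitializer;
import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;

public class KafkaIdlenessSketch {

    public static void main(String[] args) throws Exception {
        StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();

        // Placeholder connection settings, for illustration only.
        KafkaSource<String> source = KafkaSource.<String>builder()
                .setBootstrapServers("localhost:9092")
                .setTopics("input-topic")
                .setGroupId("idleness-sketch")
                .setStartingOffsets(OffsetsInitializer.earliest())
                .setValueOnlyDeserializer(new SimpleStringSchema())
                .build();

        // The KafkaSource only marks a subtask as idle if the WatermarkStrategy
        // asks for it via withIdleness. Without this, subtasks that see no data
        // (e.g. when parallelism > partition count, or partitions are empty)
        // never advance their watermarks, which can stall event-time timers
        // and windows downstream.
        WatermarkStrategy<String> watermarks = WatermarkStrategy
                .<String>forBoundedOutOfOrderness(Duration.ofSeconds(5))
                .withIdleness(Duration.ofSeconds(30));

        env.fromSource(source, watermarks, "KafkaSource")
                .print();

        env.execute("KafkaSource idleness sketch");
    }
}

With the legacy FlinkKafkaConsumer no equivalent configuration was needed for initially empty partitions, which is exactly the behavioral difference described above.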