Hi all,

Thank you all for the informative feedback. It is clear that we need to improve the documentation regarding the migration from FlinkKafkaConsumer to KafkaSource. I've filed a ticket[1] and linked it to [2]. This shouldn't be a blocker for removing FlinkKafkaConsumer.
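To make the migration gap more concrete, here is a rough sketch of what the switch could look like for a simple job. This is only an illustration, not the official migration guide: the bootstrap servers, topic, group id, and class name are placeholders, and I have added withIdleness only to highlight the idleness difference David describes further down in the thread.

import java.time.Duration;
import java.util.Properties;

import org.apache.flink.api.common.eventtime.WatermarkStrategy;
import org.apache.flink.api.common.serialization.SimpleStringSchema;
import org.apache.flink.connector.kafka.source.KafkaSource;
import org.apache.flink.connector.kafka.source.enumerator.initializer.OffsetsInitializer;
import org.apache.flink.streaming.api.datastream.DataStream;
import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;
import org.apache.flink.streaming.connectors.kafka.FlinkKafkaConsumer;

public class KafkaMigrationSketch {
    public static void main(String[] args) throws Exception {
        StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();

        // Before: the deprecated FlinkKafkaConsumer, added via addSource().
        Properties props = new Properties();
        props.setProperty("bootstrap.servers", "localhost:9092");
        props.setProperty("group.id", "my-group");
        DataStream<String> legacy = env.addSource(
                new FlinkKafkaConsumer<>("input-topic", new SimpleStringSchema(), props));

        // After: KafkaSource, added via fromSource() with an explicit WatermarkStrategy.
        KafkaSource<String> source = KafkaSource.<String>builder()
                .setBootstrapServers("localhost:9092")
                .setTopics("input-topic")
                .setGroupId("my-group")
                .setStartingOffsets(OffsetsInitializer.earliest())
                .setValueOnlyDeserializer(new SimpleStringSchema())
                .build();

        // Unlike FlinkKafkaConsumer, KafkaSource does not treat empty partitions
        // as idle unless idleness is configured explicitly on the WatermarkStrategy.
        DataStream<String> stream = env.fromSource(
                source,
                WatermarkStrategy.<String>forMonotonousTimestamps()
                        .withIdleness(Duration.ofMinutes(1)),
                "Kafka Source");

        stream.print();
        env.execute("Kafka migration sketch");
    }
}

The specific choices above (earliest offsets, monotonous timestamps, a one-minute idleness timeout) are just examples; the main point is that fromSource takes an explicit WatermarkStrategy, which is where idleness handling now has to be configured.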
Given that there is still some ongoing SinkV2 work, I will start a vote limited to removing FlinkKafkaConsumer and graduating the related APIs. As a follow-up task, I will sync with Yun Gao before the code freeze of the 1.17 release to check whether we can start a second vote to remove FlinkKafkaProducer in 1.17.

Best regards,
Jing

[1] https://issues.apache.org/jira/browse/FLINK-29999
[2] https://issues.apache.org/jira/browse/FLINK-28302

On Wed, Nov 2, 2022 at 11:39 AM Martijn Visser <martijnvis...@apache.org> wrote:

> Hi David,
>
> I believe that for the DataStream API this is indeed documented [1], but it
> might be missed given that there is a lot of documentation and you need to
> know that your problem is related to idleness. For the Table API I think
> this is never mentioned, so it should definitely be at least documented there.
>
> Thanks,
>
> Martijn
>
> [1] https://nightlies.apache.org/flink/flink-docs-stable/docs/connectors/datastream/kafka/#idleness
>
> On Wed, Nov 2, 2022 at 11:28 AM David Anderson <dander...@apache.org> wrote:
>
> > > For the partition idleness problem could you elaborate more about it?
> > > I assume both FlinkKafkaConsumer and KafkaSource need a WatermarkStrategy
> > > to decide whether to mark the partition as idle.
> >
> > As a matter of fact, no, that's not the case -- which is why I mentioned it.
> >
> > The FlinkKafkaConsumer automatically treats all initially empty (or
> > non-existent) partitions as idle, while the KafkaSource only does this if
> > the WatermarkStrategy specifies that idleness handling is desired by
> > configuring withIdleness. This can be a source of confusion for folks
> > upgrading to the new connector. It most often shows up in situations where
> > the number of Kafka partitions is less than the parallelism of the
> > connector, which is a rather common occurrence in development and testing
> > environments.
> >
> > I believe this change in behavior was made deliberately, so as to create a
> > more consistent experience across all FLIP-27 connectors. This isn't
> > something that needs to be fixed, but it does need to be communicated more
> > clearly. Unfortunately, the whole idleness mechanism remained significantly
> > broken until 1.16 (considering the impact of [1] and [2]), further
> > complicating the situation. Because of FLINK-28975 [2], users with
> > partitions that are initially empty may have problems with versions before
> > 1.15.3 (still unreleased) and 1.16.0. See [3] for an example of this
> > confusion.
> >
> > [1] https://issues.apache.org/jira/browse/FLINK-18934 (idleness didn't work with connected streams)
> > [2] https://issues.apache.org/jira/browse/FLINK-28975 (idle streams could never become active again)
> > [3] https://stackoverflow.com/questions/70096166/parallelism-in-flink-kafka-source-causes-nothing-to-execute/70101290#70101290
> >
> > Best,
> > David
> >
> > On Wed, Nov 2, 2022 at 5:26 AM Qingsheng Ren <re...@apache.org> wrote:
> >
> > > Thanks Jing for starting the discussion.
> > >
> > > +1 for removing FlinkKafkaConsumer, as KafkaSource has evolved over many
> > > release cycles and should be stable enough. I have some concerns about
> > > the new Kafka sink based on sink v2, as sink v2 still has some ongoing
> > > work in 1.17 (maybe Yun Gao could provide some input). Also, we found
> > > some issues with KafkaSink related to the internal mechanism of sink v2,
> > > like FLINK-29492.
> > >
> > > @David
> > > About the DeserializationSchema#isEndOfStream capability: FLIP-208 is
> > > trying to complete this piece of the puzzle, and Hang Ruan
> > > (ruanhang1...@gmail.com) plans to work on it in 1.17. For the partition
> > > idleness problem, could you elaborate more about it? I assume both
> > > FlinkKafkaConsumer and KafkaSource need a WatermarkStrategy to decide
> > > whether to mark the partition as idle.
> > >
> > > Best,
> > > Qingsheng
> > > Ververica (Alibaba)
> > >
> > > On Thu, Oct 27, 2022 at 8:06 PM Jing Ge <j...@ververica.com> wrote:
> > >
> > > > Hi Dev,
> > > >
> > > > I'd like to start a discussion about removing FlinkKafkaConsumer and
> > > > FlinkKafkaProducer in 1.17.
> > > >
> > > > The removal was originally announced for Flink 1.15, after Flink 1.14
> > > > had been released [1]. It was then postponed to the next release, which
> > > > meant removing them with Flink 1.16, but the doc was not updated
> > > > accordingly [2]. I have created PRs to fix that. Since the 1.16 release
> > > > branch has reached code freeze, it makes sense to, first of all, update
> > > > the doc to say that FlinkKafkaConsumer will be removed with Flink 1.17
> > > > [3][4], and second, start the discussion about removing them from the
> > > > current master branch, i.e. for the coming 1.17 release. I'm all ears
> > > > and looking forward to your feedback. Thanks!
> > > >
> > > > Best regards,
> > > > Jing
> > > >
> > > > [1] https://nightlies.apache.org/flink/flink-docs-release-1.14/docs/connectors/datastream/kafka/#kafka-sourcefunction
> > > > [2] https://nightlies.apache.org/flink/flink-docs-release-1.15/docs/connectors/datastream/kafka/#kafka-sourcefunction
> > > > [3] https://github.com/apache/flink/pull/21172
> > > > [4] https://github.com/apache/flink/pull/21171