Hi Martijn,

Thanks for the comments! In general I agree we should avoid feature
sparsity.

In this particular case, connectors are a bit different from most other
features in Flink. AFAIK, we plan to move connectors (including Kafka and
Pulsar) out of the Flink project in the future, which means that the
development of these connectors will be mostly de-centralized (outside of
Flink) and up to their respective maintainers. While I agree that we
should provide API/infrastructure in Flink (as this FLIP does) to support
feature consistency across connectors, I am not sure we should own the
responsibility of actually updating all connectors to achieve feature
consistency, given the heavy burden and the fact that we don't plan to do
this work inside Flink anyway.

That being said, I am happy to follow the community guideline if we
decide that connector-related FLIPs should update every connector's API to
ensure feature consistency (to a reasonable extent). For example, in this
particular case, it looks like the EOF-detection feature can be applied to
every connector (including bounded sources). Would it still be sufficient
to just update Kafka, Pulsar and Kinesis?
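
To make the EOF-detection idea concrete: the sketch below shows, in a
connector-agnostic way, how a predicate over de-serialized records could
drive a graceful stop. Note this is just an illustration of the concept;
the interface and method names (RecordEvaluator, isEndOfStream,
readUntilEof) are my own placeholders, not the actual FLIP-208 API.

```java
import java.util.ArrayList;
import java.util.List;

public class EofSketch {

    // Hypothetical predicate over de-serialized records; a source would
    // stop reading a split once this returns true. (Illustrative name,
    // not the actual FLIP-208 interface.)
    interface RecordEvaluator<T> {
        boolean isEndOfStream(T record);
    }

    // Consume records until the evaluator signals EOF; the EOF record
    // itself is not emitted downstream.
    static <T> List<T> readUntilEof(List<T> records, RecordEvaluator<T> evaluator) {
        List<T> consumed = new ArrayList<>();
        for (T record : records) {
            if (evaluator.isEndOfStream(record)) {
                break;
            }
            consumed.add(record);
        }
        return consumed;
    }

    public static void main(String[] args) {
        // Stop when a sentinel record is observed in the stream.
        RecordEvaluator<String> stopOnSentinel = record -> record.equals("END");
        List<String> out = readUntilEof(List.of("a", "b", "END", "c"), stopOnSentinel);
        System.out.println(out); // prints [a, b]
    }
}
```

Since the predicate only sees de-serialized records, the same mechanism
would apply equally to Kafka, Pulsar, Kinesis, or a bounded source.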

Thanks,
Dong

On Tue, Jan 4, 2022 at 3:31 PM Martijn Visser <mart...@ververica.com> wrote:

> Hi Dong,
>
> Thanks for writing the FLIP. It focuses only on the KafkaSource, but I
> would expect that if such a functionality is desired, it should be made
> available for all unbounded sources (for example, Pulsar and Kinesis). If
> it's only available for Kafka, I see it as if we're increasing feature
> sparsity while we actually want to decrease that. What do you think?
>
> Best regards,
>
> Martijn
>
> On Tue, 4 Jan 2022 at 08:04, Dong Lin <lindon...@gmail.com> wrote:
>
> > Hi all,
> >
> > We created FLIP-208: Update KafkaSource to detect EOF based on
> > de-serialized records. Please find the FLIP wiki here:
> >
> > https://cwiki.apache.org/confluence/display/FLINK/FLIP-208%3A+Update+KafkaSource+to+detect+EOF+based+on+de-serialized+records
> >
> > This FLIP aims to address the use-case where users need to stop a Flink
> > job gracefully based on the content of de-serialized records observed in
> > the KafkaSource. This feature is needed by users who currently depend on
> > KafkaDeserializationSchema::isEndOfStream() to migrate their Flink jobs
> > from FlinkKafkaConsumer to KafkaSource.
> >
> > Could you help review this FLIP when you get time? Your comments are
> > appreciated!
> >
> > Cheers,
> > Dong
> >
>