Hi Anirban,

> But this happened only once and now it is not getting reproduced at all.

This does make it sound a lot like
https://issues.apache.org/jira/browse/FLINK-31632.

> 1. What is the default watermarking strategy used in Flink. Can I quickly
check the default parameters being used by calling some function or so ?

Are you using the DataStream API and using the KafkaSource (and not the
older FlinkKafkaConsumer connector)? If that's the case, I believe you
always have to set a watermark strategy when adding the KafkaSource to the
job topology, so there isn't a default.

If you're using SQL, the strategy would be noWatermarks if not specified,
which shouldn't let you bump into the bug.

> 2. Are there any other conditions in which a source can be marked as Idle
apart from the watermarking issue mentioned below ?

That bug is the only known cause of incorrect permanent idleness that I'm
aware of.

> 3. If a Flink source is marked as Idle, is there any way to make it
active without having to re-submit the Flink Job ?

With the currently supported watermark strategies, the only way for a
source subtask to resume activeness is when it reports a new watermark
(e.g. when new data is produced).

Best,
Gordon

On Fri, Jun 16, 2023 at 7:10 AM Anirban Dutta Gupta <
anir...@indicussoftware.com> wrote:

> Hello All,
>
> Sorry to be replying to an existing thread for my question. Actually we
> are also facing the issue of the Flink Kafka source stopping consuming
> messages completely.
> It only started consuming messages after we re-submitted the Job. But this
> happened only once and now it is not getting reproduced at all.
> We are not using any watermarking strategy in specific.
>
> I have a few questions:
> 1. What is the default watermarking strategy used in Flink. Can I quickly
> check the default parameters being used by calling some function or so ?
> 2. Are there any other conditions in which a source can be marked as Idle
> apart from the watermarking issue mentioned below ?
> 3. If a Flink source is marked as Idle, is there any way to make it active
> without having to re-submit the Flink Job ?
>     Or Is it that the source automatically becomes active after a certain
> duration ?
>
> Many thanks in advance,
> Anirban
>
> On 16-06-2023 02:27, Ken Krugler wrote:
>
> I think you’re hitting this issue:
>
> https://issues.apache.org/jira/browse/FLINK-31632
>
> Fixed in 1.16.2, 1.171.
>
> — Ken
>
>
> On Jun 15, 2023, at 1:39 PM, Piotr Domagalski <pi...@domagalski.com>
> wrote:
>
> Hi all!
>
> We've been experimenting with watermark alignment in Flink 1.15 and
> observed an odd behaviour that I couldn't find any mention of in the
> documentation.
>
> With the following strategy:
>
> WatermarkStrategy.<Event>forBoundedOutOfOrderness(Duration.ofSeconds(60))
> .withTimestampAssigner((e, t) -> e.timestamp) .withIdleness(Duration.
> ofSeconds(3600)) .withWatermarkAlignment("group-1", Duration.ofSeconds(15
> ));
>
> Kafka sources stop consuming completely after 3600s (even when the data is
> flowing into all the partitions). Is this an expected behaviour? Where
> could I find more information on this?
>
> --
> Piotr Domagalski
>
>
> --------------------------
> Ken Krugler
> http://www.scaleunlimited.com
> Custom big data solutions
> Flink, Pinot, Solr, Elasticsearch
>
>
>
>
>

Reply via email to