Hi,

Thanks for reporting the issue, and thanks to Christian for the demo!

I traced the code and think it's a bug in KafkaConsumer (see KAFKA-13563 [1]). 
We probably need to bump the Kafka client to 3.1 to fix it, but we should check 
for compatibility issues first because the bump crosses a major Kafka version 
(2.x -> 3.x). 

[1] https://issues.apache.org/jira/browse/KAFKA-13563
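
For illustration only, here is a toy model of the symptom reported in this 
thread: commits keep failing on every checkpoint after a broker outage and 
only a restart helps. It is not Kafka's code. The class 
StaleCoordinatorLookupDemo and everything inside it are hypothetical, and the 
assumption that the client keeps reusing a failed coordinator lookup is one 
reading of the ticket, not something verified here.

```java
public class StaleCoordinatorLookupDemo {

    /** Hypothetical stand-in for the cluster; only tracks coordinator availability. */
    static final class ToyCluster {
        boolean coordinatorUp = true;
    }

    /** Hypothetical client; NOT the real KafkaConsumer. */
    static final class ToyClient {
        private final ToyCluster cluster;
        private final boolean clearFailedLookup;
        private boolean coordinatorKnown = false;
        private Boolean lookupResult = null; // null means "no lookup result cached"

        ToyClient(ToyCluster cluster, boolean clearFailedLookup) {
            this.cluster = cluster;
            this.clearFailedLookup = clearFailedLookup;
        }

        /** Tries to commit offsets; returns true on success. */
        boolean commit() {
            if (coordinatorKnown && !cluster.coordinatorUp) {
                coordinatorKnown = false; // connection to the coordinator was lost
            }
            if (!coordinatorKnown) {
                if (lookupResult == null) {
                    lookupResult = cluster.coordinatorUp; // perform a fresh lookup
                }
                boolean found = lookupResult;
                if (found || clearFailedLookup) {
                    lookupResult = null; // result consumed; next lookup starts fresh
                }
                if (!found) {
                    return false; // corresponds to "the coordinator is not available"
                }
                coordinatorKnown = true;
            }
            return true;
        }
    }

    /** Commits once before, once during, and twice after a coordinator outage. */
    public static String simulate(boolean clearFailedLookup) {
        ToyCluster cluster = new ToyCluster();
        ToyClient client = new ToyClient(cluster, clearFailedLookup);
        StringBuilder out = new StringBuilder();
        out.append(client.commit() ? "ok" : "fail");
        cluster.coordinatorUp = false; // broker goes down for maintenance
        out.append(',').append(client.commit() ? "ok" : "fail");
        cluster.coordinatorUp = true;  // broker is back
        out.append(',').append(client.commit() ? "ok" : "fail");
        out.append(',').append(client.commit() ? "ok" : "fail");
        return out.toString();
    }

    public static void main(String[] args) {
        System.out.println("stale lookup kept:    " + simulate(false));
        System.out.println("stale lookup cleared: " + simulate(true));
    }
}
```

In this model, keeping the stale lookup result makes every commit after the 
outage fail until the client is recreated, which matches the observation that 
only a taskmanager restart helps. Clearing the failed lookup lets the next 
commit succeed.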

Best, 

Qingsheng

> On Jun 15, 2022, at 02:14, Martijn Visser <martijnvis...@apache.org> wrote:
> 
> Hi Christian,
> 
> There's another similar error reported by someone else. I've linked the 
> tickets together and asked one of the Kafka maintainers to have a look at 
> this.
> 
> Best regards,
> 
> Martijn
> 
> On Tue, Jun 14, 2022 at 17:16, Christian Lorenz 
> <christian.lor...@mapp.com> wrote:
> Hi Alexander,
> 
>  
> 
> I’ve created a Jira ticket here 
> https://issues.apache.org/jira/browse/FLINK-28060.
> 
> Unfortunately, this is causing some issues for us.
> 
> I hope the attached demo project helps to determine the root cause, as the 
> problem is reproducible in Flink 1.15.0 but not in Flink 1.14.4.
> 
>  
> 
> Kind regards,
> 
> Christian
> 
>  
> 
> From: Alexander Fedulov <alexan...@ververica.com>
> Date: Monday, June 13, 2022 at 23:42
> To: Christian Lorenz <christian.lor...@mapp.com>
> Cc: "user@flink.apache.org" <user@flink.apache.org>
> Subject: Re: Kafka Consumer commit error
> 
>  
> 
> Hi Christian,
> 
>  
> 
> > thanks for the reply. We use AT_LEAST_ONCE delivery semantics in this 
> > application. Do you think this might still be related?
> 
>  
> 
> No, in that case, Kafka transactions are not used, so it should not be 
> relevant.
> 
>  
> 
> Best,
> 
> Alexander Fedulov
> 
>  
> 
> On Mon, Jun 13, 2022 at 3:48 PM Christian Lorenz <christian.lor...@mapp.com> 
> wrote:
> 
> Hi Alexander,
> 
>  
> 
> thanks for the reply. We use AT_LEAST_ONCE delivery semantics in this 
> application. Do you think this might still be related?
> 
>  
> 
> Best regards,
> 
> Christian
> 
>  
> 
>  
> 
> From: Alexander Fedulov <alexan...@ververica.com>
> Date: Monday, June 13, 2022 at 13:06
> To: "user@flink.apache.org" <user@flink.apache.org>
> Cc: Christian Lorenz <christian.lor...@mapp.com>
> Subject: Re: Kafka Consumer commit error
> 
>  
> 
> Hi Christian,
> 
>  
> 
> you should check if the exceptions that you see after the broker is back from 
> maintenance are the same as the ones you posted here. If you are using 
> EXACTLY_ONCE, it could be that the later errors are caused by Kafka purging 
> transactions that Flink attempts to commit [1].
> 
>  
> 
> Best,
> 
> Alexander Fedulov
> 
> 
> [1] 
> https://nightlies.apache.org/flink/flink-docs-release-1.15/docs/connectors/datastream/kafka/#fault-tolerance
> 
>  
> 
> On Mon, Jun 13, 2022 at 12:04 PM Martijn Visser <martijnvis...@apache.org> 
> wrote:
> 
> Hi Christian,
> 
>  
> 
> I would expect these error messages to disappear automatically once the broker 
> is back up and has recovered completely. It should not require a restart (only 
> time). Flink doesn't rely on the offsets committed to Kafka for fault 
> tolerance; it restores from its own checkpoints. 
> 
>  
> 
> Best regards,
> 
>  
> 
> Martijn
> 
>  
> 
> On Wed, Jun 8, 2022 at 15:49, Christian Lorenz 
> <christian.lor...@mapp.com> wrote:
> 
> Hi,
> 
>  
> 
> we have some issues with a job using the flink-sql-connector-kafka (Flink 
> 1.15.0/standalone cluster). If, for example, one broker is restarted for 
> maintenance (replication-factor=2), the taskmanagers executing the job 
> constantly log errors on each checkpoint creation:
> 
>  
> 
> Failed to commit consumer offsets for checkpoint 50659
> 
> org.apache.flink.kafka.shaded.org.apache.kafka.clients.consumer.RetriableCommitFailedException:
>  Offset commit failed with a retriable exception. You should retry committing 
> the latest consumed offsets.
> 
> Caused by: 
> org.apache.flink.kafka.shaded.org.apache.kafka.common.errors.CoordinatorNotAvailableException:
>  The coordinator is not available.
> 
>  
> 
> AFAICT the error itself is produced by the underlying Kafka consumer. 
> Unfortunately, this error cannot be reproduced on our test system.
> 
> From my understanding, this error might occur once, but follow-up checkpoints 
> / Kafka commits should be fine again.
> 
> Currently my only way of “fixing” the issue is to restart the taskmanagers.
> 
>  
> 
> Is there perhaps a Kafka consumer setting that would help to circumvent 
> this?
> 
>  
> 
> Kind regards,
> 
> Christian
> 
> Mapp Digital Germany GmbH with registered offices at Dachauer, Str. 63, 80335 
> München.
> Registered with the District Court München HRB 226181
> Managing Directors: Frasier, Christopher & Warren, Steve
> 
> This e-mail is from Mapp Digital and its international legal entities and may 
> contain information that is confidential or proprietary.
> If you are not the intended recipient, do not read, copy or distribute the 
> e-mail or any attachments. Instead, please notify the sender and delete the 
> e-mail and any attachments.
> Please consider the environment before printing. Thank you.
> 
> 
