Hi,

 

We’ve tried to restart two different jobs from a savepoint:

  1. FlinkKafkaProducer -> KafkaSink, with a new UID on the sink and the --allowNonRestoredState flag to reset the state of the sink operator.
  2. KafkaSink -> KafkaSink, with a new UID on the sink and the --allowNonRestoredState flag to reset the state of the sink operator.

 

In both cases we changed the UID of the Kafka sink to make sure that its state is reset. We restarted from a savepoint, rather than starting fresh, to keep the source operator state (no data duplication or loss is allowed).

 

The problem is that in both cases the job could no longer checkpoint. Every checkpoint failed after the configured timeout (3 minutes in our case). Before the restart, checkpoints normally took under 1 second. I tried increasing the timeout, but it made no difference, and the Kafka sink was clearly the cause.

 

I have seen many log lines like these (not sure whether they are related to the issue):

2024-09-03 08:12:09,550 INFO  org.apache.kafka.clients.producer.internals.TransactionManager [] - [Producer clientId=producer-fk8s-480dcb71187e8ab619944412e95cb04e22388b17-20211116144116-0-3187, transactionalId=fk8s-480dcb71187e8ab619944412e95cb04e22388b17-20211116144116-0-3187] Invoking InitProducerId for the first time in order to acquire a producer ID

 

2024-09-03 08:12:09,552 INFO  org.apache.kafka.clients.Metadata                            [] - [Producer clientId=producer-fk8s-480dcb71187e8ab619944412e95cb04e22388b17-20211116144116-0-3187, transactionalId=fk8s-480dcb71187e8ab619944412e95cb04e22388b17-20211116144116-0-3187] Cluster ID: LtOP7cS0SOis0BcZNqaPJA

 

2024-09-03 08:12:09,552 INFO  org.apache.kafka.clients.producer.internals.TransactionManager [] - [Producer clientId=producer-fk8s-480dcb71187e8ab619944412e95cb04e22388b17-20211116144116-0-3187, transactionalId=fk8s-480dcb71187e8ab619944412e95cb04e22388b17-20211116144116-0-3187] Discovered transaction coordinator ec2-63-32-61-53.eu-west-1.compute.amazonaws.com:9092 (id: 1010 rack: null)
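
To see how many distinct transactional IDs these messages cover, a small script along the following lines could tally them from the TaskManager log. This is only a sketch: the function name, the regular expression, and the assumption that each ID has the shape <prefix>-<subtask>-<offset> are guesses based on the lines above, not something confirmed from the connector.

```python
import re
from collections import Counter

# ASSUMPTION: transactional IDs look like <prefix>-<subtask>-<offset>,
# as the sample log lines suggest. This is not confirmed from the connector.
TXN_ID = re.compile(
    r"transactionalId=(?P<prefix>\S+)-(?P<subtask>\d+)-(?P<offset>\d+)\]"
)


def tally_init_producer_id(lines):
    """Count 'Invoking InitProducerId' messages per (subtask, offset) pair."""
    counts = Counter()
    for line in lines:
        if "Invoking InitProducerId" not in line:
            continue
        match = TXN_ID.search(line)
        if match:
            key = (int(match["subtask"]), int(match["offset"]))
            counts[key] += 1
    return counts


# One of the log lines quoted above, used as sample input.
sample = [
    "2024-09-03 08:12:09,550 INFO  "
    "org.apache.kafka.clients.producer.internals.TransactionManager [] - "
    "[Producer clientId=producer-fk8s-480dcb71187e8ab619944412e95cb04e22388b17"
    "-20211116144116-0-3187, "
    "transactionalId=fk8s-480dcb71187e8ab619944412e95cb04e22388b17"
    "-20211116144116-0-3187] "
    "Invoking InitProducerId for the first time in order to acquire a producer ID",
]
result = tally_init_producer_id(sample)
```

Run against a full log file (for example by passing an open file object instead of `sample`), a large number of distinct keys within a single checkpoint interval would suggest that the sink is initialising many producers, which may or may not be related to the timeouts.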

 

Kafka server version - kafka_2.12-2.6.0

Flink Kafka connector version - 3.1.0-1.18

Kafka client version - org.apache.kafka:kafka-clients:jar:3.4.0

 

 

Cheers,
Vadim.
