Shuyi Chen created FLINK-37366:
----------------------------------

             Summary: Allow configurable retry for Kafka topic metadata fetch
                 Key: FLINK-37366
                 URL: https://issues.apache.org/jira/browse/FLINK-37366
             Project: Flink
          Issue Type: Improvement
          Components: Connectors / Kafka
            Reporter: Shuyi Chen


For high availability, we adopted a multi-primary Kafka cluster setup, so the
data of a Kafka topic is spread across multiple physical clusters. If one Kafka
cluster fails, the Flink pipeline should continue to run without failing.
Currently, the pipeline fails in that case because
SubscriberUtils.getTopicMetadata() throws a RuntimeException, which causes the
pipeline to keep restarting. We propose adding a configurable retry policy to
SubscriberUtils.getTopicMetadata(), so that the Flink Kafka connector can be
configured to tolerate a Kafka failure for a longer period without restarting.
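
As a rough illustration of what such a retry policy might look like, the
sketch below wraps an arbitrary metadata fetch in a bounded retry loop with
exponential backoff. The class MetadataRetry, the method withRetry, and its
parameters are hypothetical names chosen for this example; they are not part
of the Flink Kafka connector, and the Supplier merely stands in for the call
to SubscriberUtils.getTopicMetadata(), whose actual signature is not
reproduced here.

```java
import java.util.function.Supplier;

/** Hypothetical helper for illustration; not part of the Flink Kafka connector. */
final class MetadataRetry {
    // Assumed upper bound on the delay between two attempts.
    private static final long MAX_BACKOFF_MILLIS = 60_000L;

    private MetadataRetry() {}

    /**
     * Calls fetch up to maxAttempts times, sleeping between failed attempts.
     * The delay starts at initialBackoffMillis and doubles after each failure.
     * If every attempt fails, the last RuntimeException is rethrown.
     */
    static <T> T withRetry(Supplier<T> fetch, int maxAttempts, long initialBackoffMillis) {
        if (maxAttempts < 1) {
            throw new IllegalArgumentException("maxAttempts must be >= 1");
        }
        if (initialBackoffMillis < 0) {
            throw new IllegalArgumentException("initialBackoffMillis must be >= 0");
        }
        long backoff = initialBackoffMillis;
        RuntimeException last = null;
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            try {
                return fetch.get();
            } catch (RuntimeException e) {
                last = e;
                if (attempt == maxAttempts) {
                    break; // no retries left, fall through and rethrow
                }
                try {
                    Thread.sleep(backoff);
                } catch (InterruptedException ie) {
                    // Preserve the interrupt flag so the caller can shut down cleanly.
                    Thread.currentThread().interrupt();
                    throw new RuntimeException("Interrupted while retrying metadata fetch", ie);
                }
                backoff = Math.min(backoff * 2, MAX_BACKOFF_MILLIS);
            }
        }
        throw last;
    }
}
```

In a real change, the attempt count and backoff would presumably come from
connector configuration rather than method arguments, and the policy might
retry only on failures known to be transient instead of on every
RuntimeException.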



--
This message was sent by Atlassian Jira
(v8.20.10#820010)