Hi,

We have a recurring problem, and I wonder if there is a better way to solve
it. Currently, when we restart our backend Kafka services and then our
datastreams app, the app is unable to reach many of the Kafka topic state
stores. We retry, but more often than not it takes a restart of the app to
clear the problem.
I suspect this happens because a partition leader has not yet been elected
when the app starts. Two questions:
1. Is there a good way to know when a partition leader has been elected, so
that our restart script can wait for it?
2. Is it possible to have the app deterministically wait/retry until the
stores are ready? As mentioned, we have retry logic for up to 200 tries with
a few seconds of sleep in between, but in some cases the retries are
exhausted and we have to restart the app.
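
To make question 2 concrete, here is a minimal sketch of the kind of wait I
have in mind: retry against a wall-clock deadline instead of a fixed number
of tries. This is not our actual code. The names StoreWaiter and awaitReady
are hypothetical, and the probe is a placeholder for whatever call fails
while the stores are unavailable.

```java
import java.time.Duration;
import java.util.concurrent.Callable;
import java.util.concurrent.TimeoutException;

// Hypothetical helper (not part of Kafka): retries a probe until it
// succeeds or a deadline passes.
final class StoreWaiter {
    private StoreWaiter() {}

    // Calls probe repeatedly, sleeping `pause` between attempts, and returns
    // its first successful result. Throws TimeoutException, with the last
    // failure as its cause, once `timeout` has elapsed.
    static <T> T awaitReady(Callable<T> probe, Duration timeout, Duration pause)
            throws Exception {
        long deadline = System.nanoTime() + timeout.toNanos();
        Exception last;
        while (true) {
            try {
                return probe.call();
            } catch (InterruptedException e) {
                // Preserve the interrupt flag and stop waiting.
                Thread.currentThread().interrupt();
                throw e;
            } catch (Exception e) {
                // Treated as "not ready yet"; remembered for diagnostics.
                last = e;
            }
            if (System.nanoTime() - deadline >= 0) {
                TimeoutException te =
                        new TimeoutException("not ready after " + timeout);
                te.initCause(last);
                throw te;
            }
            Thread.sleep(pause.toMillis());
        }
    }
}
```

Assuming the app uses the Kafka Streams API, the probe could be a call to
KafkaStreams#store(...), which throws InvalidStateStoreException while a
store is not yet queryable. For question 1, I assume the restart script
could run a similar loop around AdminClient#describeTopics and check that
TopicPartitionInfo#leader() is set for every partition, but I have not
verified that this is the recommended approach.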
Thanks for any assistance.