Hi,

I am looking for a reliable, production-safe strategy to avoid losing unread 
messages when a Kafka broker remains down longer than the topic's configured 
retention.ms.

Since Kafka's time-based retention deletes a segment based purely on record 
timestamps, if a broker is down for (for example) 24 hours and the topic's 
retention.ms is also 24 hours, the broker may start deleting segments almost 
immediately after it restarts, as soon as the retention check runs, even if 
no consumers have read those messages yet.
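
For reference, these are the two settings I understand to be involved 
(values illustrative):

    retention.ms: 86400000                    # topic-level; 24 hours in this example
    log.retention.check.interval.ms: 300000   # broker-level; retention is checked every 5 minutes by default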

Is there a recommended way to prevent message loss in this scenario?

I am running Kafka on Kubernetes using Strimzi, so all topic configurations are 
managed through KafkaTopic CRDs and the Topic Operator.
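
For context, a topic in my setup looks roughly like this (names are 
placeholders):

    apiVersion: kafka.strimzi.io/v1beta2
    kind: KafkaTopic
    metadata:
      name: my-topic                     # placeholder topic name
      labels:
        strimzi.io/cluster: my-cluster   # placeholder cluster name
    spec:
      partitions: 3
      replicas: 3
      config:
        retention.ms: 86400000           # the 24-hour retention in question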

One solution could be to alter the topic's retention configuration. But for 
that to work, I would need to ensure the change is applied before Kafka 
deletes the log segments. Could something be done during startup?
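
For instance, I could bump the retention ahead of the restart with the 
standard CLI, something like this (bootstrap address and topic name are 
placeholders):

    bin/kafka-configs.sh --bootstrap-server my-cluster-kafka-bootstrap:9092 \
      --entity-type topics --entity-name my-topic \
      --alter --add-config retention.ms=604800000

Though as far as I understand, the Topic Operator may revert changes made 
outside the KafkaTopic CR, so the change would probably have to go through 
the CR anyway.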

For example, with a 3-broker cluster, I could pause the rollout after the 
first pod comes up, before the brokers fully start, update the retention 
values in the Strimzi Kafka CR, and then let the operator complete the 
rollout so the cluster restarts with the new retention. Is this safe, or is 
there a better recommended approach to ensure that unread messages are 
preserved after long broker downtime?
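
Concretely, for the Kafka CR step I was thinking of temporarily raising the 
broker-level default, something like this (though as far as I know this only 
applies to topics that do not set their own retention.ms):

    spec:
      kafka:
        config:
          log.retention.ms: 604800000   # temporarily raised to 7 days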

Regards,
Prateek Kohli
