[ https://issues.apache.org/jira/browse/KAFKA-18570?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17921968#comment-17921968 ]
Mehari Beyene commented on KAFKA-18570:
---------------------------------------

Canceled [KIP-1126|https://cwiki.apache.org/confluence/display/KAFKA/KIP-1126+Add+progress+metric+for+log+loading+during+Kafka+broker+startup]

PR to update documentation: [https://github.com/apache/kafka/pull/18731]

> Add progress metric for log loading during Kafka broker startup
> ----------------------------------------------------------------
>
>                 Key: KAFKA-18570
>                 URL: https://issues.apache.org/jira/browse/KAFKA-18570
>             Project: Kafka
>          Issue Type: Improvement
>    Affects Versions: 4.0.0
>            Reporter: Mehari Beyene
>            Assignee: Mehari Beyene
>            Priority: Major
>
> When a Kafka broker process starts up, it restores the broker's state from
> the segment files stored on disk and from the auxiliary checkpoint files
> used to store the broker's state.
> After a clean shutdown, all state has been persisted to the local disk, so
> restoring the broker's state is relatively quick (estimated under 10 minutes
> for a partition count of 4000).
> However, after an unclean shutdown, log loading also involves recovering the
> broker state by replaying messages to reconstruct the last known safe state
> of the broker. This recovery can take a very long time: anecdotally, we have
> seen recoveries that took more than two hours.
> Log recovery is triggered as part of log loading. During this recovery there
> is no metric that indicates progress, leaving both Kafka cluster
> administrators and customers blind to the state of the recovery. Without a
> metric that operators can use to estimate the ETA, planning and managing
> expectations is difficult.
> The exit criterion for this issue is to add a metric that shows the progress
> of log loading when a broker starts up.
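
For illustration only, here is a minimal sketch of the arithmetic that a progress metric of the kind requested above could expose: percent complete and a rough ETA. It is not Kafka code. The class LogLoadingProgress and all of its method names are hypothetical, and the ETA assumes every log costs about the same to load, which is unlikely to hold after an unclean shutdown where recovery time varies by segment.

```python
import time
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class LogLoadingProgress:
    """Hypothetical progress tracker; not part of Kafka."""

    total_logs: int
    loaded_logs: int = 0
    # Monotonic clock so wall-clock adjustments do not skew the ETA.
    started_at: float = field(default_factory=time.monotonic)

    def mark_loaded(self, count: int = 1) -> None:
        # Clamp so the counter can never exceed the total.
        self.loaded_logs = min(self.total_logs, self.loaded_logs + count)

    def remaining(self) -> int:
        return self.total_logs - self.loaded_logs

    def percent_complete(self) -> float:
        # A broker with no logs has nothing to load.
        if self.total_logs == 0:
            return 100.0
        return 100.0 * self.loaded_logs / self.total_logs

    def eta_seconds(self, now: Optional[float] = None) -> Optional[float]:
        # No estimate is possible until at least one log has loaded.
        if self.loaded_logs == 0:
            return None
        now = time.monotonic() if now is None else now
        elapsed = now - self.started_at
        # Linear extrapolation: average time per loaded log times logs left.
        return elapsed * self.remaining() / self.loaded_logs
```

With 4000 logs in total, 1000 of them loaded after 60 seconds gives 25 percent complete and an estimated 180 seconds remaining. A real metric would more likely report remaining work as a gauge and leave the ETA calculation to the monitoring system.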
--
This message was sent by Atlassian Jira
(v8.20.10#820010)