[ https://issues.apache.org/jira/browse/KAFKA-6468?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Kyle Ambroff-Kao resolved KAFKA-6468.
-------------------------------------
    Resolution: Fixed

> Replication high watermark checkpoint file read for every LeaderAndIsrRequest
> -----------------------------------------------------------------------------
>
>                 Key: KAFKA-6468
>                 URL: https://issues.apache.org/jira/browse/KAFKA-6468
>             Project: Kafka
>          Issue Type: Bug
>            Reporter: Kyle Ambroff-Kao
>            Assignee: Kyle Ambroff-Kao
>            Priority: Major
>
> The high watermark for each partition in a given log directory is written to
> disk every _replica.high.watermark.checkpoint.interval.ms_ milliseconds. This
> checkpoint file is used to create replicas when joining the cluster.
> [https://github.com/apache/kafka/blob/b73c765d7e172de4742a3aa023d5a0a4b7387247/core/src/main/scala/kafka/cluster/Partition.scala#L180]
> Unfortunately this file is read every time
> kafka.cluster.Partition#getOrCreateReplica is invoked. For most clusters this
> isn't a big deal, but for a small cluster with lots of partitions all of the
> reads of this file really add up.
> On my local test cluster of three brokers with around 40k partitions, the
> initial LeaderAndIsrRequest refers to every partition in the cluster, and it
> can take 20 to 30 minutes to create all of the replicas because the
> _replication-offset-checkpoint_ is nearly 2MB.
> Changing this code so that we only read this file once on startup reduces the
> time to create all replicas to around one minute.
> Credit to [~onurkaraman] for finding this one.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
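
For readers who want to see the shape of the fix described above, here is a minimal Python sketch of the idea: parse the checkpoint file once, then answer every per-partition lookup from memory. It is not the actual Kafka patch (the broker code is Scala). The names `read_checkpoint`, `write_checkpoint`, and `CachedCheckpoints` are hypothetical, and the file layout used here (a version line, an entry-count line, then `topic partition offset` lines) and the default of 0 for a missing partition are assumptions made for illustration.

```python
import os
import tempfile


def write_checkpoint(path, entries):
    """Write {(topic, partition): offset} in the assumed checkpoint layout."""
    with open(path, "w") as f:
        f.write("0\n")                  # assumed format version line
        f.write(f"{len(entries)}\n")    # assumed entry-count line
        for (topic, partition), offset in entries.items():
            f.write(f"{topic} {partition} {offset}\n")


def read_checkpoint(path):
    """Parse the whole checkpoint file into {(topic, partition): offset}."""
    with open(path) as f:
        lines = [line.strip() for line in f if line.strip()]
    expected = int(lines[1])
    entries = {}
    for line in lines[2:]:
        topic, partition, offset = line.split()
        entries[(topic, int(partition))] = int(offset)
    if len(entries) != expected:
        raise ValueError(f"expected {expected} entries, found {len(entries)}")
    return entries


class CachedCheckpoints:
    """Reads the checkpoint file at most once; later lookups hit memory."""

    def __init__(self, path):
        self._path = path
        self._entries = None
        self.reads = 0  # counts full-file reads, for demonstration only

    def high_watermark(self, topic, partition):
        if self._entries is None:
            self._entries = read_checkpoint(self._path)
            self.reads += 1
        # Assumption: a partition absent from the file starts at offset 0.
        return self._entries.get((topic, partition), 0)


if __name__ == "__main__":
    path = os.path.join(tempfile.mkdtemp(), "replication-offset-checkpoint")
    write_checkpoint(path, {("events", p): p * 10 for p in range(1000)})
    cache = CachedCheckpoints(path)
    for p in range(1000):
        cache.high_watermark("events", p)
    print("file reads:", cache.reads)
```

Without the cache, each of the N lookups would re-parse a file with N entries, so the work grows roughly quadratically with the partition count. That is consistent with the 20 to 30 minute figure reported for about 40k partitions.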