[ https://issues.apache.org/jira/browse/KAFKA-3300?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15171293#comment-15171293 ]
ASF GitHub Bot commented on KAFKA-3300:
---------------------------------------

GitHub user becketqin opened a pull request:

    https://github.com/apache/kafka/pull/983

    KAFKA-3300: Avoid over allocating disk space and memory for index files.

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/becketqin/kafka KAFKA-3300

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/kafka/pull/983.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #983

----
commit b49a9af4c19513e458ced92ef49504f7a1c237df
Author: Jiangjie Qin <becket....@gmail.com>
Date:   2016-02-29T01:39:18Z

    KAFKA-3300: Avoid over allocating disk space and memory for index files.

----

> Calculate the initial/max size of offset index files and reduce the memory footprint for memory mapped index files.
> -------------------------------------------------------------------------------------------------------------------
>
>                 Key: KAFKA-3300
>                 URL: https://issues.apache.org/jira/browse/KAFKA-3300
>             Project: Kafka
>          Issue Type: Improvement
>   Affects Versions: 0.9.0.1
>            Reporter: Jiangjie Qin
>            Assignee: Jiangjie Qin
>             Fix For: 0.10.0.0
>
>
> Currently the initial/max size of the offset index file is configured by
> {{log.index.max.bytes}}. This is the size of the offset index file for the
> active log segment until the segment rolls.
> Theoretically, we can calculate the upper bound of the offset index size
> using the following formula:
> {noformat}
> log.segment.bytes / index.interval.bytes * 8
> {noformat}
> With the default settings, the bytes needed for an offset index are
> 1GB / 4K * 8 = 2MB, while the default log.index.max.bytes is 10MB.
> This means we are over-allocating at least 8MB on disk and mapping it to
> memory.
> We can probably do the following:
> 1. When creating a new offset index, calculate the size using the above
> formula.
> 2. If the result in (1) is greater than log.index.max.bytes, allocate
> log.index.max.bytes instead.
> This should save a significant amount of memory if a broker has a lot of
> partitions on it.

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
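
For reference, the sizing rule proposed in the issue description (compute the upper bound from the formula, then cap it at the configured maximum) can be sketched as below. This is only an illustration of the arithmetic, not the code from the pull request: the class name OffsetIndexSizing and the methods upperBoundBytes and initialIndexBytes are hypothetical, and the 8 bytes per entry is taken from the formula quoted above.

```java
// Illustrative sketch only; the names here are hypothetical and are not Kafka APIs.
final class OffsetIndexSizing {
    // Bytes per offset index entry, as used in the formula from the issue description.
    static final long ENTRY_BYTES = 8L;

    // Upper bound of the index size: log.segment.bytes / index.interval.bytes * 8.
    static long upperBoundBytes(long segmentBytes, long indexIntervalBytes) {
        if (segmentBytes < 0 || indexIntervalBytes <= 0) {
            throw new IllegalArgumentException("segmentBytes must be >= 0 and indexIntervalBytes > 0");
        }
        return segmentBytes / indexIntervalBytes * ENTRY_BYTES;
    }

    // Steps 1 and 2 of the proposal: use the calculated size unless it
    // exceeds the configured maximum, in which case use the maximum.
    static long initialIndexBytes(long segmentBytes, long indexIntervalBytes, long maxIndexBytes) {
        return Math.min(upperBoundBytes(segmentBytes, indexIntervalBytes), maxIndexBytes);
    }

    public static void main(String[] args) {
        long segmentBytes = 1024L * 1024 * 1024;   // 1GB segment, as in the description
        long indexIntervalBytes = 4096L;           // 4K index interval
        long maxIndexBytes = 10L * 1024 * 1024;    // 10MB configured maximum

        long size = initialIndexBytes(segmentBytes, indexIntervalBytes, maxIndexBytes);
        System.out.println("index bytes to allocate: " + size);                   // 2097152 (2MB)
        System.out.println("bytes saved vs. maximum: " + (maxIndexBytes - size)); // 8388608 (8MB)
    }
}
```

With the default values quoted in the description this gives 2MB per index instead of 10MB, which matches the 8MB of over-allocation mentioned there. The cap in initialIndexBytes only takes effect when the interval is small relative to the segment size.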