[ https://issues.apache.org/jira/browse/KAFKA-3892?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15345095#comment-15345095 ]
ASF GitHub Bot commented on KAFKA-3892:
---------------------------------------

GitHub user iamnoah opened a pull request:

    https://github.com/apache/kafka/pull/1541

    KAFKA-3892 prune metadata response to subscribed topics

    I believe this will cause clients to defensively prune their cluster metadata in all cases. It doesn't address why a client without a Pattern subscription would receive a response containing all topics and partitions for the cluster (which is still undesirable, but I am guessing it would require a fix in the broker).

    In my own testing, this brought the required heap back down to the levels we saw with the 0.8 consumer.

    I am concerned that I do not fully understand all the uses of this class. My assumption is that only topics that have been added are expected in the response, and that the two unit test modifications I needed to make were oversights. I am also assuming that this behavior was applied only to the pattern matching case to avoid a small amount of (presumed) unnecessary work, and not for correctness reasons.
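For readers who want a concrete picture of the pruning idea described above, here is a minimal sketch. It is not the code from the pull request and does not use the Kafka client classes; `MetadataPruner`, `pruneToSubscribed`, and the topic names are hypothetical, and the metadata response is modeled as a plain map from topic name to partition ids purely for illustration.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical helper for illustration only; not part of the Kafka client API.
final class MetadataPruner {
    private MetadataPruner() {}

    // Returns a read-only copy of the topic -> partition-id map that keeps
    // only the topics the client has actually subscribed to or produces to.
    static Map<String, List<Integer>> pruneToSubscribed(
            Map<String, List<Integer>> partitionsByTopic, Set<String> subscribed) {
        Map<String, List<Integer>> pruned = new HashMap<>();
        for (String topic : subscribed) {
            List<Integer> partitions = partitionsByTopic.get(topic);
            if (partitions != null) {
                // Topic is both subscribed and present in the response: keep it.
                pruned.put(topic, partitions);
            }
        }
        return Collections.unmodifiableMap(pruned);
    }

    public static void main(String[] args) {
        // Assumed example data: a response describing more topics than the client uses.
        Map<String, List<Integer>> response = new HashMap<>();
        response.put("orders", Arrays.asList(0, 1, 2));
        response.put("audit", Arrays.asList(0));
        response.put("metrics", Arrays.asList(0, 1));

        Set<String> subscribed = new HashSet<>(Arrays.asList("orders"));

        // Only the "orders" entry is retained; the other topics are dropped.
        System.out.println(pruneToSubscribed(response, subscribed).keySet());
    }
}
```

Iterating over the subscription set rather than the full response keeps the work proportional to the number of subscribed topics, which matters when the cluster has thousands of topics.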
You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/spredfast/kafka-1 remove-extra-metadata

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/kafka/pull/1541.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #1541

----
commit cb19feac9c1473e8406fd10a895a41468373ddae
Author: Noah Sloan <nsl...@spredfast.com>
Date:   2016-06-22T20:10:35Z

    KAFKA-3892 prune metadata response to subscribed topics

----

> Clients retain metadata for non-subscribed topics
> -------------------------------------------------
>
>                 Key: KAFKA-3892
>                 URL: https://issues.apache.org/jira/browse/KAFKA-3892
>             Project: Kafka
>          Issue Type: Bug
>          Components: clients
>   Affects Versions: 0.9.0.1
>            Reporter: Noah Sloan
>
> After upgrading to 0.9.0.1 from 0.8.2 (and adopting the new consumer and
> producer classes), we noticed services with small heaps crashing due to
> OutOfMemoryErrors. These services contained many producers and consumers (~20
> total) and were connected to brokers with >2000 topics and over 10k
> partitions. Heap dumps revealed that each client had 3.3MB of Metadata
> retained in its Cluster, with references to topics that were not being
> produced to or subscribed to. While the services were running with 128MB of heap
> prior to the upgrade, we had to increase max heap to 200MB to accommodate
> all the extra data.
> While this is not technically a memory leak, it does impose a significant
> overhead on clients when connected to a large cluster.

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)