[ https://issues.apache.org/jira/browse/KAFKA-3894?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15412430#comment-15412430 ]
Elias Dorneles edited comment on KAFKA-3894 at 8/8/16 8:45 PM:
---------------------------------------------------------------

I've bumped into this same issue (log cleaner threads dying because messages wouldn't fit in the offset map).

For some of the topics the messages would almost fit, so I was able to get away with just increasing the dedupe buffer load factor (https://github.com/apache/kafka/blob/trunk/core/src/main/scala/kafka/server/KafkaConfig.scala#L252), which defaults to 90% of the 2GB max buffer size.

For other topics that had more messages and wouldn't fit in the 2GB in any way, the workaround was to (sketched below):

1) decrease the segment size config for that topic [1]
2) reassign the topic's partitions, in order to end up with new segments whose sizes obey the config change
3) rolling-restart the nodes, to restart the log cleaner threads

I'd love to know if there is another way of doing this; step 3 is particularly frustrating.

Good luck!

[1]: This can be done for a particular topic with `kafka-topics.sh --zookeeper $ZK --topic $TOPIC --alter --config segment.bytes=<size>`, but if needed you can also set `log.segment.bytes` to change the default for all topics across the cluster.
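To make the steps above concrete, here is a rough sketch as shell commands. The segment size, the reassignment file name, and the broker settings at the end are placeholder/example values of mine, not recommendations:

    # 1) shrink the segment size for the affected topic
    #    (104857600 = 100MB, an example value only)
    kafka-topics.sh --zookeeper $ZK --topic $TOPIC --alter \
        --config segment.bytes=104857600

    # 2) reassign the topic's partitions so that new, smaller segments
    #    get written; reassign.json is a placeholder (the same tool can
    #    produce one for you with its --generate option)
    kafka-reassign-partitions.sh --zookeeper $ZK \
        --reassignment-json-file reassign.json --execute

    # 3) rolling restart: stop/start one broker at a time so the dead
    #    log cleaner threads come back up

    # If the messages *almost* fit, tuning these in server.properties
    # may be enough (example values; the load factor is the knob from
    # the KafkaConfig link above):
    # log.cleaner.dedupe.buffer.size=2147483647
    # log.cleaner.io.buffer.load.factor=0.95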
> LogCleaner thread crashes if not even one segment can fit in the offset map
> ----------------------------------------------------------------------------
>
>                 Key: KAFKA-3894
>                 URL: https://issues.apache.org/jira/browse/KAFKA-3894
>             Project: Kafka
>          Issue Type: Bug
>          Components: core
>    Affects Versions: 0.8.2.2, 0.9.0.1, 0.10.0.0
>         Environment: Oracle JDK 8
>                      Ubuntu Precise
>            Reporter: Tim Carey-Smith
>              Labels: compaction
>             Fix For: 0.10.1.0
>
> The log-cleaner thread can crash if the number of keys in a topic grows to be too large to fit into the dedupe buffer.
> The result of this is a log line:
> {quote}
> broker=0 pri=ERROR t=kafka-log-cleaner-thread-0 at=LogCleaner [kafka-log-cleaner-thread-0], Error due to java.lang.IllegalArgumentException: requirement failed: 9750860 messages in segment MY_FAVORITE_TOPIC-2/00000000000047580165.log but offset map can fit only 5033164. You can increase log.cleaner.dedupe.buffer.size or decrease log.cleaner.threads
> {quote}
> As a result, the broker is left in a potentially dangerous situation where cleaning of compacted topics is not running.
> It is unclear if the broader strategy for the {{LogCleaner}} is the reason for this upper bound, or if this is a value which must be tuned for each specific use-case.
> Of more immediate concern is the fact that the thread crash is not visible via JMX or exposed as some form of service degradation.
> Some short-term remediations we have made are:
> * increasing the size of the dedupe buffer
> * monitoring the log-cleaner threads inside the JVM
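Regarding the last remediation bullet: since the crash isn't visible via JMX, an external liveness check is one option. A minimal sketch, assuming the JDK's jps/jstack tools are on the path and that the broker shows up as "Kafka" in jps output (the thread name comes from the error line quoted above):

    # alert if no log-cleaner thread is alive inside the broker JVM
    BROKER_PID=$(jps | awk '/^[0-9]+ Kafka$/ {print $1}')
    if ! jstack "$BROKER_PID" | grep -q 'kafka-log-cleaner-thread'; then
        echo "log cleaner thread missing on $(hostname)" >&2
    fi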