[ https://issues.apache.org/jira/browse/KAFKA-3894?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15365689#comment-15365689 ]

James Cheng commented on KAFKA-3894:
------------------------------------

#4 is a good point. By looking at the dedupe buffer size, the broker can calculate how 
large a segment it can handle, and can thus make sure to only generate segments 
within that limit.
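
To make the arithmetic concrete, here is a rough sketch (not the actual LogCleaner code) of how the offset map capacity falls out of the buffer size, assuming 24 bytes per entry (a 16-byte MD5 hash plus an 8-byte offset) and the default 0.9 load factor:

{code:scala}
// Rough sketch, not actual broker code: estimate how many unique keys the
// offset map can hold for a given dedupe buffer size.
val dedupeBufferSize = 128L * 1024 * 1024 // log.cleaner.dedupe.buffer.size default
val bytesPerEntry    = 16 + 8             // MD5 hash + offset (assumed entry layout)
val loadFactor       = 0.9                // log.cleaner.io.buffer.load.factor default
val maxKeys = (dedupeBufferSize / bytesPerEntry * loadFactor).toInt
// maxKeys == 5033164, which matches the "offset map can fit only 5033164" in the
// error quoted below. Knowing this number up front, the broker could roll a
// segment before it accumulates more distinct keys than the map can hold.
{code}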

The comment about having a large segment that you are unable to process made me 
think about the long discussion that happened in 
https://issues.apache.org/jira/browse/KAFKA-3810. In that JIRA, a large message 
in the __consumer_offsets topic would block (internal) consumers whose fetch 
size was too small.

The solution that was chosen and implemented was to relax the fetch size limit 
for fetches from internal topics: internal topics would always return at least 
one message, even if that message was larger than the fetch size.
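
In code terms, the relaxation amounts to something like the following (a hypothetical sketch with illustrative names, not the actual broker code): when reading from an internal topic, the size limit is ignored for the first message, so a single oversized message can never wedge the consumer.

{code:scala}
// Hypothetical sketch, not actual broker code: pick the effective read limit.
def effectiveFetchSize(isInternalTopic: Boolean, fetchSize: Int, firstMessageSize: Int): Int =
  if (isInternalTopic && firstMessageSize > fetchSize)
    firstMessageSize // let one oversized message through
  else
    fetchSize
{code}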

It made me wonder if it might make sense to treat the dedupe buffer in a 
similar way. In a steady state, the configured dedupe buffer size would be used, 
but if it's too small to even fit the keys of a single segment, then the dedupe 
buffer would be (temporarily) grown to allow cleaning of that large segment.
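
As a sketch of what that could look like (hypothetical names, not a patch):

{code:scala}
// Hypothetical sketch of the idea above: size the offset map for this cleaning
// pass, growing past the configured limit only when a segment demands it.
def offsetMapCapacityFor(configuredMaxKeys: Int, keysInSegment: Int): Int =
  if (keysInSegment > configuredMaxKeys)
    keysInSegment     // temporarily grow so the oversized segment can be cleaned
  else
    configuredMaxKeys // steady state: respect log.cleaner.dedupe.buffer.size
{code}

The cost is a transient spike in cleaner memory, but that seems preferable to the cleaner thread dying silently.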

CC [~junrao]


> Log Cleaner thread crashes and never restarts
> ---------------------------------------------
>
>                 Key: KAFKA-3894
>                 URL: https://issues.apache.org/jira/browse/KAFKA-3894
>             Project: Kafka
>          Issue Type: Bug
>          Components: core
>    Affects Versions: 0.8.2.2, 0.9.0.1
>         Environment: Oracle JDK 8
> Ubuntu Precise
>            Reporter: Tim Carey-Smith
>              Labels: compaction
>
> The log-cleaner thread can crash if the number of keys in a topic grows to be 
> too large to fit into the dedupe buffer. 
> The result of this is a log line: 
> {quote}
> broker=0 pri=ERROR t=kafka-log-cleaner-thread-0 at=LogCleaner 
> \[kafka-log-cleaner-thread-0\], Error due to  
> java.lang.IllegalArgumentException: requirement failed: 9750860 messages in 
> segment MY_FAVORITE_TOPIC-2/00000000000047580165.log but offset map can fit 
> only 5033164. You can increase log.cleaner.dedupe.buffer.size or decrease 
> log.cleaner.threads
> {quote}
> As a result, the broker is left in a potentially dangerous situation where 
> cleaning of compacted topics is not running. 
> It is unclear if the broader strategy for the {{LogCleaner}} is the reason 
> for this upper bound, or if this is a value which must be tuned for each 
> specific use-case. 
> Of more immediate concern is the fact that the thread crash is not visible 
> via JMX or exposed as some form of service degradation. 
> Some short-term remediations we have made are:
> * increasing the size of the dedupe buffer
> * monitoring the log-cleaner threads inside the JVM


