[ https://issues.apache.org/jira/browse/KAFKA-3894?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15836543#comment-15836543 ]

Vincent Rischmann commented on KAFKA-3894:
------------------------------------------

No, not all logs, just the 00000000000000000000.log one. If you notice, Kafka 
computes the number of messages from the offset encoded in the file name, which 
is why it reports 13042136566 messages in your log. That is almost certainly 
not true; at least it wasn't for me.
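To make the arithmetic concrete, here is a small sketch of that estimate, assuming (as described above) that a segment's file name is its base offset and the message count is bounded by the gap to the next segment's base offset. The function name and the offsets are just the example values from this thread, not anything from the Kafka codebase:

```python
def estimated_messages(segment_base: int, next_segment_base: int) -> int:
    """Upper bound on messages in a segment: the difference between the
    next segment's base offset and this segment's base offset, both of
    which are encoded (zero-padded to 20 digits) in the .log file names."""
    return next_segment_base - segment_base

# The situation in this issue: first segment named 00000000000000000000.log,
# second segment with base offset 13042136566.
print(estimated_messages(0, 13042136566))  # 13042136566
```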

Basically, the file name is just wrong. Come to think of it, you could maybe 
rename the file to some arbitrary number, chosen so that the difference 
between the _next_ segment's number and _this_ segment's number is something 
that fits in your dedupe buffer. For example, here your second segment has the 
number _13042136566_, so you could rename 00000000000000000000.log to 
_13042136566 - 1000000_; then your offset map only needs to fit 1M offsets, 
which it can do based on your log.

I'm just thinking out loud here; I didn't do this myself, but I think it could 
work, and it would be less risky than just deleting all the data, maybe.
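For illustration only, the rename idea could look something like the sketch below (Python, untested against a real broker; the broker would need to be stopped first, and whether the matching .index file must be renamed in step is an assumption on my part, not something verified here):

```python
import os

def rename_segment(log_dir: str, old_base: int, new_base: int) -> None:
    """Rename a segment's files so that its base offset becomes new_base.
    Segment file names are the base offset zero-padded to 20 digits;
    which companion files exist (.index, possibly others) depends on the
    broker version, so we rename whichever ones are present."""
    for ext in (".log", ".index"):
        old = os.path.join(log_dir, f"{old_base:020d}{ext}")
        new = os.path.join(log_dir, f"{new_base:020d}{ext}")
        if os.path.exists(old):
            os.rename(old, new)

# Per the idea above: pick new_base so the gap to the next segment fits
# the dedupe buffer, e.g. 13042136566 - 1000000 leaves only ~1M offsets.
```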

> LogCleaner thread crashes if not even one segment can fit in the offset map
> ---------------------------------------------------------------------------
>
>                 Key: KAFKA-3894
>                 URL: https://issues.apache.org/jira/browse/KAFKA-3894
>             Project: Kafka
>          Issue Type: Bug
>          Components: core
>    Affects Versions: 0.8.2.2, 0.9.0.1, 0.10.0.0
>         Environment: Oracle JDK 8
> Ubuntu Precise
>            Reporter: Tim Carey-Smith
>            Assignee: Tom Crayford
>              Labels: compaction
>             Fix For: 0.10.1.0
>
>
> The log-cleaner thread can crash if the number of keys in a topic grows to be 
> too large to fit into the dedupe buffer. 
> The result of this is a log line: 
> {quote}
> broker=0 pri=ERROR t=kafka-log-cleaner-thread-0 at=LogCleaner 
> \[kafka-log-cleaner-thread-0\], Error due to  
> java.lang.IllegalArgumentException: requirement failed: 9750860 messages in 
> segment MY_FAVORITE_TOPIC-2/00000000000047580165.log but offset map can fit 
> only 5033164. You can increase log.cleaner.dedupe.buffer.size or decrease 
> log.cleaner.threads
> {quote}
> As a result, the broker is left in a potentially dangerous situation where 
> cleaning of compacted topics is not running. 
> It is unclear if the broader strategy for the {{LogCleaner}} is the reason 
> for this upper bound, or if this is a value which must be tuned for each 
> specific use-case. 
> Of more immediate concern is the fact that the thread crash is not visible 
> via JMX or exposed as some form of service degradation. 
> Some short-term remediations we have made are:
> * increasing the size of the dedupe buffer
> * monitoring the log-cleaner threads inside the JVM



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
