[ https://issues.apache.org/jira/browse/KAFKA-3915?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15385866#comment-15385866 ]

ASF GitHub Bot commented on KAFKA-3915:
---------------------------------------

GitHub user ijuma opened a pull request:

    https://github.com/apache/kafka/pull/1643

    KAFKA-3915; Don't convert messages from v0 to v1 during log compaction

    The conversion is unsafe as the converted message size may be greater
    than the message size limit. Updated `LogCleanerIntegrationTest` to test
    the max message size case for both V0 and the current version.
    
    Also include a few minor clean-ups:
    * Remove unused expression
    * Avoid unintentional usage of `scala.collection.immutable.Stream` (`toSeq` on an `Iterator`)
    * Add explicit result type in `FileMessageSet.iterator`

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/ijuma/kafka kafka-3915-log-cleaner-io-buffers-message-conversion

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/kafka/pull/1643.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #1643
    
----
commit 083029d12a12ca0e750dddf08a4ce8f4ec5db8bb
Author: Ismael Juma <ism...@juma.me.uk>
Date:   2016-07-20T13:34:04Z

    Don't convert messages from version 0 to version 1 during log compaction
    
    The conversion is unsafe as the converted message size may be greater
    than the message size limit.

commit 1262c2f87f6dd65c8624dde7f3406de7ab00cb99
Author: Ismael Juma <ism...@juma.me.uk>
Date:   2016-07-20T13:35:47Z

    Remove unused expression, avoid usage of scala.Stream and use explicit return type for public method

----


> LogCleaner IO buffers do not account for potential size difference due to 
> message format change
> -----------------------------------------------------------------------------------------------
>
>                 Key: KAFKA-3915
>                 URL: https://issues.apache.org/jira/browse/KAFKA-3915
>             Project: Kafka
>          Issue Type: Bug
>          Components: log
>    Affects Versions: 0.10.0.0
>            Reporter: Tommy Becker
>            Assignee: Ismael Juma
>            Priority: Blocker
>             Fix For: 0.10.0.1
>
>
> We are upgrading from Kafka 0.8.1 to 0.10.0.0 and discovered an issue after 
> getting the following exception from the log cleaner:
> {code}
> [2016-06-28 10:02:18,759] ERROR [kafka-log-cleaner-thread-0], Error due to  
> (kafka.log.LogCleaner)
> java.nio.BufferOverflowException
>       at java.nio.HeapByteBuffer.put(HeapByteBuffer.java:206)
>       at 
> kafka.message.ByteBufferMessageSet$.writeMessage(ByteBufferMessageSet.scala:169)
>       at kafka.log.Cleaner$$anonfun$cleanInto$1.apply(LogCleaner.scala:435)
>       at kafka.log.Cleaner$$anonfun$cleanInto$1.apply(LogCleaner.scala:429)
>       at scala.collection.Iterator$class.foreach(Iterator.scala:893)
>       at kafka.utils.IteratorTemplate.foreach(IteratorTemplate.scala:30)
>       at kafka.log.Cleaner.cleanInto(LogCleaner.scala:429)
>       at 
> kafka.log.Cleaner$$anonfun$cleanSegments$1.apply(LogCleaner.scala:380)
>       at 
> kafka.log.Cleaner$$anonfun$cleanSegments$1.apply(LogCleaner.scala:376)
>       at scala.collection.immutable.List.foreach(List.scala:381)
>       at kafka.log.Cleaner.cleanSegments(LogCleaner.scala:376)
>       at kafka.log.Cleaner$$anonfun$clean$4.apply(LogCleaner.scala:343)
>       at kafka.log.Cleaner$$anonfun$clean$4.apply(LogCleaner.scala:342)
>       at scala.collection.immutable.List.foreach(List.scala:381)
>       at kafka.log.Cleaner.clean(LogCleaner.scala:342)
>       at kafka.log.LogCleaner$CleanerThread.cleanOrSleep(LogCleaner.scala:237)
>       at kafka.log.LogCleaner$CleanerThread.doWork(LogCleaner.scala:215)
>       at kafka.utils.ShutdownableThread.run(ShutdownableThread.scala:63)
> {code}
> At first this seems impossible because the input and output buffers are 
> identically sized. But in the case where the source messages are of an older 
> format, additional space may be required to write them out in the new one. 
> Since the message header is 8 bytes larger in 0.10.0, this failure can 
> happen. 
> We're planning to work around this by setting the following config:
> {code}log.message.format.version=0.8.1{code}
> but this definitely needs a proper fix. We could simply preserve the existing 
> message format (since in this case we can't retroactively add a timestamp 
> anyway). Otherwise, the log cleaner would have to be smarter about ensuring 
> there is sufficient "slack space" in the output buffer to account for the 
> size difference multiplied by the number of messages in the input buffer. 
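The buffer accounting described above can be sketched as a minimal, self-contained simulation. The record size and the 8-byte header growth below are illustrative stand-ins, not Kafka's actual wire format, and `CleanerBufferSketch` is a hypothetical name:

```java
import java.nio.BufferOverflowException;
import java.nio.ByteBuffer;

public class CleanerBufferSketch {
    // Hypothetical sizes for illustration; not Kafka's actual wire format.
    static final int V0_RECORD_SIZE = 100;   // size of one v0 record in the input buffer
    static final int V1_HEADER_GROWTH = 8;   // v1 header is 8 bytes larger than v0

    public static void main(String[] args) {
        int records = 10;
        int bufferSize = records * V0_RECORD_SIZE;

        // Output buffer sized identically to the input buffer, as in the cleaner.
        ByteBuffer out = ByteBuffer.allocate(bufferSize);
        try {
            for (int i = 0; i < records; i++) {
                // Each v0 record grows by 8 bytes when rewritten in the v1 format.
                out.put(new byte[V0_RECORD_SIZE + V1_HEADER_GROWTH]);
            }
        } catch (BufferOverflowException e) {
            System.out.println("BufferOverflowException: output needs "
                    + records * V1_HEADER_GROWTH + " extra bytes");
        }

        // The "slack space" alternative from the ticket: reserve extra room for
        // (header growth * number of records that fit in the input buffer).
        ByteBuffer slack = ByteBuffer.allocate(bufferSize + records * V1_HEADER_GROWTH);
        for (int i = 0; i < records; i++) {
            slack.put(new byte[V0_RECORD_SIZE + V1_HEADER_GROWTH]);
        }
        System.out.println("slack buffer fits: " + !slack.hasRemaining());
    }
}
```

The pull request above takes the first option the reporter suggests (preserving the source message format during compaction) rather than over-allocating the output buffer.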



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)