[ https://issues.apache.org/jira/browse/KAFKA-3915?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15385866#comment-15385866 ]
ASF GitHub Bot commented on KAFKA-3915:
---------------------------------------

GitHub user ijuma opened a pull request:

    https://github.com/apache/kafka/pull/1643

    KAFKA-3915; Don't convert messages from v0 to v1 during log compaction

    The conversion is unsafe as the converted message size may be greater
    than the message size limit. Updated `LogCleanerIntegrationTest` to test
    the max message size case for both V0 and the current version.

    Also include a few minor clean-ups:
    * Remove unused expression
    * Avoid unintentional usage of `scala.collection.immutable.Stream` (`toSeq` on an `Iterator`)
    * Add explicit result type in `FileMessageSet.iterator`

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/ijuma/kafka kafka-3915-log-cleaner-io-buffers-message-conversion

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/kafka/pull/1643.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #1643

----
commit 083029d12a12ca0e750dddf08a4ce8f4ec5db8bb
Author: Ismael Juma <ism...@juma.me.uk>
Date:   2016-07-20T13:34:04Z

    Don't convert messages from version 0 to version 1 during log compaction

    The conversion is unsafe as the converted message size may be greater than the message size limit.

commit 1262c2f87f6dd65c8624dde7f3406de7ab00cb99
Author: Ismael Juma <ism...@juma.me.uk>
Date:   2016-07-20T13:35:47Z

    Remove unused expression, avoid usage of scala.Stream and use explicit return type for public method

----

> LogCleaner IO buffers do not account for potential size difference due to message format change
> -----------------------------------------------------------------------------------------------
>
>                 Key: KAFKA-3915
>                 URL: https://issues.apache.org/jira/browse/KAFKA-3915
>             Project: Kafka
>          Issue Type: Bug
>          Components: log
>    Affects Versions: 0.10.0.0
>            Reporter: Tommy Becker
>            Assignee: Ismael Juma
>            Priority: Blocker
>             Fix For: 0.10.0.1
>
>
> We are upgrading from Kafka 0.8.1 to 0.10.0.0 and discovered an issue after
> getting the following exception from the log cleaner:
> {code}
> [2016-06-28 10:02:18,759] ERROR [kafka-log-cleaner-thread-0], Error due to (kafka.log.LogCleaner)
> java.nio.BufferOverflowException
>         at java.nio.HeapByteBuffer.put(HeapByteBuffer.java:206)
>         at kafka.message.ByteBufferMessageSet$.writeMessage(ByteBufferMessageSet.scala:169)
>         at kafka.log.Cleaner$$anonfun$cleanInto$1.apply(LogCleaner.scala:435)
>         at kafka.log.Cleaner$$anonfun$cleanInto$1.apply(LogCleaner.scala:429)
>         at scala.collection.Iterator$class.foreach(Iterator.scala:893)
>         at kafka.utils.IteratorTemplate.foreach(IteratorTemplate.scala:30)
>         at kafka.log.Cleaner.cleanInto(LogCleaner.scala:429)
>         at kafka.log.Cleaner$$anonfun$cleanSegments$1.apply(LogCleaner.scala:380)
>         at kafka.log.Cleaner$$anonfun$cleanSegments$1.apply(LogCleaner.scala:376)
>         at scala.collection.immutable.List.foreach(List.scala:381)
>         at kafka.log.Cleaner.cleanSegments(LogCleaner.scala:376)
>         at kafka.log.Cleaner$$anonfun$clean$4.apply(LogCleaner.scala:343)
>         at kafka.log.Cleaner$$anonfun$clean$4.apply(LogCleaner.scala:342)
>         at scala.collection.immutable.List.foreach(List.scala:381)
>         at kafka.log.Cleaner.clean(LogCleaner.scala:342)
>         at kafka.log.LogCleaner$CleanerThread.cleanOrSleep(LogCleaner.scala:237)
>         at kafka.log.LogCleaner$CleanerThread.doWork(LogCleaner.scala:215)
>         at kafka.utils.ShutdownableThread.run(ShutdownableThread.scala:63)
> {code}
> At first this seems impossible because the input and output buffers are
> identically sized. But in the case where the source messages are of an older
> format, additional space may be required to write them out in the new one.
> Since the message header is 8 bytes larger in 0.10.0, this failure can
> happen.
> We're planning to work around this by adding the following config:
> {code}log.message.format.version=0.8.1{code} but this definitely needs a fix.
> We could simply preserve the existing message format (since in this case we
> can't retroactively add a timestamp anyway). Otherwise, the log cleaner would
> have to be smarter about ensuring there is sufficient "slack space" in the
> output buffer to account for the size difference * the number of messages in
> the input buffer.
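
The size arithmetic in the description above can be illustrated with a small standalone sketch. This is not Kafka code: the class `SlackSpaceSketch`, its methods, the 14-byte old header size, and the message count and payload size are all hypothetical values chosen for illustration. Only the 8-byte header difference comes from the report. It shows that an output buffer sized identically to the input overflows once every message grows by 8 bytes, and that adding (size difference * number of messages) of slack avoids the overflow.

```java
import java.nio.BufferOverflowException;
import java.nio.ByteBuffer;

public class SlackSpaceSketch {
    // Assumed header size for the old format (illustrative only).
    static final int OLD_HEADER_BYTES = 14;
    // The report states the new header is 8 bytes larger.
    static final int NEW_HEADER_BYTES = OLD_HEADER_BYTES + 8;

    /** Bytes needed for `count` messages with the given header and payload sizes. */
    public static int totalBytes(int count, int headerBytes, int payloadBytes) {
        return count * (headerBytes + payloadBytes);
    }

    /** Extra output space needed when every message grows by the header difference. */
    public static int requiredSlack(int count) {
        return count * (NEW_HEADER_BYTES - OLD_HEADER_BYTES);
    }

    /** Writes `count` dummy messages into `out`; returns false if the buffer overflows. */
    public static boolean writeAll(ByteBuffer out, int count, int headerBytes, int payloadBytes) {
        byte[] message = new byte[headerBytes + payloadBytes];
        try {
            for (int i = 0; i < count; i++) {
                out.put(message); // throws BufferOverflowException when space runs out
            }
            return true;
        } catch (BufferOverflowException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        int count = 100;        // hypothetical number of messages in the input buffer
        int payloadBytes = 50;  // hypothetical payload size per message
        int inputSize = totalBytes(count, OLD_HEADER_BYTES, payloadBytes);

        // Output buffer sized identically to the input: the old format fits.
        System.out.println(writeAll(ByteBuffer.allocate(inputSize), count, OLD_HEADER_BYTES, payloadBytes));
        // Same buffer size, but messages rewritten with the larger header: overflow.
        System.out.println(writeAll(ByteBuffer.allocate(inputSize), count, NEW_HEADER_BYTES, payloadBytes));

        // With the slack added, the converted messages fit.
        int paddedSize = inputSize + requiredSlack(count);
        System.out.println(writeAll(ByteBuffer.allocate(paddedSize), count, NEW_HEADER_BYTES, payloadBytes));
    }
}
```

The pull request quoted above takes the other route mentioned in the description: it skips the conversion during compaction, so no slack is needed.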