[ https://issues.apache.org/jira/browse/KAFKA-4293?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15572983#comment-15572983 ]
ASF GitHub Bot commented on KAFKA-4293:
---------------------------------------

GitHub user radai-rosenblatt opened a pull request:

    https://github.com/apache/kafka/pull/2025

    KAFKA-4293 - improve ByteBufferMessageSet.deepIterator() performance by relying on underlying stream's available() implementation

    also:
    provided better available() for ByteBufferInputStream
    provided better available() for KafkaLZ4BlockInputStream
    added KafkaGZIPInputStream with a better available()
    fixed KafkaLZ4BlockOutputStream.close() to properly flush

    Signed-off-by: radai-rosenblatt <radai.rosenbl...@gmail.com>

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/radai-rosenblatt/kafka suchwow

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/kafka/pull/2025.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #2025

----

----


> ByteBufferMessageSet.deepIterator burns CPU catching EOFExceptions
> ------------------------------------------------------------------
>
>                 Key: KAFKA-4293
>                 URL: https://issues.apache.org/jira/browse/KAFKA-4293
>             Project: Kafka
>          Issue Type: Bug
>          Components: core
>    Affects Versions: 0.10.0.1
>            Reporter: radai rosenblatt
>            Assignee: radai rosenblatt
>
> around line 110:
> {noformat}
> try {
>   while (true)
>     innerMessageAndOffsets.add(readMessageFromStream(compressed))
> } catch {
>   case eofe: EOFException =>
>     // we don't do anything at all here, because the finally
>     // will close the compressed input stream, and we simply
>     // want to return the innerMessageAndOffsets
> {noformat}
> the only indication the code has that the end of the iteration was reached is
> by catching EOFException (which will be thrown inside
> readMessageFromStream()).
> profiling runs performed at LinkedIn show 10% of the total broker CPU time
> taken up by Throwable.fillInStackTrace() because of this behaviour.
> unfortunately InputStream.available() cannot be relied upon (concrete example
> - GZIPInputStream will not correctly return 0) so the fix would probably be a
> wire format change to also encode the number of messages.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
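
For illustration only, here is a minimal Java sketch contrasting the two loop styles discussed in this thread: terminating on EOFException versus terminating when available() reports nothing left. It is not the code from the pull request. The class AvailableLoopSketch, the BufferStream wrapper, the readRecord and encode helpers, and the [int length][payload] record layout are all hypothetical stand-ins chosen for the example, and do not reflect Kafka's actual classes or wire format.

```java
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.EOFException;
import java.io.IOException;
import java.io.InputStream;
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;

class AvailableLoopSketch {

    // Hypothetical ByteBuffer-backed stream whose available() is exact.
    static final class BufferStream extends InputStream {
        private final ByteBuffer buffer;

        BufferStream(ByteBuffer buffer) { this.buffer = buffer; }

        @Override
        public int read() {
            return buffer.hasRemaining() ? (buffer.get() & 0xFF) : -1;
        }

        @Override
        public int read(byte[] dst, int off, int len) {
            if (len == 0) return 0;
            if (!buffer.hasRemaining()) return -1;
            int n = Math.min(len, buffer.remaining());
            buffer.get(dst, off, n);
            return n;
        }

        @Override
        public int available() { return buffer.remaining(); }
    }

    // Reads one record in the assumed [int length][payload] layout.
    // Throws EOFException if the stream ends before the record is complete.
    static byte[] readRecord(DataInputStream in) throws IOException {
        byte[] payload = new byte[in.readInt()];
        in.readFully(payload);
        return payload;
    }

    // Style from the issue: loop forever and let EOFException end the loop.
    static List<byte[]> readAllCatchingEof(InputStream raw) throws IOException {
        DataInputStream in = new DataInputStream(raw);
        List<byte[]> records = new ArrayList<>();
        try {
            while (true) records.add(readRecord(in));
        } catch (EOFException endOfInput) {
            // Normal termination, but an exception (with its stack trace) was built to signal it.
        }
        return records;
    }

    // Alternative: stop once available() reports no remaining bytes.
    // Only correct when the underlying stream's available() is exact.
    static List<byte[]> readAllUsingAvailable(InputStream raw) throws IOException {
        DataInputStream in = new DataInputStream(raw);
        List<byte[]> records = new ArrayList<>();
        while (in.available() > 0) records.add(readRecord(in));
        return records;
    }

    // Test helper: encodes each string as one length-prefixed record.
    static byte[] encode(String... values) throws IOException {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        DataOutputStream out = new DataOutputStream(bytes);
        for (String value : values) {
            byte[] payload = value.getBytes(StandardCharsets.UTF_8);
            out.writeInt(payload.length);
            out.write(payload);
        }
        out.flush();
        return bytes.toByteArray();
    }

    public static void main(String[] args) throws IOException {
        byte[] data = encode("a", "bb", "ccc");
        System.out.println(readAllCatchingEof(new BufferStream(ByteBuffer.wrap(data))).size());
        System.out.println(readAllUsingAvailable(new BufferStream(ByteBuffer.wrap(data))).size());
    }
}
```

Both loops return the same three records for this input. The second avoids constructing an exception on every batch, but it depends entirely on available() being trustworthy, which is the concern raised in the issue description for decompressing streams and presumably why the pull request supplies its own available() implementations.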