[ https://issues.apache.org/jira/browse/KAFKA-527?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14382880#comment-14382880 ]
Jay Kreps commented on KAFKA-527:
---------------------------------

Do we have any kind of before/after performance assessment from the producer's point of view? It would be nice to be able to say that you guys made the producer X% faster. Even just a simple test over localhost would be good. :-)

> Compression support does numerous byte copies
> ---------------------------------------------
>
>          Key: KAFKA-527
>          URL: https://issues.apache.org/jira/browse/KAFKA-527
>      Project: Kafka
>   Issue Type: Bug
>   Components: compression
>     Reporter: Jay Kreps
>     Assignee: Yasuhiro Matsuda
>     Priority: Critical
>  Attachments: KAFKA-527.message-copy.history, KAFKA-527.patch,
>               KAFKA-527_2015-03-16_15:19:29.patch, KAFKA-527_2015-03-19_21:32:24.patch,
>               KAFKA-527_2015-03-25_12:08:00.patch, KAFKA-527_2015-03-25_13:26:36.patch,
>               java.hprof.no-compression.txt, java.hprof.snappy.text
>
>
> The data path for compressing or decompressing messages is extremely
> inefficient. We do something like 7 (?) complete copies of the data, often
> for simple things like adding a 4 byte size to the front. I am not sure how
> this went by unnoticed.
> This is likely the root cause of the performance issues we saw in doing bulk
> recompression of data in mirror maker.
> The mismatch between the InputStream and OutputStream interfaces and the
> Message/MessageSet interfaces which are based on byte buffers is the cause of
> many of these.

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
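
For readers unfamiliar with the copy problem the issue description refers to, here is a minimal sketch of one way to add a 4 byte size in front of compressed data without an intermediate copy. It is not taken from any of the attached patches. The class and method names (SizePrefixedCompression, ByteBufferOutputStream, compressWithSizePrefix, decompress) are hypothetical, and GZIP from the JDK stands in for whichever codec is actually in use. The idea is to reserve the size field first, let the compressor write straight into the same ByteBuffer through a thin OutputStream adapter, and then backfill the size.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.OutputStream;
import java.nio.ByteBuffer;
import java.util.zip.GZIPInputStream;
import java.util.zip.GZIPOutputStream;

// Hypothetical illustration only; not the code from the KAFKA-527 patches.
final class SizePrefixedCompression {

    // Adapter that lets stream-based compressors write directly into a ByteBuffer.
    static final class ByteBufferOutputStream extends OutputStream {
        private final ByteBuffer buffer;

        ByteBufferOutputStream(ByteBuffer buffer) {
            this.buffer = buffer;
        }

        @Override
        public void write(int b) {
            buffer.put((byte) b);
        }

        @Override
        public void write(byte[] src, int off, int len) {
            buffer.put(src, off, len);
        }
    }

    // Compresses payload into a new buffer laid out as [4 byte size][compressed bytes].
    // Assumes capacity is large enough; otherwise ByteBuffer.put throws BufferOverflowException.
    static ByteBuffer compressWithSizePrefix(byte[] payload, int capacity) throws IOException {
        ByteBuffer buffer = ByteBuffer.allocate(capacity);
        int sizePos = buffer.position();
        buffer.putInt(0); // placeholder for the size, filled in below

        // The compressor output lands directly after the placeholder.
        try (GZIPOutputStream gzip = new GZIPOutputStream(new ByteBufferOutputStream(buffer))) {
            gzip.write(payload);
        }

        int compressedSize = buffer.position() - sizePos - 4;
        buffer.putInt(sizePos, compressedSize); // absolute put: backfill without moving the position
        buffer.flip();
        return buffer;
    }

    // Reads the size prefix and decompresses the bytes that follow it (heap buffers only).
    static byte[] decompress(ByteBuffer framed) throws IOException {
        int size = framed.getInt(framed.position());
        int start = framed.arrayOffset() + framed.position() + 4;
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        try (GZIPInputStream gzip =
                 new GZIPInputStream(new ByteArrayInputStream(framed.array(), start, size))) {
            byte[] chunk = new byte[4096];
            int n;
            while ((n = gzip.read(chunk)) != -1) {
                out.write(chunk, 0, n);
            }
        }
        return out.toByteArray();
    }
}
```

The more obvious approach compresses into a ByteArrayOutputStream, calls toByteArray() (one copy), then allocates a second buffer of size + 4 and copies the bytes again just to put the size in front. The decompress helper above is only there to check the round trip and makes no attempt to avoid copies.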