[ https://issues.apache.org/jira/browse/KAFKA-2213?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14558017#comment-14558017 ]
Manikumar Reddy commented on KAFKA-2213:
----------------------------------------

CASE A: In normal scenarios, logs are already written with the broker/topic compression type. During compaction, we are just compacting with the same compression type.

CASE B: In some scenarios, we may change the compression type of an existing topic using the per-topic compression config. In this case, the log may contain messages with different compression types. Do we want to handle this during compaction?

Currently, during compaction, we decompress a message set and write back the compacted message set (smaller size), so producer-side batching is preserved (at a smaller size). In CASE B, we would also have to change the compression type (non-compressed -> compressed, compressed -> non-compressed, or compressed -> compressed with a different codec).

AFAIK, batching on the producer side is controlled by the batch.size (in bytes) config. Do we need to introduce a similar server-side parameter, in bytes or number of messages?

> Log cleaner should write compacted messages using configured compression type
> ------------------------------------------------------------------------------
>
> Key: KAFKA-2213
> URL: https://issues.apache.org/jira/browse/KAFKA-2213
> Project: Kafka
> Issue Type: Bug
> Reporter: Joel Koshy
>
> In KAFKA-1374 the log cleaner was improved to handle compressed messages.
> There were a couple of follow-ups from that:
> * We write compacted messages using the original compression type in the compressed message-set. We should instead append all retained messages with the configured broker compression type of the topic.
> * While compressing messages we should ideally do some batching before compression.
> * Investigate the use of the client compressor. (See the discussion in the RBs for KAFKA-1374)
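To make the server-side batching idea concrete, here is a minimal sketch, not the actual log cleaner code: it assumes a hypothetical MAX_BATCH_BYTES limit analogous to the producer's batch.size, and uses java.util.zip GZIP as a stand-in for whatever codec the broker/topic compression config selects.

{code:java}
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.util.ArrayList;
import java.util.List;
import java.util.zip.GZIPOutputStream;

public class CompactionRecompressor {

    // Hypothetical server-side batching limit, analogous to the producer's
    // batch.size config; not an actual Kafka parameter.
    private static final int MAX_BATCH_BYTES = 16 * 1024;

    /**
     * Groups already-decompressed retained messages into batches no larger
     * than MAX_BATCH_BYTES, then compresses each batch with the configured
     * codec (GZIP used here as a stand-in).
     */
    public static List<byte[]> recompress(List<byte[]> retainedMessages) throws IOException {
        List<byte[]> compressedBatches = new ArrayList<>();
        List<byte[]> currentBatch = new ArrayList<>();
        int currentBytes = 0;

        for (byte[] message : retainedMessages) {
            // Flush the current batch once adding this message would exceed the limit.
            if (currentBytes + message.length > MAX_BATCH_BYTES && !currentBatch.isEmpty()) {
                compressedBatches.add(compressBatch(currentBatch));
                currentBatch.clear();
                currentBytes = 0;
            }
            currentBatch.add(message);
            currentBytes += message.length;
        }
        if (!currentBatch.isEmpty()) {
            compressedBatches.add(compressBatch(currentBatch));
        }
        return compressedBatches;
    }

    // Compresses one batch; real code would use the broker's configured
    // compression type (gzip/snappy/lz4) rather than raw GZIP.
    private static byte[] compressBatch(List<byte[]> batch) throws IOException {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        try (GZIPOutputStream gzip = new GZIPOutputStream(buffer)) {
            for (byte[] message : batch) {
                gzip.write(message);
            }
        }
        return buffer.toByteArray();
    }
}
{code}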
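On bytes vs. number of messages: a bytes-based limit mirrors the producer's batch.size and keeps recompressed message-sets bounded regardless of message count, while a count-based limit would be simpler but could produce very large batches when individual messages are big.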