[ https://issues.apache.org/jira/browse/KAFKA-1499?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14152354#comment-14152354 ]
Jay Kreps commented on KAFKA-1499:
----------------------------------

Hey Joel, that makes sense. I chatted with Neha. I understand the motivation behind the enable/disable config. Basically, the concern I have here is that we are effectively making this feature work two different ways without a clear rationale for supporting both. This will be much more confusing than just providing the more sensible way.

As long as there is an option to disable broker compression, compaction won't work properly (at best, compaction will have to just decompress the topic, which most people will think is a bug).

Plus the feature will be a bit confusing to use. People will see the topic-level compression setting and set it for their topic (e.g. enable snappy compression), but nothing will happen because broker compression will be disabled at the server level. Most people will think this is just broken.

Since we have to decompress and recompress the messages anyway, having a uniform codec per topic is actually not a disadvantage (i.e. retaining heterogeneous compression codecs given by the client will not make things more efficient).

So I would advocate for removing the enable/disable flag. I think then the statement of how compression works will be this:

"A compression codec is specified per topic with a broker-level default. This will compress data on disk, as well as saving network bandwidth when the data is sent to the consumer. The producer can also compress data to save network bandwidth when sending data to the broker; however, data will always be written to disk with the compression codec specified for that topic, irrespective of the compression used by the producer."

After chatting with Neha I think we were on the same page. At first we thought it would be nice to have the enable/disable flag temporarily so we could retain the current behavior, but thinking about it, that just prolongs the time until we have to remove it, at which point behavior will change for the user anyway. So we might as well just make the change now, rather than changing behavior twice and leaving things in an odd state in between.

What do you think?

> Broker-side compression configuration
> -------------------------------------
>
>                 Key: KAFKA-1499
>                 URL: https://issues.apache.org/jira/browse/KAFKA-1499
>             Project: Kafka
>          Issue Type: New Feature
>            Reporter: Joel Koshy
>            Assignee: Manikumar Reddy
>              Labels: newbie++
>             Fix For: 0.8.2
>
>         Attachments: KAFKA-1499.patch, KAFKA-1499.patch,
> KAFKA-1499_2014-08-15_14:20:27.patch, KAFKA-1499_2014-08-21_21:44:27.patch,
> KAFKA-1499_2014-09-21_15:57:23.patch, KAFKA-1499_2014-09-23_14:45:38.patch,
> KAFKA-1499_2014-09-24_14:20:33.patch, KAFKA-1499_2014-09-24_14:24:54.patch,
> KAFKA-1499_2014-09-25_11:05:57.patch
>
>   Original Estimate: 72h
>  Remaining Estimate: 72h
>
> A given topic can have messages in mixed compression codecs, i.e., it can
> also have a mix of uncompressed/compressed messages.
> It will be useful to support a broker-side configuration to recompress
> messages to a specific compression codec, i.e., all messages (for all
> topics) on the broker will be compressed to this codec. We could have
> per-topic overrides as well.

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
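
For illustration only, the behavior proposed in the comment (a per-topic codec with a broker-level default, applied on disk regardless of what the producer sent) might be sketched as below. This is not Kafka code: `BrokerCompressionConfig`, `codec_for`, and `recompress_for_disk` are hypothetical names, and only "none" and "gzip" are modeled because snappy is not in the Python standard library.

```python
import gzip
from dataclasses import dataclass, field
from typing import Dict, Tuple

# Codecs modeled in this sketch (assumption: "none" and "gzip" only).
_COMPRESS = {"none": lambda b: b, "gzip": gzip.compress}
_DECOMPRESS = {"none": lambda b: b, "gzip": gzip.decompress}


@dataclass
class BrokerCompressionConfig:
    """Hypothetical config: broker-level default plus per-topic overrides."""
    default_codec: str = "none"
    topic_overrides: Dict[str, str] = field(default_factory=dict)

    def codec_for(self, topic: str) -> str:
        # A topic-level setting wins; otherwise fall back to the broker default.
        return self.topic_overrides.get(topic, self.default_codec)


def recompress_for_disk(config: BrokerCompressionConfig, topic: str,
                        payload: bytes, producer_codec: str) -> Tuple[str, bytes]:
    """Return (codec, bytes) as they would be written to the log.

    The producer's codec only matters for decoding the incoming payload;
    the on-disk codec is always the one configured for the topic.
    """
    raw = _DECOMPRESS[producer_codec](payload)
    target = config.codec_for(topic)
    return target, _COMPRESS[target](raw)


if __name__ == "__main__":
    cfg = BrokerCompressionConfig(default_codec="gzip",
                                  topic_overrides={"logs": "none"})
    # Uncompressed producer data on a topic without an override.
    print(recompress_for_disk(cfg, "events", b"hello", "none")[0])
    # Gzip producer data on a topic overridden to "none".
    print(recompress_for_disk(cfg, "logs", gzip.compress(b"hello"), "gzip")[0])
```

Note that there is no enable/disable switch in this sketch: every topic always resolves to exactly one codec, which is the uniform-codec-per-topic behavior the comment argues for.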