[ https://issues.apache.org/jira/browse/KAFKA-1499?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14154209#comment-14154209 ]
Joel Koshy commented on KAFKA-1499:
-----------------------------------

If we provide a broker-compression-enabled config: I think the problem with compaction is less of an issue than forgetting to enable the config. i.e., I agree that if an admin forgets to enable it and a user's topic has a compression.type override, it is confusing to find messages with some other compression type on the broker.

With log compaction though: I think if there are heterogeneous codecs in the log then in a sense all bets are off. i.e., we can pick and choose whatever codec we want (say, the last non-no-compression codec in a batch) and not bother with preserving the retained message's compression codec. Besides, there is no guarantee that a specific producer's message is the one that will be retained.

If we do not provide a broker-compression-enabled config: The main concern I have with this is that the most likely default is going to be NoCompressionCodec. Most people will forget to set this when upgrading and end up with uncompressed data, which could be an issue for users with a lot of data. Even if people have alerts on disk usage and such, there will most likely be only a moderate margin (wrt typical alert thresholds), and it may not be an option to just turn on the config at that point without first doing a difficult (manual) clean up to free up space.

So I guess we are down to picking the lesser of two evils - I'm not sure which one is less evil though :) Does anyone have a strong preference or further critique on the pros/cons of one over the other?
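
To make the compaction suggestion concrete, here is a minimal sketch of the "pick the last codec in a batch that is not no-compression" rule. Everything in it (the CompactionCodecSketch class, the Codec enum, the pickCodec method) is a hypothetical name made up for illustration; it is not the log cleaner's actual code, and it assumes the codecs of the retained messages are available as a simple list.

```java
import java.util.List;

// Hypothetical names throughout; this is NOT Kafka's actual log cleaner code.
final class CompactionCodecSketch {

    // Illustrative stand-in for the codecs discussed in this thread.
    enum Codec { NONE, GZIP, SNAPPY }

    // Returns the last codec in the batch that is not NONE,
    // or NONE if every retained message is uncompressed (or the batch is empty).
    static Codec pickCodec(List<Codec> retainedMessageCodecs) {
        Codec chosen = Codec.NONE;
        for (Codec codec : retainedMessageCodecs) {
            if (codec != Codec.NONE) {
                chosen = codec; // later compressed messages override earlier ones
            }
        }
        return chosen;
    }
}
```

With this rule the recompressed batch does not try to preserve any individual retained message's codec, which matches the "all bets are off" position for logs with heterogeneous codecs.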

> Broker-side compression configuration
> -------------------------------------
>
>                 Key: KAFKA-1499
>                 URL: https://issues.apache.org/jira/browse/KAFKA-1499
>             Project: Kafka
>          Issue Type: New Feature
>            Reporter: Joel Koshy
>            Assignee: Manikumar Reddy
>              Labels: newbie++
>             Fix For: 0.8.2
>
>         Attachments: KAFKA-1499.patch, KAFKA-1499.patch,
> KAFKA-1499_2014-08-15_14:20:27.patch, KAFKA-1499_2014-08-21_21:44:27.patch,
> KAFKA-1499_2014-09-21_15:57:23.patch, KAFKA-1499_2014-09-23_14:45:38.patch,
> KAFKA-1499_2014-09-24_14:20:33.patch, KAFKA-1499_2014-09-24_14:24:54.patch,
> KAFKA-1499_2014-09-25_11:05:57.patch
>
>   Original Estimate: 72h
>  Remaining Estimate: 72h
>
> A given topic can have messages in mixed compression codecs. i.e., it can
> also have a mix of uncompressed/compressed messages.
> It will be useful to support a broker-side configuration to recompress
> messages to a specific compression codec. i.e., all messages (for all
> topics) on the broker will be compressed to this codec. We could have
> per-topic overrides as well.

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)