[
https://issues.apache.org/jira/browse/KAFKA-1499?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14154209#comment-14154209
]
Joel Koshy commented on KAFKA-1499:
-----------------------------------
If we provide a broker-compression-enabled config: I think the problem with
compaction is less of an issue than forgetting to enable the config. i.e., I
agree that if an admin forgets to enable it and a user's topic has a
compression.type override it is confusing if there are messages with some other
compression type on the broker. With log compaction though: I think if there
are heterogeneous codecs in the log then in a sense all bets are off. i.e., we
can pick and choose whatever codec we want (say, the last codec in a batch other
than NoCompressionCodec) and not bother with preserving the retained message's
compression codec. Besides, there is no guarantee that a specific producer's
message is the one that will be retained.
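
To make the codec choice above concrete, here is a rough sketch of "take the last codec in a batch that is not NoCompressionCodec". It is illustrative only: `NO_COMPRESSION` and `pick_codec` are hypothetical names, not Kafka APIs, and codecs are modeled as plain strings.

```python
from typing import Iterable

# Hypothetical stand-in for NoCompressionCodec; not a Kafka identifier.
NO_COMPRESSION = "none"


def pick_codec(batch_codecs: Iterable[str]) -> str:
    """Return the last codec in the batch that is not NO_COMPRESSION.

    Falls back to NO_COMPRESSION when every message in the batch is
    uncompressed, or when the batch is empty.
    """
    chosen = NO_COMPRESSION
    for codec in batch_codecs:
        if codec != NO_COMPRESSION:
            chosen = codec  # later compressed messages override earlier ones
    return chosen
```

With a mixed batch such as `["gzip", "none", "snappy", "none"]` this picks `"snappy"`, regardless of which producer's message ends up being retained.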
If we do not provide a broker-compression-enabled config: The main concern I
have with this is that the most likely default is going to be
NoCompressionCodec. Most people will forget to set this when upgrading and end
up with uncompressed data which could be an issue for users with a lot of data.
Even if people have alerts on disk usage and such, there will most likely be a
moderate margin (wrt typical alert thresholds) and it may not be an option to
just turn on the config at that point without doing a difficult (manual) clean
up first to free up space.
So I guess we are down to picking the lesser of two evils - I'm not sure which
one is less evil though :)
Anyone have any strong preference/further critique on the pros/cons of one over
the other?
> Broker-side compression configuration
> -------------------------------------
>
> Key: KAFKA-1499
> URL: https://issues.apache.org/jira/browse/KAFKA-1499
> Project: Kafka
> Issue Type: New Feature
> Reporter: Joel Koshy
> Assignee: Manikumar Reddy
> Labels: newbie++
> Fix For: 0.8.2
>
> Attachments: KAFKA-1499.patch, KAFKA-1499.patch,
> KAFKA-1499_2014-08-15_14:20:27.patch, KAFKA-1499_2014-08-21_21:44:27.patch,
> KAFKA-1499_2014-09-21_15:57:23.patch, KAFKA-1499_2014-09-23_14:45:38.patch,
> KAFKA-1499_2014-09-24_14:20:33.patch, KAFKA-1499_2014-09-24_14:24:54.patch,
> KAFKA-1499_2014-09-25_11:05:57.patch
>
> Original Estimate: 72h
> Remaining Estimate: 72h
>
> A given topic can have messages in mixed compression codecs. i.e., it can
> also have a mix of uncompressed/compressed messages.
> It will be useful to support a broker-side configuration to recompress
> messages to a specific compression codec. i.e., all messages (for all
> topics) on the broker will be compressed to this codec. We could have
> per-topic overrides as well.
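
One way to picture the broker-wide codec with per-topic overrides from the description above is the sketch below. It is an assumption-laden illustration, not the patch's implementation: `effective_codec`, `needs_recompression`, and the string codec names are all hypothetical.

```python
from typing import Mapping


def effective_codec(topic: str,
                    broker_codec: str,
                    topic_overrides: Mapping[str, str]) -> str:
    """Return the topic's override if one exists, else the broker-wide codec."""
    return topic_overrides.get(topic, broker_codec)


def needs_recompression(message_codec: str, target_codec: str) -> bool:
    """A message is recompressed only when its codec differs from the target."""
    return message_codec != target_codec


# Example: broker default is gzip, one topic overrides it with snappy.
overrides = {"clicks": "snappy"}
target = effective_codec("clicks", "gzip", overrides)
```

Under this model, a message that already arrives in the target codec is stored unchanged, and only mismatched messages pay the recompression cost.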