[
https://issues.apache.org/jira/browse/KAFKA-1499?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14152354#comment-14152354
]
Jay Kreps commented on KAFKA-1499:
----------------------------------
Hey Joel, that makes sense.
I chatted with Neha. I understand the motivation behind the enable/disable
config. My concern is that we are effectively making this feature work two
different ways without a clear rationale for supporting both. This will be
much more confusing than just providing the more sensible way.
As long as there is an option to disable broker compression, compaction
won't work properly (at best, compaction will have to just decompress the topic,
which most people will think is a bug). Plus, the feature will be a bit
confusing to use. People will see the topic-level compression setting and set
it for their topic (e.g. enable snappy compression), but nothing will happen
because broker compression will be disabled at the server level. Most people
will think this is just broken.
Since we have to decompress and recompress the messages anyway, having a uniform
codec per topic is actually not a disadvantage (i.e. retaining the heterogeneous
compression codecs given by the client will not make things more efficient).
So I would advocate for removing the enable/disable flag. I think then the
statement of how compression works will be this:
"A compression codec is specified per topic with a broker-level default. This
will compress data on disk, as well as saving network bandwidth when the data
is sent to the consumer. The producer can also compress data to save network
bandwidth when sending data to the broker; however, data will always be written to disk
with the compression codec specified for that topic irrespective of the
compression used by the producer".
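To make the statement above concrete, here is a minimal sketch of the resolution rule it describes: the on-disk codec is the topic's override if one is set, otherwise the broker-level default, and the producer's codec plays no part. The class, enum, and method names are hypothetical and used only for illustration; this is not the code in the attached patches.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical illustration only; none of these names come from the Kafka code base.
class TopicCodecSketch {
    enum Codec { NONE, GZIP, SNAPPY }

    // The codec configured for a topic: the per-topic override if present,
    // otherwise the broker-level default.
    static Codec effectiveCodec(String topic, Map<String, Codec> topicOverrides, Codec brokerDefault) {
        return topicOverrides.getOrDefault(topic, brokerDefault);
    }

    // The codec used on disk. The producer's codec is accepted as a parameter
    // only to show that it is deliberately ignored: it affects the
    // producer-to-broker hop, not what the broker writes.
    static Codec codecOnDisk(String topic, Codec producerCodec,
                             Map<String, Codec> topicOverrides, Codec brokerDefault) {
        return effectiveCodec(topic, topicOverrides, brokerDefault);
    }

    public static void main(String[] args) {
        Map<String, Codec> overrides = new HashMap<>();
        overrides.put("clicks", Codec.SNAPPY);

        // Topic with an override: the producer sent uncompressed data, disk gets SNAPPY.
        System.out.println(codecOnDisk("clicks", Codec.NONE, overrides, Codec.GZIP));
        // Topic without an override: the producer sent SNAPPY, disk gets the broker default GZIP.
        System.out.println(codecOnDisk("logs", Codec.SNAPPY, overrides, Codec.GZIP));
    }
}
```

With no enable/disable flag there is only this one code path, so setting a topic-level codec always takes effect.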
After chatting with Neha I think we were on the same page. At first we thought
it would be nice to have the enable/disable flag temporarily so we could retain
the current behavior, but on reflection that just prolongs the time until
we have to remove it, at which point behavior will change for the user anyway.
So we might as well just make the change now, rather than changing behavior
twice and leaving things in an odd state in between. What do you think?
> Broker-side compression configuration
> -------------------------------------
>
> Key: KAFKA-1499
> URL: https://issues.apache.org/jira/browse/KAFKA-1499
> Project: Kafka
> Issue Type: New Feature
> Reporter: Joel Koshy
> Assignee: Manikumar Reddy
> Labels: newbie++
> Fix For: 0.8.2
>
> Attachments: KAFKA-1499.patch, KAFKA-1499.patch,
> KAFKA-1499_2014-08-15_14:20:27.patch, KAFKA-1499_2014-08-21_21:44:27.patch,
> KAFKA-1499_2014-09-21_15:57:23.patch, KAFKA-1499_2014-09-23_14:45:38.patch,
> KAFKA-1499_2014-09-24_14:20:33.patch, KAFKA-1499_2014-09-24_14:24:54.patch,
> KAFKA-1499_2014-09-25_11:05:57.patch
>
> Original Estimate: 72h
> Remaining Estimate: 72h
>
> A given topic can have messages in mixed compression codecs, i.e., it can
> also have a mix of uncompressed/compressed messages.
> It will be useful to support a broker-side configuration to recompress
> messages to a specific compression codec, i.e., all messages (for all
> topics) on the broker will be compressed to this codec. We could have
> per-topic overrides as well.