[ 
https://issues.apache.org/jira/browse/KAFKA-1499?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14152354#comment-14152354
 ] 

Jay Kreps commented on KAFKA-1499:
----------------------------------

Hey Joel, that makes sense.

I chatted with Neha. I understand the motivation behind the enable/disable 
config. Basically the concern I have here is that we are effectively making 
this feature work two different ways without a clear rationale for supporting 
both. This will be much more confusing than just providing the more sensible 
way.

As long as there is an option to disable broker compression, compaction 
won't work properly (at best, compaction will have to just decompress the topic, 
which most people will think is a bug). Plus the feature will be a bit 
confusing to use. People will see the topic-level compression setting and set 
it for their topic (e.g. enable snappy compression) but nothing will happen 
because broker compression will be disabled at the server level. Most people 
will think this is just broken.
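
To make the confusion concrete, here is a minimal Python sketch of the 
interaction described above. The function and parameter names are hypothetical 
and not taken from the patch; the sketch only assumes that with the server-level 
flag off, the broker leaves data in whatever codec the producer used.

```python
def effective_codec_with_flag(broker_compression_enabled: bool,
                              topic_codec: str,
                              producer_codec: str) -> str:
    """Hypothetical model of the codec that ends up on disk when a
    server-level enable/disable flag exists alongside a topic setting."""
    if not broker_compression_enabled:
        # Flag off: the topic-level setting is silently ignored and the
        # producer's choice passes through unchanged (assumed behavior).
        return producer_codec
    # Flag on: the topic-level codec wins.
    return topic_codec


# A user sets snappy on the topic, but the server-level flag is off.
print(effective_codec_with_flag(False, "snappy", "none"))
```

In this model the user's snappy setting has no visible effect unless they also 
know about the server-level flag, which is the "looks broken" case.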

Since we have to decompress and recompress the messages anyway, having a uniform 
codec per topic is actually not a disadvantage (i.e. retaining heterogeneous 
compression codecs given by the client will not make things more efficient).

So I would advocate for removing the enable/disable flag. I think then the 
statement of how compression works will be this: 
"A compression codec is specified per topic with a broker-level default. This 
will compress data on disk, as well as saving network bandwidth when the data 
is sent to the consumer. The producer can also compress data to save network 
bandwidth when sending data to the broker; however, data will always be written 
to disk with the compression codec specified for that topic, irrespective of 
the compression used by the producer".
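
A minimal sketch of that rule in Python, assuming a simple map of per-topic 
overrides. The names `on_disk_codec`, `broker_default`, and `topic_overrides` 
are illustrative only and do not correspond to actual Kafka configuration keys 
or code in the patch.

```python
from typing import Dict


def on_disk_codec(topic: str,
                  broker_default: str,
                  topic_overrides: Dict[str, str]) -> str:
    """Return the codec used on disk for a topic: the per-topic setting
    if one exists, otherwise the broker-level default.

    The producer's codec is deliberately not an input: under the rule
    sketched here it only affects the producer-to-broker hop.
    """
    return topic_overrides.get(topic, broker_default)


# Hypothetical configuration: broker default gzip, one topic overridden.
overrides = {"clicks": "snappy"}
print(on_disk_codec("clicks", "gzip", overrides))
print(on_disk_codec("logs", "gzip", overrides))
```

With no enable/disable flag, the topic setting and the broker default are the 
only two inputs that decide what is stored.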

After chatting with Neha I think we were on the same page. At first we thought 
it would be nice to have the enable/disable flag temporarily so we could retain 
the current behavior, but on reflection that just prolongs the time until 
we have to remove it at which point behavior will change for the user anyway. 
So we might as well just make the change now, rather than changing behavior 
twice and leaving things in an odd state in between. What do you think?

> Broker-side compression configuration
> -------------------------------------
>
>                 Key: KAFKA-1499
>                 URL: https://issues.apache.org/jira/browse/KAFKA-1499
>             Project: Kafka
>          Issue Type: New Feature
>            Reporter: Joel Koshy
>            Assignee: Manikumar Reddy
>              Labels: newbie++
>             Fix For: 0.8.2
>
>         Attachments: KAFKA-1499.patch, KAFKA-1499.patch, 
> KAFKA-1499_2014-08-15_14:20:27.patch, KAFKA-1499_2014-08-21_21:44:27.patch, 
> KAFKA-1499_2014-09-21_15:57:23.patch, KAFKA-1499_2014-09-23_14:45:38.patch, 
> KAFKA-1499_2014-09-24_14:20:33.patch, KAFKA-1499_2014-09-24_14:24:54.patch, 
> KAFKA-1499_2014-09-25_11:05:57.patch
>
>   Original Estimate: 72h
>  Remaining Estimate: 72h
>
> A given topic can have messages in mixed compression codecs. i.e., it can
> also have a mix of uncompressed/compressed messages.
> It will be useful to support a broker-side configuration to recompress
> messages to a specific compression codec. i.e., all messages (for all
> topics) on the broker will be compressed to this codec. We could have
> per-topic overrides as well.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
