> On Aug. 10, 2014, 11:44 p.m., Jun Rao wrote:
> > core/src/main/scala/kafka/log/LogCleaner.scala, lines 400-420
> > <https://reviews.apache.org/r/24214/diff/4/?file=657033#file657033line400>
> >
> >     Thinking about this a bit more, I am wondering if it would be better to 
> > introduce a per-topic log.compact.compress.codec property. During log 
> > compaction, we always write the retained data using the specified 
> > compression codec, independent of whether the original records are 
> > compressed or not. This provides the following benefits.
> >     
> >     1. Whether or not the messages were compressed originally, they can be 
> > compressed on the broker side over time. Since compacted topics preserve 
> > records much longer, enabling compression on the broker side will generally 
> > be beneficial.
> >     
> >     2. As old records are removed, we still want to batch enough messages 
> > together for compression to remain effective.
> >     
> >     3. The code can be a bit simpler. We can just (deep) iterate the messages 
> > (using MemoryRecords.iterator) and append the retained messages to an output 
> > MemoryRecords. The output MemoryRecords will be initialized with the 
> > configured compression codec and batch size.
> 
> Manikumar Reddy O wrote:
>     What you proposed is similar to KAFKA-1499. KAFKA-1499 deals with a default 
> broker-side compression configuration.
>     I proposed new configuration properties on KAFKA-1499. The idea is to 
> compress the data upon reaching the server.
>     This applies to all topics (both log compaction and retention).
>     
>     Can you comment on KAFKA-1499?
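
For reference, the rewrite loop described in point 3 above would look roughly 
like the sketch below. LogCleaner currently operates on 
kafka.message.ByteBufferMessageSet rather than the new MemoryRecords class, so 
the sketch uses the former; shouldRetain is a stand-in for the cleaner's 
offset-map check, and offset preservation and stats are omitted for brevity.

    import kafka.message._

    // Minimal sketch, not the actual LogCleaner code: deep-iterate the source
    // (expanding compressed wrapper messages into their inner messages) and
    // re-append the retained messages into an output set built with the
    // configured compression codec.
    def rewriteRetained(source: ByteBufferMessageSet,
                        codec: CompressionCodec,
                        shouldRetain: MessageAndOffset => Boolean): ByteBufferMessageSet = {
      val retained = source.iterator.filter(shouldRetain).map(_.message).toSeq
      new ByteBufferMessageSet(codec, retained: _*)
    }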

Assuming we have broker-side compression (KAFKA-1499), do we still need special 
compression handling during log compaction?

1) With broker-side compression configured (codec: gzip, snappy, etc.)

With KAFKA-1499, we compress all messages with the specified compression codec. 
During log compaction, we write the retained data using the same compression 
codec.

2) Without broker-side compression (codec: none)

If a user does not configure broker-side compression, then we write the retained 
messages using their original compression type.

The current patch supports both of the above cases; a rough sketch of the codec 
selection follows.
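
To make the two cases concrete, the codec selection behaves roughly as below 
(names are illustrative, not the actual patch code):

    import kafka.message.{CompressionCodec, NoCompressionCodec}

    // brokerCodec: broker-side codec from a KAFKA-1499-style config
    //              (NoCompressionCodec when none is configured)
    // originalCodec: codec of the compressed message set being cleaned
    def targetCodec(brokerCodec: CompressionCodec,
                    originalCodec: CompressionCodec): CompressionCodec =
      if (brokerCodec != NoCompressionCodec)
        brokerCodec    // case 1: always recompress with the broker-side codec
      else
        originalCodec  // case 2: preserve the messages' original codec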


- Manikumar Reddy


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/24214/#review50128
-----------------------------------------------------------


On Aug. 9, 2014, 10:51 a.m., Manikumar Reddy O wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/24214/
> -----------------------------------------------------------
> 
> (Updated Aug. 9, 2014, 10:51 a.m.)
> 
> 
> Review request for kafka.
> 
> 
> Bugs: KAFKA-1374
>     https://issues.apache.org/jira/browse/KAFKA-1374
> 
> 
> Repository: kafka
> 
> 
> Description
> -------
> 
> Addressed Jun's comments; added a few changes to LogCleaner stats for 
> compressed messages
> 
> 
> Diffs
> -----
> 
>   core/src/main/scala/kafka/log/LogCleaner.scala 
> c20de4ad4734c0bd83c5954fdb29464a27b91dff 
>   core/src/test/scala/unit/kafka/log/LogCleanerIntegrationTest.scala 
> 5bfa764638e92f217d0ff7108ec8f53193c22978 
> 
> Diff: https://reviews.apache.org/r/24214/diff/
> 
> 
> Testing
> -------
> 
> 
> Thanks,
> 
> Manikumar Reddy O
> 
>
