> On Aug. 10, 2014, 11:44 p.m., Jun Rao wrote:
> > core/src/main/scala/kafka/log/LogCleaner.scala, lines 400-420
> > <https://reviews.apache.org/r/24214/diff/4/?file=657033#file657033line400>
> >
> > Thinking about this a bit more. I am wondering if it would be better if
> > we introduce a per-topic level log.compact.compress.codec property. During
> > log compaction, we always write the retained data using the specified
> > compress codec, independent of whether the original records are compressed
> > or not. This provides the following benefits.
> >
> > 1. Whether or not the messages were compressed originally, they can be
> > compressed on the broker side over time. Since compact topics preserve
> > records much longer, enabling compression on the broker side will be
> > beneficial in general.
> >
> > 2. As old records are removed, we still want to batch enough messages
> > to do the compression.
> >
> > 3. The code can be a bit simpler. We can just (deep) iterate messages
> > (using MemoryRecords.iterator) and append retained messages to an output
> > MemoryRecords. The output MemoryRecords will be initialized with the
> > configured compress codec and batch size.
>
> Manikumar Reddy O wrote:
> What you proposed is similar to KAFKA-1499. KAFKA-1499 deals with default
> broker-side compression configuration.
> I proposed new configuration properties on KAFKA-1499. The idea is to
> compress the data upon reaching the server.
> This is applicable to all topics (log compaction and retention).
>
> Can you comment on KAFKA-1499?

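To make the comparison concrete, here is a minimal sketch of how I read point 3 of the proposal above: deep-iterate the input, keep only the latest message per key, and append the retained messages to output batches that all use one configured codec, regardless of the codec of the input batches. The names CompactionSketch, Codec, Message, Batch and compact are hypothetical stand-ins made up for this illustration; they are not Kafka classes, and no real compression is performed. The sketch also assumes that offsets increase in iteration order.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical stand-ins for illustration only; these are not Kafka classes.
class CompactionSketch {
    enum Codec { NONE, GZIP, SNAPPY }

    static final class Message {
        final long offset;
        final String key;
        final String value;

        Message(long offset, String key, String value) {
            this.offset = offset;
            this.key = key;
            this.value = value;
        }
    }

    // A batch only records which codec it would be written with.
    static final class Batch {
        final Codec codec;
        final List<Message> messages;

        Batch(Codec codec, List<Message> messages) {
            this.codec = codec;
            this.messages = messages;
        }
    }

    static List<Batch> compact(List<Batch> input, Codec targetCodec, int batchSize) {
        if (batchSize <= 0) {
            throw new IllegalArgumentException("batchSize must be positive");
        }

        // Pass 1: deep-iterate every batch and remember the last offset per key.
        // Assumes offsets increase in iteration order, so later puts win.
        Map<String, Long> latestOffset = new HashMap<>();
        for (Batch batch : input) {
            for (Message m : batch.messages) {
                latestOffset.put(m.key, m.offset);
            }
        }

        // Pass 2: keep only the latest message per key and regroup the
        // retained messages into batches tagged with the target codec.
        List<Batch> output = new ArrayList<>();
        List<Message> pending = new ArrayList<>();
        for (Batch batch : input) {
            for (Message m : batch.messages) {
                if (latestOffset.get(m.key) != m.offset) {
                    continue; // superseded by a newer message with the same key
                }
                pending.add(m);
                if (pending.size() == batchSize) {
                    output.add(new Batch(targetCodec, new ArrayList<>(pending)));
                    pending.clear();
                }
            }
        }
        if (!pending.isEmpty()) {
            output.add(new Batch(targetCodec, new ArrayList<>(pending)));
        }
        return output;
    }
}
```

Regrouping the retained messages into batches of up to batchSize is what addresses point 2: even after old records are removed, the output batches stay large enough for compression to be worthwhile.
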
Assuming we have broker-side compression (KAFKA-1499), do we still need special compression during log compaction?

1) With broker-side compression (codec: gzip, snappy, etc.)
   With KAFKA-1499, we will compress all messages with the specified compression codec. During log compaction, we write the retained data using the same compression codec.

2) Without broker-side compression (codec: none)
   If a user does not configure broker-side compression, we write the retained messages using their original compression type.

The current patch supports both of the above points.


- Manikumar Reddy


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/24214/#review50128
-----------------------------------------------------------


On Aug. 9, 2014, 10:51 a.m., Manikumar Reddy O wrote:
>
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/24214/
> -----------------------------------------------------------
>
> (Updated Aug. 9, 2014, 10:51 a.m.)
>
>
> Review request for kafka.
>
>
> Bugs: KAFKA-1374
>     https://issues.apache.org/jira/browse/KAFKA-1374
>
>
> Repository: kafka
>
>
> Description
> -------
>
> Addressed Jun's comments; added a few changes to the LogCleaner stats for
> compressed messages.
>
>
> Diffs
> -----
>
>   core/src/main/scala/kafka/log/LogCleaner.scala
>     c20de4ad4734c0bd83c5954fdb29464a27b91dff
>   core/src/test/scala/unit/kafka/log/LogCleanerIntegrationTest.scala
>     5bfa764638e92f217d0ff7108ec8f53193c22978
>
> Diff: https://reviews.apache.org/r/24214/diff/
>
>
> Testing
> -------
>
>
> Thanks,
>
> Manikumar Reddy O
>
>