Hello Kafka Users, 
I would appreciate any thoughts on the following questions.

Has anybody enabled and tried compression of logs? 
What are the steps to configure compression of logs on disk? 
Is there any option to compress existing logs already persisted on disk? 
Why is compression of logs not enabled by default? 

I have tried, without success, to set up log compression using the settings 
described on the following wiki page: 
https://cwiki.apache.org/confluence/display/KAFKA/Compression . 
That page is 4 years old, so I am not sure whether it is still valid. 

Thanks,
-
R P

________________________________________
From: Ben Stopford <b...@confluent.io>
Sent: Friday, March 18, 2016 9:21 AM
To: users@kafka.apache.org
Subject: Re: Question regarding compression of topics in Kafka

Assuming you’re using the new producer (org.apache.kafka.clients.producer), I 
believe the property is called compression.type.

Double check it’s being passed correctly to the process. The producer logs the 
properties it uses if you set the logging level to info.
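
A minimal sketch of what setting that property looks like in code, assuming the
new Java producer. The class name, the broker address and the String
serializers are placeholders chosen for illustration, not something taken from
your setup. The block only builds and prints the settings, so it runs without
the Kafka client on the classpath.

```java
import java.util.Properties;

// Hypothetical helper class for illustration; it is not part of Kafka.
final class ProducerCompressionConfig {

    // Builds settings for the new Java producer (org.apache.kafka.clients.producer).
    static Properties build(String bootstrapServers, String codec) {
        Properties props = new Properties();
        props.setProperty("bootstrap.servers", bootstrapServers);
        props.setProperty("key.serializer",
                "org.apache.kafka.common.serialization.StringSerializer");
        props.setProperty("value.serializer",
                "org.apache.kafka.common.serialization.StringSerializer");
        // Key used by the new producer; the old Scala producer reads
        // compression.codec instead.
        props.setProperty("compression.type", codec);
        return props;
    }

    public static void main(String[] args) {
        // The broker address is an assumption for a local single-node setup.
        Properties props = build("localhost:9092", "gzip");
        // These are the properties one would hand to new KafkaProducer<>(props).
        System.out.println("compression.type=" + props.getProperty("compression.type"));
    }
}
```

With the Kafka client library available, the same Properties object would be
passed to the KafkaProducer constructor, and the producer's info-level log
should then list compression.type = gzip among its settings.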

B
> On 18 Mar 2016, at 16:10, R P <hadoo...@outlook.com> wrote:
>
> Hey Ben, Thanks again for your response.
>
> I checked the log files using DumpLogSegments --print-data-log, and the 
> compression codec shown is NoCompressionCodec (compresscodec: 
> NoCompressionCodec).
>
> I am guessing my configuration is not correct. I added the following line to 
> the Kafka producer.properties config file. I am using Kafka 0.8.2.
>
> compression.codec=gzip
> # compression.codec=1 (tried with old config value too)
>
> For this experiment I am sending data via kafka-console-producer.sh, and I 
> still don't see any compression being used.
>
> What am I missing?
>
> Thanks,
> R P
>
>
> On 3/18/16 8:24 AM, R P wrote:
>> Thanks for the response Ben.
>> I am wondering why the "du" command does not show a reduced size when 
>> compression is used.
>> I ran an experiment with and without compression enabled on a topic, sending 
>> the same amount of data in both cases. I used a single-node Kafka instance 
>> with a replication factor of 1 on Mac OS.
>> I didn't see any difference in the size of the data stored on disk. In both 
>> cases the log files on disk were the same size as the data sent to Kafka.
>> How do I verify that compression is being used and that the data stored on 
>> disk takes less space because of it?
>> Thanks,
>> R P
>>
>> _____________________________
>> From: Ben Stopford <b...@confluent.io>
>> Sent: Friday, March 18, 2016 7:50 AM
>> Subject: Re: Question regarding compression of topics in Kafka
>> To: <users@kafka.apache.org>
>>
>>
>> Yes, it will compress the data stored on the file system if you specify 
>> compression in the producer. You can check whether the data is compressed on 
>> disk by running the following command in the data directory:
>> kafka-run-class kafka.tools.DumpLogSegments --print-data-log --files latest-log-file.log
>>
>>> > On 17 Mar 2016, at 23:59, R P <hadoo...@outlook.com> wrote:
>>> >
>>> > Hello All,
>>> > Does Kafka support compressing the logs stored in the log dir?
>>> > What does compression.type=(gzip/snappy) in server.properties do?
>>> >
>>> > Based on the documentation, I am assuming that it will compress the logs
>>> > on the local file system.
>>> > I ran a quick experiment and found that my logs stored on local disk are
>>> > not getting compressed.
>>> > The size of the data stored on disk is the same with or without
>>> > compression.
>>> >
>>> > I am using the following configuration properties in the
>>> > server.properties config file.
>>> >
>>> > compression.type=gzip
>>> > compressed.topics="gzip-topic"
>>> >
>>> > Thanks for reading; I appreciate any responses.
>>> >
>>> > Thanks,
>>> > R P
>>
>>
>
