Hi!

We noticed the same here with 0.9.0.1.

A better way to work around the issue than setting a very low retention.ms is 
to set retention.bytes at the topic level, like this:

./bin/kafka-topics.sh --zookeeper X.X.X.X:2181/kafka --alter --config \
    retention.bytes=5000000 --topic my_topic

This setting controls the maximum size in bytes of a partition of the 
specified topic. You can find a good value by checking the current size of a 
partition with du -b and using that.

This way you do not lose ~7 days of data and can be sure that your disks will 
not fill up.
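
The two steps above (measure a partition, then cap the topic) could be sketched 
roughly like this. It assumes GNU du (for -b) and the usual 
<log dir>/<topic>-<partition> directory layout. The function names and the 
example log path are made up for illustration, and the second function only 
prints the command so it can be reviewed before running it.

```shell
#!/bin/sh

# Print the total apparent size, in bytes, of all files in one partition
# directory (for example /var/spool/kafka/my_topic-0, a hypothetical path).
# Assumes GNU du; -b reports apparent size in bytes, -c appends a total line.
partition_bytes() {
    du -cb "$1"/* | tail -n 1 | cut -f1
}

# Print (do not run) the kafka-topics.sh command that sets retention.bytes.
# Arguments: ZooKeeper connect string, topic name, size in bytes.
retention_command() {
    printf './bin/kafka-topics.sh --zookeeper %s --alter --topic %s --config retention.bytes=%s\n' \
        "$1" "$2" "$3"
}
```

For example, retention_command X.X.X.X:2181/kafka my_topic "$(partition_bytes 
/var/spool/kafka/my_topic-0)" prints the command with the measured size filled 
in. Since retention.bytes applies per partition, it is probably safest to 
measure the largest partition of the topic.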

Maybe I should add a comment in https://issues.apache.org/jira/browse/KAFKA-1379

Bye
 Moritz





On 25.05.16, 18:34, "Andrew Otto" <o...@wikimedia.org> wrote:

>Hiya,
>
>We’ve recently upgraded to 0.9.  In 0.8, when we restarted a broker, data
>log file mtimes were not changed.  In 0.9, any data log file that was on
>disk before the broker restart has its mtime modified to the time of the
>restart.
>
>This causes problems with log retention, as all the files then look like
>they contain recent data to Kafka.  We use the default log retention of 7
>days, but if all the files are touched at the same time, this can cause us
>to retain up to 2 weeks of log data, which can fill up our disks.
>
>We saw this during our initial upgrade, but I had just thought it had
>something to do with the change of inter.broker.protocol.version, and
>assumed it wouldn’t happen again.  We just did our first broker restart
>after the upgrade, and we are seeing this again.  We worked around this
>during our upgrade by temporarily setting a high volume topic’s retention
>very low, causing brokers to purge more recent data.  This allowed us to
>avoid filling up our disks, but we shouldn’t have to do this every time we
>bounce brokers.
>
>Has anyone else noticed this?
>
>-Ao

emetriq GmbH
Steindamm 80
20099 Hamburg

