For those who have seen this issue on 0.9, can you provide some more insight into your environments? What OS and filesystem are you running? Do you find that you can reproduce the behavior with a simple Java program that creates a file, writes to it, waits for a few minutes, then closes the file? The code that closes the log segments on shutdown should be doing nothing more than closing the file, so it would be good to flesh out the environmental details a bit. I have not been able to reproduce this issue on multiple OSes using ext4 and XFS.
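Something like this minimal sketch is what I have in mind (the path and sleep duration are arbitrary placeholders); it prints the file's mtime right after the write and again after the close, so you can see whether the close alone touches the file:

import java.io.FileOutputStream;
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

public class MtimeTest {
    public static void main(String[] args) throws IOException, InterruptedException {
        Path path = Paths.get("/tmp/mtime-test.log");  // arbitrary test location
        FileOutputStream out = new FileOutputStream(path.toFile());

        // Write some data and flush it.
        out.write("some data".getBytes());
        out.flush();
        System.out.println("mtime after write: " + Files.getLastModifiedTime(path));

        // Wait a few minutes, then close the file and check the mtime again.
        // Closing a file should not update its mtime.
        Thread.sleep(5 * 60 * 1000);
        out.close();
        System.out.println("mtime after close: " + Files.getLastModifiedTime(path));
    }
}

If the second mtime is later than the first, that would point at the environment rather than the broker.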
On Thu, May 26, 2016 at 4:58 AM, Moritz Siuts <m.si...@emetriq.com> wrote:

> Hi!
>
> We noticed the same here with 0.9.0.1.
>
> A better way to work around the issue than setting a very low retention.ms
> is to set retention.bytes at the topic level, like this:
>
> ./bin/kafka-topics.sh --zookeeper X.X.X.X:2181/kafka --alter --config
> retention.bytes=5000000 --topic my_topic
>
> The setting controls the maximum size in bytes of a partition of the
> specified topic. You can find a good size by checking the size of a
> partition with du -b and using that value.
>
> This way you do not lose ~7 days of data and can be sure that your disks
> will not fill up.
>
> Maybe I should add a comment in
> https://issues.apache.org/jira/browse/KAFKA-1379
>
> Bye
> Moritz
>
> On 25.05.16, 18:34, "Andrew Otto" <o...@wikimedia.org> wrote:
>
> >Hiya,
> >
> >We've recently upgraded to 0.9. In 0.8, when we restarted a broker, data
> >log file mtimes were not changed. In 0.9, any data log file that was on
> >disk before the restart has its mtime modified to the time of the broker
> >restart.
> >
> >This causes problems with log retention, as all the files then look to
> >Kafka like they contain recent data. We use the default log retention of
> >7 days, but if all the files are touched at the same time, this can cause
> >us to retain up to 2 weeks of log data, which can fill up our disks.
> >
> >We saw this during our initial upgrade, but I had just thought it had
> >something to do with the change of inter.broker.protocol.version, and
> >assumed it wouldn't happen again. We just did our first broker restart
> >after the upgrade, and we are seeing this again. We worked around it
> >during the upgrade by temporarily setting a high-volume topic's retention
> >very low, causing the brokers to purge even recent data. This allowed us
> >to avoid filling up our disks, but we shouldn't have to do this every
> >time we bounce brokers.
> >
> >Has anyone else noticed this?
> >
> >-Ao

--
Dustin Cote
confluent.io