Also, it would be good to have a JIRA as it seems to be a 0.9 regression.

Ismael

On 6 Jun 2016 20:03, "Dustin Cote" <dus...@confluent.io> wrote:
> For those that have seen this issue on 0.9, can you provide some more
> insight into your environments? What OS and filesystem are you running?
> Do you find that you can reproduce the behavior with a simple Java program
> that creates a file, writes to it, waits for a few minutes, then closes the
> file? The code for closing the log segments on shutdown should be doing
> nothing more than closing the file, so it would be good to see if we can
> flesh out the environmental details a bit. I have not been able to
> reproduce this issue on multiple OSes using ext4 and XFS.
>
> On Thu, May 26, 2016 at 4:58 AM, Moritz Siuts <m.si...@emetriq.com> wrote:
>
> > Hi!
> >
> > We noticed the same here with 0.9.0.1.
> >
> > A better workaround than setting a very low retention.ms is to set
> > retention.bytes at the topic level, like this:
> >
> > ./bin/kafka-topics.sh --zookeeper X.X.X.X:2181/kafka --alter --config retention.bytes=5000000 --topic my_topic
> >
> > The setting controls the maximum size in bytes of a partition of the
> > specified topic. So you can find a good size by checking the size of a
> > partition with du -b and using this value.
> >
> > This way you do not lose ~7 days of data and can be sure that your disks
> > will not fill up.
> >
> > Maybe I should add a comment in
> > https://issues.apache.org/jira/browse/KAFKA-1379
> >
> > Bye
> > Moritz
> >
> > On 25.05.16, 18:34, "Andrew Otto" <o...@wikimedia.org> wrote:
> >
> > >Hiya,
> > >
> > >We’ve recently upgraded to 0.9. In 0.8, when we restarted a broker, data
> > >log file mtimes were not changed. In 0.9, any data log file that was on
> > >disk before the broker restart has its mtime set to the time of the
> > >restart.
> > >
> > >This causes problems with log retention, as all the files then look like
> > >they contain recent data to Kafka. We use the default log retention of 7
> > >days, but if all the files are touched at the same time, this can cause
> > >us to retain up to 2 weeks of log data, which can fill up our disks.
> > >
> > >We saw this during our initial upgrade, but I had just thought it had
> > >something to do with the change of inter.broker.protocol.version, and
> > >assumed it wouldn’t happen again. We just did our first broker restart
> > >after the upgrade, and we are seeing this again. We worked around this
> > >during our upgrade by temporarily setting a high-volume topic’s retention
> > >very low, causing brokers to purge more recent data. This allowed us to
> > >avoid filling up our disks, but we shouldn’t have to do this every time
> > >we bounce brokers.
> > >
> > >Has anyone else noticed this?
> > >
> > >-Ao
>
> --
> Dustin Cote
> confluent.io
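
For anyone who wants to try the check Dustin describes (create a file, write to it, wait a few minutes, close it), here is a minimal Java sketch. It is not Kafka's code and makes several assumptions: the class name MtimeOnCloseCheck and the method mtimeBeforeAndAfterClose are made up for illustration, the file is written through a FileChannel, and the default wait is three minutes. Pass a directory on the same filesystem as your Kafka data as the first argument, since the system temp directory may live on a different filesystem.

```java
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.nio.file.StandardOpenOption;

// Hypothetical helper for illustration only; not part of Kafka.
class MtimeOnCloseCheck {

    /**
     * Writes a record to the file, waits with the file still open, closes it,
     * and returns {mtime after the write, mtime after the close} in epoch millis.
     */
    static long[] mtimeBeforeAndAfterClose(Path file, long waitMillis)
            throws IOException, InterruptedException {
        long afterWrite;
        try (FileChannel channel = FileChannel.open(file,
                StandardOpenOption.CREATE, StandardOpenOption.WRITE)) {
            channel.write(ByteBuffer.wrap("test record\n".getBytes(StandardCharsets.UTF_8)));
            channel.force(true); // flush data and metadata to disk
            afterWrite = Files.getLastModifiedTime(file).toMillis();
            Thread.sleep(waitMillis); // idle period while the file stays open
        } // try-with-resources closes the channel here
        long afterClose = Files.getLastModifiedTime(file).toMillis();
        return new long[] {afterWrite, afterClose};
    }

    public static void main(String[] args) throws Exception {
        // args[0]: directory to test in (assumed default: system temp directory)
        // args[1]: wait in milliseconds (assumed default: 3 minutes)
        Path dir = Paths.get(args.length > 0 ? args[0] : System.getProperty("java.io.tmpdir"));
        long waitMillis = args.length > 1 ? Long.parseLong(args[1]) : 180_000L;

        Path file = Files.createTempFile(dir, "mtime-check", ".log");
        try {
            long[] mtimes = mtimeBeforeAndAfterClose(file, waitMillis);
            System.out.println("mtime after write: " + mtimes[0]);
            System.out.println("mtime after close: " + mtimes[1]);
            System.out.println(mtimes[0] == mtimes[1]
                    ? "close() did not change the mtime"
                    : "close() changed the mtime");
        } finally {
            Files.deleteIfExists(file);
        }
    }
}
```

If the two values differ in your environment, the OS and filesystem details would be useful to include in a reply or in the JIRA.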