Re: Brokers changing mtime on data files during startup?

Ismael Juma Tue, 07 Jun 2016 01:27:06 -0700

Also, it would be good to have a JIRA as it seems to be a 0.9 regression.

Ismael
On 6 Jun 2016 20:03, "Dustin Cote" <dus...@confluent.io> wrote:


> For those that have seen this issue on 0.9, can you provide some more
> insight into your environments?  What OS and filesystem are you running?
> Do you find that you can reproduce the behavior with a simple Java program
> that creates a file, writes to it, waits for a few minutes, then closes the
> file?  The code for closing the log segments on shutdown should be doing
> nothing more than closing the file, so it would be good to see if we can
> flesh out the environmental details a bit.  I have not been able to
> reproduce this issue on multiple OSes using ext4 and XFS.
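
A minimal sketch of the test program described above, for anyone who wants to try it on their own OS and filesystem. It is not Kafka code: the class name MtimeOnClose, the file name, and the default wait of 180 seconds are made up for illustration, and using a FileChannel is an assumption about how closely this mimics the broker's segment handling. Pass a directory on the same filesystem as your Kafka data as the first argument.

```java
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.nio.file.StandardOpenOption;

public class MtimeOnClose {

    // Writes to the file, waits, closes it, and returns
    // {mtime right after the write, mtime after the close} in epoch millis.
    static long[] mtimeAroundClose(Path file, long waitMillis)
            throws IOException, InterruptedException {
        long afterWrite;
        try (FileChannel channel = FileChannel.open(file,
                StandardOpenOption.CREATE,
                StandardOpenOption.WRITE,
                StandardOpenOption.TRUNCATE_EXISTING)) {
            channel.write(ByteBuffer.wrap(
                    "some log data\n".getBytes(StandardCharsets.UTF_8)));
            channel.force(true); // flush data and metadata before reading mtime
            afterWrite = Files.getLastModifiedTime(file).toMillis();
            Thread.sleep(waitMillis); // keep the file open while time passes
        } // the channel is closed here
        long afterClose = Files.getLastModifiedTime(file).toMillis();
        return new long[] {afterWrite, afterClose};
    }

    public static void main(String[] args) throws Exception {
        // args[0]: directory to test in (defaults to the current directory)
        // args[1]: seconds to wait before closing (defaults to 180)
        Path dir = Paths.get(args.length > 0 ? args[0] : ".");
        long waitSeconds = args.length > 1 ? Long.parseLong(args[1]) : 180;
        Path file = dir.resolve("mtime-on-close-test.log");

        long[] mtimes = mtimeAroundClose(file, waitSeconds * 1000);
        System.out.println("mtime after write: " + mtimes[0]);
        System.out.println("mtime after close: " + mtimes[1]);
        System.out.println(mtimes[0] == mtimes[1]
                ? "close did not change mtime"
                : "close CHANGED mtime");
        Files.delete(file);
    }
}
```

If the two values differ on your system, that would point at the environment rather than at anything Kafka does beyond closing the file.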
>
> On Thu, May 26, 2016 at 4:58 AM, Moritz Siuts <m.si...@emetriq.com> wrote:
>
> > Hi!
> >
> > We noticed the same here with 0.9.0.1.
> >
> > To work around the issue, a better way than setting a very low retention.ms
> > is to set retention.bytes at the topic level, like this:
> >
> > ./bin/kafka-topics.sh --zookeeper X.X.X.X:2181/kafka --alter --config retention.bytes=5000000 --topic my_topic
> >
> > This setting controls the maximum size in bytes of a partition of the
> > specified topic, so you can find a good value by checking the size of a
> > partition with du -b and using that.
> >
> > This way you do not lose ~7 days of data and can be sure that your disks
> > will not fill up.
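
For hosts where du -b is not available, the same number can be approximated from Java. This is only a sketch: the class name PartitionSize is made up, the partition directory path is whatever your log.dirs layout uses, and the result counts only the regular files directly inside the directory, so it can differ slightly from what du reports.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.stream.Stream;

public class PartitionSize {

    // Sums the sizes in bytes of the regular files directly inside one
    // partition directory (segment and index files).
    static long sizeInBytes(Path partitionDir) throws IOException {
        try (Stream<Path> files = Files.list(partitionDir)) {
            return files
                    .filter(Files::isRegularFile)
                    .mapToLong(p -> {
                        try {
                            return Files.size(p);
                        } catch (IOException e) {
                            throw new UncheckedIOException(e);
                        }
                    })
                    .sum();
        }
    }

    public static void main(String[] args) throws IOException {
        // args[0]: path to a partition directory, e.g. <log dir>/my_topic-0
        System.out.println(sizeInBytes(Paths.get(args[0])));
    }
}
```

The printed value is a starting point for retention.bytes; pick it from a partition that currently holds about as much data as you want to keep.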
> >
> > Maybe I should add a comment in
> > https://issues.apache.org/jira/browse/KAFKA-1379
> >
> > Bye
> >  Moritz
> >
> >
> >
> >
> >
> > On 25.05.16, 18:34, "Andrew Otto" <o...@wikimedia.org> wrote:
> >
> > >Hiya,
> > >
> > >We’ve recently upgraded to 0.9.  In 0.8, when we restarted a broker, data
> > >log file mtimes were not changed.  In 0.9, any data log file that was on
> > >disk before the restart has its mtime set to the time of the broker
> > >restart.
> > >
> > >This causes problems with log retention, as all the files then look to
> > >Kafka like they contain recent data.  We use the default log retention of
> > >7 days, but if all the files are touched at the same time, this can cause
> > >us to retain up to 2 weeks of log data, which can fill up our disks.
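
To make the arithmetic above concrete, here is a small sketch of mtime-based retention. It is an illustration, not the broker's code: the class MtimeRetention and the method isExpired are made-up names, and the rule "delete a segment once its file mtime is older than the retention period" is an assumption about how time-based retention decides in this version. The retention period is the 7-day default (log.retention.hours=168).

```java
import java.time.Duration;

public class MtimeRetention {

    // Assumed rule: a segment is eligible for deletion once its file mtime
    // is older than the retention period.
    static boolean isExpired(long segmentMtimeMs, long nowMs, long retentionMs) {
        return nowMs - segmentMtimeMs > retentionMs;
    }

    public static void main(String[] args) {
        long retentionMs = Duration.ofDays(7).toMillis();
        long restartMs = System.currentTimeMillis();

        // The segment was last written 6 days before the restart.
        long originalMtimeMs = restartMs - Duration.ofDays(6).toMillis();
        // The restart resets the segment's mtime to the restart time.
        long touchedMtimeMs = restartMs;

        // Two days after the restart the data is 8 days old.
        long checkMs = restartMs + Duration.ofDays(2).toMillis();

        System.out.println("with original mtime: expired = "
                + isExpired(originalMtimeMs, checkMs, retentionMs));
        System.out.println("with touched mtime:  expired = "
                + isExpired(touchedMtimeMs, checkMs, retentionMs));
    }
}
```

Under that assumption the touched segment is only deleted 7 days after the restart, by which time its data is 13 days old, which matches the "up to 2 weeks" figure.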
> > >
> > >We saw this during our initial upgrade, but I had thought it had
> > >something to do with the change of inter.broker.protocol.version, and
> > >assumed it wouldn’t happen again.  We just did our first broker restart
> > >after the upgrade, and we are seeing this again.  We worked around this
> > >during our upgrade by temporarily setting a high-volume topic’s retention
> > >very low, causing brokers to purge more recent data.  This allowed us to
> > >avoid filling up our disks, but we shouldn’t have to do this every time we
> > >bounce brokers.
> > >
> > >Has anyone else noticed this?
> > >
> > >-Ao
> >
> >
>
>
>
> --
> Dustin Cote
> confluent.io
>
