Sorry, I forgot to close the loop on my experiences with this....

We solved this issue by setting the 'allocsize' mount option in fstab, e.g. allocsize=16M.
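
For reference, a sketch of what that fstab entry might look like (the device,
mount point, and other options here are made up -- allocsize=16M is the only
relevant part):

  /dev/sdb1  /data/kafka-logs  xfs  defaults,noatime,allocsize=16M  0  0

The filesystem has to be remounted (or the machine rebooted) for the new
option to take effect.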

Jason


On Mon, Sep 9, 2013 at 1:47 PM, Jay Kreps <jay.kr...@gmail.com> wrote:

> This could certainly be done. It would be slightly involved since you would
> need to implement some kind of file-handle cache for both indexes and log
> files and re-open them on demand when a read occurs. If someone wants to
> take a shot at this, the first step would be to get a design wiki in place
> on how this would work. This is potentially nice to reduce the open file
> count (though open files are pretty cheap).
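>
> (To illustrate the shape of such a cache -- a minimal sketch only, with a
> hypothetical class name, not actual Kafka code -- one option would be an
> access-ordered LinkedHashMap that closes the least-recently-used handle and
> re-opens channels on demand:)
>
> import java.io.FileInputStream;
> import java.io.IOException;
> import java.nio.channels.FileChannel;
> import java.util.LinkedHashMap;
> import java.util.Map;
>
> public class FileHandleCache {
>   private final LinkedHashMap<String, FileChannel> open;
>
>   public FileHandleCache(final int maxOpen) {
>     // Access-ordered map: once more than maxOpen handles are held, the
>     // least-recently-used channel is closed and evicted.
>     this.open = new LinkedHashMap<String, FileChannel>(16, 0.75f, true) {
>       @Override
>       protected boolean removeEldestEntry(Map.Entry<String, FileChannel> e) {
>         if (size() > maxOpen) {
>           try { e.getValue().close(); } catch (IOException ignored) {}
>           return true;
>         }
>         return false;
>       }
>     };
>   }
>
>   // Hand back an open channel for a segment path, re-opening it on demand
>   // if it was previously evicted.
>   public synchronized FileChannel channelFor(String path) throws IOException {
>     FileChannel ch = open.get(path);
>     if (ch == null || !ch.isOpen()) {
>       ch = new FileInputStream(path).getChannel();
>       open.put(path, ch);
>     }
>     return ch;
>   }
> }
>
> (Kafka 0.7 itself is Scala, so a real patch would look different; this is
> only meant to show the evict-and-reopen idea.)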
>
> That said, this issue only impacts XFS, and it seems to be fixed by that
> setting Jonathan found. I wonder if you could give that a try and see if it
> works for you too? I feel dealing with closed files does add a lot of
> complexity, so if there is an easy fix I would probably rather avoid it.
>
> -Jay
>
>
> On Mon, Sep 9, 2013 at 8:17 AM, Mike Heffner <m...@librato.com> wrote:
>
> > We are also seeing this problem with version 0.7.1 and logs on an XFS
> > partition. At our largest scale we can frequently free over 600GB of disk
> > usage by simply restarting Kafka. We've examined the `lsof` output from
> > the Kafka process, and while it does appear to have FDs open for all log
> > files on disk (even those that haven't been read from in a long time), it
> > does not have any files open that were previously deleted from disk.
> >
> > The du output confirms that the on-disk size is much larger than the
> > apparent size:
> >
> > root@kafkanode-1:/raid0/kafka-logs/measures-0# du -h
> > 00000000242666442619.kafka
> > 1.1G 00000000242666442619.kafka
> > root@kafkanode-1:/raid0/kafka-logs/measures-0# du -h --apparent-size
> > 00000000242666442619.kafka
> > 513M 00000000242666442619.kafka
> >
> >
> > Our log size/retention policy is:
> >
> > log.file.size=536870912
> > log.retention.hours=96
> >
> > We tried dropping the caches per the Stack Overflow suggestion (sync;
> > echo 3 > /proc/sys/vm/drop_caches) but that didn't seem to clear up the
> > extra space. We haven't had the chance to try remounting with the
> > allocsize option.
> >
> > In summary, it would be great if Kafka closed FDs to log files that
> > haven't been read from for some period of time, if that addresses this
> > issue.
> >
> > Cheers,
> >
> > Mike
> >
> > On Fri, Jul 26, 2013 at 5:03 PM, Jay Kreps <jay.kr...@gmail.com> wrote:
> >
> > > Cool, good to know.
> > >
> > >
> > > On Fri, Jul 26, 2013 at 2:00 PM, Jason Rosenberg <j...@squareup.com>
> > wrote:
> > >
> > > > Jay,
> > > >
> > > > My only experience so far with this is using XFS.  It appears the XFS
> > > > behavior is evolving, and in fact, we see somewhat different behavior
> > > > from 2 of our CentOS kernel versions in use.  I've been trying to ask
> > > > questions about all this on the XFS.org mailing list, but so far have
> > > > not had much luck understanding how the xfs versioning correlates to
> > > > CentOS versions.
> > > >
> > > > Anyway, yes, I think it would definitely be worth trying the solution
> > > > you suggest, which would be to close the file on rotation and re-open
> > > > it read-only.  Or to close files after a few hours of not being
> > > > accessed.  If a patch for one of these approaches can be cobbled
> > > > together, I'd love to test it out on our staging environment.  I'd be
> > > > willing to experiment with such a patch myself, although I'm not 100%
> > > > sure of all the places to look (but might dive in).
> > > >
> > > > XFS appears to have the option of using dynamic, speculative
> > > > preallocation, in which case it progressively doubles the amount of
> > > > space reserved for a file as the file grows.  It does this for all
> > > > open files.  If the file is closed, it will then release the
> > > > preallocated space not in use.  It's not clear whether this release of
> > > > space happens immediately on close, or whether immediately re-opening
> > > > the file read-only will keep it from releasing the space (still trying
> > > > to gather more info on that).
> > > >
> > > > I haven't looked too much at the index files, but those too appear to
> > > > have this behavior (e.g. the preallocated size is always on the order
> > > > of double the actual size, until the app is restarted).
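> > > >
> > > > (One way to see this from the shell -- just a sketch, using the
> > > > segment file name from the du output quoted above in this thread --
> > > > is to compare the logical size against the blocks actually allocated:
> > > >
> > > > stat -c '%s bytes, %b blocks of %B bytes' 00000000242666442619.kafka
> > > >
> > > > If blocks * block size is much larger than the byte size, the
> > > > difference is the speculative preallocation.)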
> > > >
> > > > Jason
> > > >
> > > >
> > > > On Fri, Jul 26, 2013 at 12:46 PM, Jay Kreps <jay.kr...@gmail.com>
> > wrote:
> > > >
> > > > > Interesting.
> > > > >
> > > > > Yes, Kafka keeps all log files open indefinitely.  There is no
> > > > > inherent reason this needs to be the case, though; it would be
> > > > > possible to LRU out old file descriptors, close them if they are not
> > > > > accessed for a few hours, and then reopen them on the first access.
> > > > > We just haven't implemented anything like that.
> > > > >
> > > > > It would be good to understand this a little better.  Does xfs
> > > > > pre-allocate space for all open files?  Perhaps just closing the
> > > > > file on log roll and opening it read-only would solve the issue?
> > > > > Is this at all related to the use of sparse files for the indexes
> > > > > (i.e. RandomAccessFile.setLength(10MB) when we create the index)?
> > > > > Does this affect other filesystems or just xfs?
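> > > > >
> > > > > (For reference, a minimal sketch of the kind of call being referred
> > > > > to here -- the class and file name are made up, only the setLength()
> > > > > call is the point:)
> > > > >
> > > > > import java.io.RandomAccessFile;
> > > > >
> > > > > public class SparseIndexSketch {
> > > > >   public static void main(String[] args) throws Exception {
> > > > >     // Pre-size an index file via setLength(); on most filesystems
> > > > >     // this yields a sparse file with no blocks actually written.
> > > > >     RandomAccessFile raf = new RandomAccessFile("sketch.index", "rw");
> > > > >     raf.setLength(10 * 1024 * 1024);  // 10MB logical size
> > > > >     raf.close();
> > > > >   }
> > > > > }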
> > > > >
> > > > > -Jay
> > > > >
> > > > >
> > > > > On Fri, Jul 26, 2013 at 12:42 AM, Jason Rosenberg <
> j...@squareup.com>
> > > > > wrote:
> > > > >
> > > > > > It looks like xfs will reclaim the preallocated space for a file
> > > > > > after it is closed.
> > > > > >
> > > > > > Does kafka close a file after it has reached its max size and
> > > > > > started writing to the next log file in sequence?  Or does it keep
> > > > > > them all open until they are deleted, or the server quits (that's
> > > > > > what it seems like)?
> > > > > >
> > > > > > I could imagine that it might need to keep log files open, in
> > > > > > order to allow consumers access to them.  But does it keep them
> > > > > > open indefinitely, after there is no longer any data to be written
> > > > > > to them, and no consumers are currently attempting to read from
> > > > > > them?
> > > > > >
> > > > > > Jason
> > > > > >
> > > > > >
> > > > > > On Tue, Jul 16, 2013 at 4:32 PM, Jay Kreps <jay.kr...@gmail.com>
> > > > wrote:
> > > > > >
> > > > > > > Interesting.  Yes, it will respect whatever setting it is given
> > > > > > > for new segments created from that point on.
> > > > > > >
> > > > > > > -Jay
> > > > > > >
> > > > > > >
> > > > > > > On Tue, Jul 16, 2013 at 11:23 AM, Jason Rosenberg <
> > > j...@squareup.com>
> > > > > > > wrote:
> > > > > > >
> > > > > > > > Ok,
> > > > > > > >
> > > > > > > > An update on this.  It seems we are using XFS, which is
> > > > > > > > available in newer versions of CentOS.  It definitely does
> > > > > > > > pre-allocate space as a file grows, see:
> > > > > > > >
> > > > > > > > http://serverfault.com/questions/406069/why-are-my-xfs-filesystems-suddenly-consuming-more-space-and-full-of-sparse-file
> > > > > > > >
> > > > > > > > Apparently it's not hard-allocated space, and it would be
> > > > > > > > released under resource pressure....it seems we may need to
> > > > > > > > update how we monitor disk space usage, etc....
> > > > > > > >
> > > > > > > > But, it seems that the default log file size of 1.1Gb causes
> > > > > > > > it to jump to preallocate an extra Gb.  So, in theory, if I
> > > > > > > > set a strategic log file size to be just under the threshold
> > > > > > > > that forces it to exponentially double the size from 1Gb to
> > > > > > > > 2Gb, I should be able to mostly solve this issue.  E.g. use
> > > > > > > > 950Mb instead of 1.1Gb max log file size.
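> > > > > > > >
> > > > > > > > (As a sketch, using the log.file.size property quoted above
> > > > > > > > in this thread, 950Mb would be roughly:
> > > > > > > >
> > > > > > > > log.file.size=996147200
> > > > > > > >
> > > > > > > > i.e. 950 * 1024 * 1024 bytes.)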
> > > > > > > >
> > > > > > > > If I change the max log file size for a broker, and restart
> > > > > > > > it, will it respect the new size going forward?
> > > > > > > >
> > > > > > > > Jason
> > > > > > > >
> > > > > > > >
> > > > > > > > On Sun, Jul 14, 2013 at 9:44 AM, Jay Kreps <
> > jay.kr...@gmail.com>
> > > > > > wrote:
> > > > > > > >
> > > > > > > > > Hey Jason,
> > > > > > > > >
> > > > > > > > > As Jun says, we haven't seen that issue and no one
> > > > > > > > > else has reported it, but it sounds like a bug of some
> > > > > > > > > kind.
> > > > > > > > >
> > > > > > > > > In 0.7 we don't do any preallocation of anything.  The
> > > > > > > > > only time files shrink is during recovery--we
> > > > > > > > > re-checksum all messages that may not have been
> > > > > > > > > flushed, and if any invalid messages are found we
> > > > > > > > > truncate them off.  This would only occur after an
> > > > > > > > > unclean shutdown (kill -9), and you would get a bunch
> > > > > > > > > of errors in the log warning you this was happening.
> > > > > > > > >
> > > > > > > > > -Jay
> > > > > > > > >
> > > > > > > > >
> > > > > > > > >
> > > > > > > > > On Sun, Jul 14, 2013 at 1:36 AM, Jason Rosenberg <
> > > > j...@squareup.com
> > > > > >
> > > > > > > > wrote:
> > > > > > > > >
> > > > > > > > > > An update on this.  It appears that the phenomenon
> > > > > > > > > > I'm seeing is that disk space is freed on restart,
> > > > > > > > > > but it's not due to files getting deleted on
> > > > > > > > > > restart; instead, files are getting truncated on
> > > > > > > > > > restart.  It appears that log files get
> > > > > > > > > > pre-allocated to a larger size than is used right
> > > > > > > > > > away.  Upon restart, they get truncated to the size
> > > > > > > > > > of the file that actually contains data.  Does this
> > > > > > > > > > make sense?
> > > > > > > > > >
> > > > > > > > > > Before restart, I see a large number of log files
> > > > > > > > > > of size 2.1Gb.  Upon restart, their disk usage
> > > > > > > > > > drops to almost half that, on average.
> > > > > > > > > >
> > > > > > > > > > Thoughts?
> > > > > > > > > >
> > > > > > > > > > Jason
> > > > > > > > > >
> > > > > > > > > >
> > > > > > > > > > On Thu, May 23, 2013 at 8:55 PM, Jun Rao <
> jun...@gmail.com
> > >
> > > > > wrote:
> > > > > > > > > >
> > > > > > > > > > > I haven't seen this issue before.  We do have
> > > > > > > > > > > ~1K topics in one of the Kafka clusters at
> > > > > > > > > > > LinkedIn.
> > > > > > > > > > >
> > > > > > > > > > > Thanks,
> > > > > > > > > > >
> > > > > > > > > > > Jun
> > > > > > > > > > >
> > > > > > > > > > >
> > > > > > > > > > > On Thu, May 23, 2013 at 11:05 AM, Jason Rosenberg <
> > > > > > > j...@squareup.com>
> > > > > > > > > > > wrote:
> > > > > > > > > > >
> > > > > > > > > > > > Yeah, that's what it looks like to me (looking
> > > > > > > > > > > > at the code).  So, I'm guessing it's some
> > > > > > > > > > > > OS-level caching or resource recycling.  Have
> > > > > > > > > > > > you ever heard of this happening?  One thing
> > > > > > > > > > > > that might be different in my usage from the
> > > > > > > > > > > > norm is a relatively large number of topics
> > > > > > > > > > > > (e.g. ~2K topics).
> > > > > > > > > > > >
> > > > > > > > > > > > Jason
> > > > > > > > > > > >
> > > > > > > > > > > >
> > > > > > > > > > > > On Thu, May 23, 2013 at 7:14 AM, Jun Rao <
> > > jun...@gmail.com
> > > > >
> > > > > > > wrote:
> > > > > > > > > > > >
> > > > > > > > > > > > > Jason,
> > > > > > > > > > > > >
> > > > > > > > > > > > > Kafka closes the handles of all deleted
> > > > > > > > > > > > > files.  Otherwise, the broker will run out of
> > > > > > > > > > > > > file handles quickly.
> > > > > > > > > > > > >
> > > > > > > > > > > > > Thanks,
> > > > > > > > > > > > >
> > > > > > > > > > > > > Jun
> > > > > > > > > > > > >
> > > > > > > > > > > > >
> > > > > > > > > > > > > On Wed, May 22, 2013 at 10:17 PM, Jason Rosenberg <
> > > > > > > > > j...@squareup.com>
> > > > > > > > > > > > > wrote:
> > > > > > > > > > > > >
> > > > > > > > > > > > > > So, does this indicate Kafka (or the JVM
> > > > > > > > > > > > > > itself) is not aggressively closing file
> > > > > > > > > > > > > > handles of deleted files?  Is there a fix
> > > > > > > > > > > > > > for this?  Or is there not likely anything
> > > > > > > > > > > > > > to be done?  What happens if the disk fills
> > > > > > > > > > > > > > up with file handles for phantom deleted
> > > > > > > > > > > > > > files?
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > Jason
> > > > > > > > > > > > > >
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > On Wed, May 22, 2013 at 9:50 PM, Jonathan Creasy
> <
> > > > > > j...@box.com
> > > > > > > >
> > > > > > > > > > wrote:
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > > It isn't uncommon: if a process has an
> > > > > > > > > > > > > > > open file handle on a file that is
> > > > > > > > > > > > > > > deleted, the space is not freed until the
> > > > > > > > > > > > > > > handle is closed.  So restarting the
> > > > > > > > > > > > > > > process that has a handle on the file
> > > > > > > > > > > > > > > would cause the space to be freed also.
> > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > You can troubleshoot that with lsof.
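> > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > (As a concrete sketch on Linux: something
> > > > > > > > > > > > > > > like "lsof -p <broker pid> | grep deleted"
> > > > > > > > > > > > > > > lists any files the process still holds
> > > > > > > > > > > > > > > open that have already been unlinked.)
> > > > > > > > > > > > > > >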
> > > > > > > > > > > > > > > Normally, I see 2-4 log segments deleted
> > > > > > > > > > > > > > > every hour in my brokers.  I see log lines
> > > > > > > > > > > > > > > like this:
> > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > 2013-05-23 04:40:06,857  INFO [kafka-logcleaner-0] log.LogManager - Deleting log segment 00000000035434043157.kafka from <redacted topic>
> > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > However, it seems like if I restart the
> > > > > > > > > > > > > > > broker, a massive amount of disk space is
> > > > > > > > > > > > > > > freed (without a corresponding flood of
> > > > > > > > > > > > > > > these log segment deleted messages).  Is
> > > > > > > > > > > > > > > there an explanation for this?  Does kafka
> > > > > > > > > > > > > > > keep references to file segments around,
> > > > > > > > > > > > > > > and reuse them as needed or something?
> > > > > > > > > > > > > > > And then on restart, the references to
> > > > > > > > > > > > > > > those freed segment files are dropped?
> > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > Thoughts?
> > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > This is with 0.7.2.
> > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > Jason
> > > > > > > > > > > > > > >
> > > > > > > > > > > > > >
> > > > > > > > > > > > >
> > > > > > > > > > > >
> > > > > > > > > > >
> > > > > > > > > >
> > > > > > > > >
> > > > > > > >
> > > > > > >
> > > > > >
> > > > >
> > > >
> > >
> >
> >
> >
> > --
> >
> >   Mike Heffner <m...@librato.com>
> >   Librato, Inc.
> >
>
