Victor Duchovni wrote:
On Fri, Nov 07, 2008 at 03:42:49PM -0500, Ofer Inbar wrote:

We track the number of messages in each postfix queue on our
mailservers using a program I've written.  For most queues,
it simply does a readdir() and counts all files whose names don't
begin with ".", which is quick and efficient.  For the deferred queue,
it does a find-style walk through the directory tree.  Unfortunately,
on our fallback relays the deferred queues can get very large, and
walking the tree becomes a long and expensive endeavor.

By far the largest amount of time and I/O is spent calling stat() on
hundreds of thousands of queue files to determine that they're simple
files rather than directories to descend into.  I could avoid that two
ways:

When a directory's link count is 2 it has no sub-directories, and you
don't have to lstat() the files. This is how the find(1) leaf node
optimization works on Linux (you can suppress it with -noleaf).

When a directory's link count is > 2, it has "n-2" sub-directories,
and you can stop calling lstat() after finding that many sub-directories.
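
For illustration, here is a rough bash sketch of that link-count rule.
The function name count_files is made up for this example, and reading
the link count from the second field of "ls -ld" is just one portable
way to get it. It assumes a filesystem where a directory's link count
is 2 plus its number of sub-directories; where the count is below 2
(some filesystems report 1), it falls back to testing every entry.

```shell
#!/bin/bash
# count_files DIR: print the number of non-directory entries below DIR.
# Hypothetical helper; entries whose names begin with "." are skipped by the glob.
count_files() (
    shopt -s nullglob                 # an empty directory expands to nothing
    dir=$1
    total=0
    # Field 2 of "ls -ld" is the directory's link count.
    nlink=$(ls -ld "$dir" | awk '{print $2}')
    if [ "$nlink" -ge 2 ]; then
        remaining=$(( nlink - 2 ))    # sub-directories still to be found
    else
        remaining=-1                  # count not meaningful: test every entry
    fi
    for entry in "$dir"/*; do
        # Once remaining reaches 0, the file-type tests are skipped entirely.
        if [ "$remaining" -ne 0 ] && [ ! -L "$entry" ] && [ -d "$entry" ]; then
            remaining=$(( remaining - 1 ))
            total=$(( total + $(count_files "$entry") ))
        else
            total=$(( total + 1 ))
        fi
    done
    echo "$total"
)
```

Something like "count_files /var/spool/postfix/deferred" would then print
the count. A shell loop is slow next to find(1) or a C program, so treat
this as a description of the logic rather than a replacement for either.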

1. If I could tell simply from the filename whether it's a queue file
   or a subdirectory: for example, if the name is 3 characters or
   longer it's a queue file, shorter than 3 it's a directory.

2. If the tree structure were static enough that I could make
   assumptions about it: for example, every file in the top level is a
   subdirectory, every file in a subdirectory is a queue file.

As Wietse said, with the current Postfix, hash directories have
single-letter names, so you can also:

    cd /var/spool/postfix; find active ! -name '?' | wc -l

This gives a number one higher than the true result, because find also
prints the starting directory itself. With custom code, you can do a bit
better than "find" by not calling stat on the hash directories either:
if you see a single-character (hex digit 0-9A-F) filename, assume it is
a directory. Since you'll then be using the directory anyway, the
lstat() call has no real penalty; it just brings the inode into core a
bit earlier. So in practice, find(1) is good enough (if it has "leaf"
optimization support, as with, say, gfind on systems with older native
find(1) versions).
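
A minimal bash sketch of the "custom code" variant, which decides from
the name alone and never tests the file type. The function name
count_by_name is invented for this example, and it assumes the layout
described above, where any single-character 0-9A-F name is a hash
directory and everything else is a queue file.

```shell
#!/bin/bash
# count_by_name DIR: print the number of queue files below DIR, treating
# every single-character hex-digit name as a hash directory (an assumption
# about the queue layout, never verified with stat).
count_by_name() (
    shopt -s nullglob                 # an empty directory expands to nothing
    total=0
    for entry in "$1"/*; do
        name=${entry##*/}             # strip the leading path
        case $name in
            [0-9A-F])                 # assumed hash directory: descend
                total=$(( total + $(count_by_name "$entry") ))
                ;;
            *)                        # anything else is counted as a queue file
                total=$(( total + 1 ))
                ;;
        esac
    done
    echo "$total"
)
```

For example, "count_by_name /var/spool/postfix/deferred" would print the
number of deferred messages under that assumption.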


I seldom have to deal with REALLY large queues, but I use this script,
which I hacked up quickly, to see whether anything is starting to build up:

#!/bin/bash
# Print the number of queue files in each Postfix queue.
# Single-character names are the hash subdirectories, so "! -name '?'"
# leaves only the queue files; "! -name ." drops the starting point.
first=1
for queue in ACTIVE DEFERRED BOUNCE INCOMING HOLD; do
    [ "$first" ] || echo              # blank line between sections
    first=
    echo "$queue"
    echo
    dir=/var/spool/postfix/$(echo "$queue" | tr 'A-Z' 'a-z')
    find "$dir/." ! -name . ! -name '?' -print | wc -l
done

--
Gerald
