Victor Duchovni wrote:
On Fri, Nov 07, 2008 at 03:42:49PM -0500, Ofer Inbar wrote:
We track the number of messages in each postfix queue on our
mailservers using a program I've written. For most queues,
it simply does a readdir() and counts all files whose names don't
begin with ".", which is quick and efficient. For the deferred queue,
it does a find-style walk through the directory tree. Unfortunately,
on our fallback relays the deferred queues can get very large, and
walking the tree becomes a long and expensive endeavor.
By far the largest amount of time and I/O is spent calling stat() on
hundreds of thousands of queue files to determine that they're simple
files rather than directories to descend into. I could avoid that two
ways:
When a directory's link count is 2 it has no sub-directories, and you
don't have to lstat() the files. This is how the find(1) leaf node
optimization works on Linux (you can suppress it with -noleaf).
When a directory's link count is > 2, it has "n-2" sub-directories,
and you can stop calling lstat() after finding that many sub-directories.
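As a rough illustration of the link-count rule above, here is a small shell sketch. It assumes GNU stat ("stat -c %h") and a filesystem that keeps traditional Unix directory link counts; the function name subdir_count_from_links is made up for this example.

```shell
#!/bin/bash
# Sketch only: derive the number of sub-directories from the link count.
# Assumes GNU stat and traditional Unix link-count semantics.
subdir_count_from_links() {
    _links=$(stat -c %h -- "$1") || return 1
    if [ "$_links" -lt 2 ]; then
        # Some filesystems report a link count of 1 for every directory;
        # the shortcut cannot be used there.
        echo unknown
    else
        # "." and the entry in the parent account for two links;
        # each sub-directory's ".." adds one more.
        echo $((_links - 2))
    fi
}
```

A tree walker could call this once per directory and stop calling lstat() on entries as soon as it has found that many sub-directories (or skip lstat() entirely when the answer is 0).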
1. If I could tell simply from the filename whether it's a queue file
or a subdirectory: for example, if the name is 3 characters or
longer it's a queue file, shorter than 3 it's a directory.
2. If the tree structure were static enough that I could make
assumptions about it: for example, every file in the top level is a
subdirectory, every file in a subdirectory is a queue file.
As Wietse said, with the current Postfix, hash directories have
single-letter names, so you can also:
cd /var/spool/postfix; find active ! -name '?' | wc -l
This gives a number one higher than the true result, because find also
prints the starting directory "active" itself. With custom code you can
do a bit better than find by not calling stat on the hash directories
either: if you see a single-letter (hex digit, 0-9A-F) filename, assume
it is a directory. Since you'll then be reading that directory anyway,
the lstat() call has no real penalty; it just brings the inode in core
a bit earlier. So in practice find(1) is good enough, provided it has
"leaf" optimization support (for example gfind on systems whose native
find(1) is older and lacks it).
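To make the name-based idea concrete, here is a minimal shell sketch that classifies entries purely by name length. It assumes every single-character name is a hash directory and every longer name is a queue file; count_queue_files is a hypothetical name, not part of Postfix. A shell loop over hundreds of thousands of entries will be slow, so take it as an illustration of the logic rather than a replacement for a compiled tool.

```shell
#!/bin/bash
# Sketch only: count queue files without testing each entry's file type.
# Assumption: one-character names are hash directories, longer names are
# queue files. Names starting with "." are skipped by the glob.
count_queue_files() (
    # The body runs in a subshell, so the cd and variables stay private.
    cd "$1" || exit 1
    total=0
    for name in *; do
        # An unmatched glob is left as a literal "*": nothing to count.
        if [ "$name" = "*" ] && [ ! -e "$name" ]; then
            continue
        fi
        case $name in
            ?)  # Single character: descend without checking the type.
                sub=$(count_queue_files "$name") || continue
                total=$((total + sub)) ;;
            *)  # Anything longer is counted as a queue file.
                total=$((total + 1)) ;;
        esac
    done
    echo "$total"
)
```

For example, "count_queue_files /var/spool/postfix/deferred" would print the number of deferred queue files under those assumptions.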
I seldom have to deal with REALLY large queues, but I use this script,
which I hacked up real quick, to see if anything is starting to build up:
#!/bin/bash
echo ACTIVE
echo
find /var/spool/postfix/active/. ! -name . ! -name '?' -print |wc -l
echo
echo DEFERRED
echo
find /var/spool/postfix/deferred/. ! -name . ! -name '?' -print |wc -l
echo
echo BOUNCE
echo
find /var/spool/postfix/bounce/. ! -name . ! -name '?' -print |wc -l
echo
echo INCOMING
echo
find /var/spool/postfix/incoming/. ! -name . ! -name '?' -print |wc -l
echo
echo HOLD
echo
find /var/spool/postfix/hold/. ! -name . ! -name '?' -print |wc -l
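The same five checks could also be folded into a loop. This is only a sketch: QUEUE_ROOT, count_queue and report_queues are names invented here, and the output is one line per queue instead of the spaced-out layout above.

```shell
#!/bin/bash
# Sketch only: loop form of the per-queue counts.
# QUEUE_ROOT is an assumed variable; it defaults to the path used above.
QUEUE_ROOT=${QUEUE_ROOT:-/var/spool/postfix}

count_queue() {
    # Skip "." itself and the single-character hash directories.
    # tr strips the padding that some wc implementations add.
    find "$1/." ! -name . ! -name '?' -print | wc -l | tr -d ' '
}

report_queues() {
    for q in active deferred bounce incoming hold; do
        printf '%-9s %s\n' "$q" "$(count_queue "$QUEUE_ROOT/$q")"
    done
}
```

Calling report_queues at the end of the script prints the five counts; setting QUEUE_ROOT beforehand points it at a different spool directory.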
--
Gerald