Victor Duchovni:
> On Mon, Mar 21, 2011 at 07:15:44AM -0400, Wietse Venema wrote:
> 
> > > So my proposed update for the long queue id is:
> > > 
> > >     - 4 octets of base 32 encoded tv_usec
> > >     - 6+ octets of base 51 encoded tv_sec epoch time
> > >     - one non base 51 octet separator
> > >     - inode number in base 52.
> > 
> > Better: use reverse microseconds + reverse seconds (i.e.  LSB
> > first, base 52 encoded).  Then we can have 52 possible values per
> > character for queue hashing.  With two levels of hashing we get
> > 10x fewer files per directory compared to old queue file names.
> 
> I am not sure that reversing the "us" value is better. With the "us" in
> big-endian format, the output directory is "locally constant" in time,
> so when a burst of mail arrives it is not overly scattered in multiple
> directories, while overall, the deferred queue is still split evenly.

I thought of that. This is not a problem with today's default
configuration where the high-traffic queues (i.e. incoming, active)
are not hashed.

Do you expect that there will be configurations that do hash their
high-traffic queues?
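
To make the "locally constant" argument concrete, here is a minimal
sketch that counts how many hash directories a short burst of arrivals
would touch under each digit order. The helper names, the 4-digit field
width and the burst itself (200 messages, 5 microseconds apart) are
assumptions made for illustration only; none of this is Postfix code.

```python
def digits_msb_first(value, base, width):
    """Return `width` digits of `value` in `base`, most significant first."""
    out = []
    for _ in range(width):
        value, digit = divmod(value, base)
        out.append(digit)
    return out[::-1]


def hash_dir(usec, base, width, depth, little_endian):
    """Directory key: the first `depth` digits of the encoded microseconds."""
    digits = digits_msb_first(usec, base, width)
    if little_endian:
        digits = digits[::-1]  # LSB first, i.e. the "reversed" form
    return tuple(digits[:depth])


def dirs_touched(usecs, base, width, depth, little_endian):
    """Number of distinct directories used by a set of arrival times."""
    return len({hash_dir(u, base, width, depth, little_endian) for u in usecs})


# Hypothetical burst: 200 messages, 5 microseconds apart.
burst = [250_000 + 5 * i for i in range(200)]

big = dirs_touched(burst, base=32, width=4, depth=2, little_endian=False)
little = dirs_touched(burst, base=52, width=4, depth=2, little_endian=True)
print("big-endian base 32:", big, "directories")
print("little-endian base 52:", little, "directories")
```

With these assumed numbers the big-endian form keeps the burst in a
couple of adjacent directories, while the little-endian form puts every
message in a directory of its own.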

> Also, with a little-endian base 52 queue-id, and hash depth of "2",
> we have 2704 directories to search, while big-endian 32 x 32 takes us
> to 977 directories as compared to 245 directories for big-endian base 16.

Base 52 requires fewer levels of hashing than smaller bases, and
base 52 with depth 3 or more (140608 or more directories) seems to
make little sense. So that is a limitation in usability.
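
The directory counts quoted above can be checked with a few lines of
arithmetic. This is only a sketch: the function names are made up, and
the field widths (4 digits for base 32, 5 digits for base 16) are
assumptions chosen so that the field can hold any microsecond value
from 0 to 999999.

```python
def dirs_big_endian(base, width, depth, max_value=999_999):
    """Distinct leading `depth`-digit prefixes used by values 0..max_value."""
    return max_value // base ** (width - depth) + 1


def dirs_little_endian(base, depth):
    """With the low-order digits first, every digit combination occurs."""
    return base ** depth


print(dirs_little_endian(52, 2))    # little-endian base 52, depth 2
print(dirs_big_endian(32, 4, 2))    # big-endian base 32, depth 2
print(dirs_big_endian(16, 5, 2))    # big-endian base 16, depth 2
print(dirs_little_endian(52, 3))    # little-endian base 52, depth 3
```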

I could just forget about lexicographical hashing and simply hash
the hexadecimal representation of the microseconds (extracted from
the queue file name and converted from base 52).  With this there
would be no change in file distribution compared to Postfix 2.8.
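
A minimal sketch of that last idea: decode the microseconds from their
base 52 form, print them in hexadecimal, and use the leading hex digits
as directory names. The alphabet, the 4-character field width, the
most-significant-first digit order and the 5-digit hex format are all
assumptions made for the example; the function names are hypothetical
and this is not the Postfix implementation.

```python
# Assumed 52-character alphabet: digits plus upper- and lower-case consonants.
ALPHABET = "0123456789BCDFGHJKLMNPQRSTVWXYZbcdfghjklmnpqrstvwxyz"
BASE = len(ALPHABET)  # 52


def encode_usec(usec, width=4):
    """Encode microseconds as `width` base 52 characters, MSB first (assumed)."""
    chars = []
    for _ in range(width):
        usec, digit = divmod(usec, BASE)
        chars.append(ALPHABET[digit])
    return "".join(reversed(chars))


def decode_usec(field):
    """Convert a base 52 microseconds field back to an integer."""
    value = 0
    for ch in field:
        value = value * BASE + ALPHABET.index(ch)
    return value


def hash_path(usec_field, depth=2):
    """Hash on the hexadecimal form of the microseconds, not on the base 52 text."""
    hex_usec = "%05X" % decode_usec(usec_field)
    return "/".join(hex_usec[:depth])


field = encode_usec(123456)
print(field, "->", hash_path(field))  # 123456 is 1E240 in hex, so the path is 1/E
```

Because the directory is derived from the hex digits, the number of
directories at a given depth depends only on the microsecond value and
not on the base used in the file name.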

        Wietse
