On Fri, Jan 14, 2000 at 02:21:47PM -0600, Charles Cazabon wrote:
> Russell Nelson <[EMAIL PROTECTED]> wrote:
> >
> > Right, and any scalable email system is going to use NFS. Therefore
> > the question in my mind is not "What should be used for large folders
> > instead of Maildirs?" but instead "What must be done to make Maildirs
> > more efficient"? One way to do that would be for Dan to change the
> > Maildir specification so that a Maildir may have multiple "cur"
> > directories. Then, keep a CDB containing a subset of the message
> > headers.
>
> Doesn't the CDB file then require some trickery to avoid the necessity of
> locks for multiple writers? Locks for the CDB would defeat the main benefit
> of Maildirs. Or perhaps I misunderstand.
cdbs are always updated atomically. A writer can open the cdb, which gives it
a safe handle on the file (even if the cdb is replaced in the meantime, it will
still be reading the old copy), read it in, build a new cdb in a tmpfile and
rename() it over the real one. The only risk here is that with highly concurrent
updates, one of two simultaneous updates will simply be lost. This is from the
delivery-agent perspective.
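
A minimal sketch of that read/rebuild/rename() cycle, in Python. It is an
illustration under assumptions, not real cdb code: a JSON file stands in for
the cdb, and the function name and index layout are made up for the example.

```python
import json
import os
import tempfile


def update_index_atomically(index_path, new_entries):
    """Read the current index, merge new entries, swap in a rebuilt copy."""
    # Read the existing index. A reader that already has the old file open
    # keeps seeing the old contents after the rename below.
    try:
        with open(index_path, "r", encoding="utf-8") as f:
            index = json.load(f)
    except FileNotFoundError:
        index = {}
    index.update(new_entries)

    # Build the replacement in the same directory as the real file, so the
    # final rename does not cross a filesystem boundary.
    directory = os.path.dirname(os.path.abspath(index_path))
    fd, tmp_path = tempfile.mkstemp(dir=directory, prefix=".index-")
    try:
        with os.fdopen(fd, "w", encoding="utf-8") as f:
            json.dump(index, f)
            f.flush()
            os.fsync(f.fileno())
        os.replace(tmp_path, index_path)  # rename() over the real file
    except BaseException:
        os.unlink(tmp_path)  # do not leave a stray tmpfile behind
        raise
    return index
```

Nothing here is locked. If two writers read the same old copy, whichever
renames last wins and the other writer's additions are gone, which is the
lost-update case mentioned above.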
From a user agent, the story is much easier: read the directory listings,
check whether all files are in the cdb already, and add any that are not.
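
The user-agent side could look roughly like this in Python. Again a sketch
under assumptions: the index is a plain dict rather than a cdb, the function
name and the choice of stored headers are hypothetical, and entries are keyed
on the part of the filename before ":" so that a flag change does not make a
message look new.

```python
import os
from email.parser import BytesHeaderParser


def add_missing_messages(maildir, index):
    """Add an index entry for every message file the index does not know yet."""
    added = []
    for sub in ("new", "cur"):
        directory = os.path.join(maildir, sub)
        for name in sorted(os.listdir(directory)):
            key = name.split(":", 1)[0]  # unique part, without the flags
            if key in index:
                continue  # already indexed, the file is not opened
            with open(os.path.join(directory, name), "rb") as f:
                headers = BytesHeaderParser().parse(f)
            index[key] = {
                "subject": str(headers.get("Subject", "")),
                "from": str(headers.get("From", "")),
            }
            added.append(key)
    return added
```

Only messages missing from the index are opened, so a folder that is already
fully indexed costs the directory listings and nothing more.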
I have actually been considering such a feature for mutt, since opening a
2500-message Maildir over NFS does take some time with the Linux 2.0 client
NFS implementation ('request', 'ack', 'request', etc., no parallelism) over a
25km fibre-optic ethernet link to a NetApp. cdb updates are atomic, and in this
case the updating process actually checks reality [as opposed to reading the
cdb and applying the known-made changes] when updating, so the cdb
will be a performance improvement, but no PITA. The only glitch I can see is
someone actually editing files in a Maildir and the cdb not catching up.
Doing a check on headers when a message is actually opened should fix most
of this; storing a datestamp in the cdb might also help.
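
One way the datestamp idea might be sketched in Python: store the file's
modification time next to the cached headers and compare it when the message
is opened. The entry layout and both function names are assumptions made for
this example.

```python
import os


def make_entry(message_path, subject):
    """Build an index entry that remembers the file's modification time."""
    st = os.stat(message_path)
    return {"subject": subject, "mtime_ns": st.st_mtime_ns}


def entry_is_stale(message_path, entry):
    """Return True if the file was modified after its entry was written."""
    return os.stat(message_path).st_mtime_ns != entry["mtime_ns"]
```

A user agent would call entry_is_stale() when a message is opened and re-read
the headers only if it returns True.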
Hmm.. I'm discussing user-agent cdb features now... lemme think about this over
the weekend :)
Note that I don't really see the benefit in multiple cur directories, apart
from the performance advantages on sub-optimal [most] filesystems, which is
the same reason the queue directories are split up.
Greetz, Peter.
--
Peter van Dijk - student/sysadmin/ircoper/madly in love/pretending coder
|
| 'C makes it easy to shoot yourself in the foot;
| C++ makes it harder, but when you do it blows your whole leg off.'
| Bjarne Stroustrup, Inventor of C++