Review of email folder formats (was Re: How to save message in non-maildir format?)

Bennett Todd Thu, 22 Jul 1999 07:55:02 -0700
1999-07-19-21:48:54 Chris Gushue:
> Is the maildir format the one where each message is a separate file? Or is
> that the MH format...

MH and Maildir are both formats that store each message in a separate file.

MH is used by the MH suite of tools; it seems to me that the nmh suite has
more or less replaced it and is the one getting maintenance these days, but
I'm really not sure; I've never really gotten into MH. In any case, it's a much
older format, and due to the way the files are named, it absolutely requires
effective locking if multiple processes (e.g. procmails doing local mail
delivery and mail user agents responding to users' requests to move or delete
messages) are going to safely be writing the folder at the same time.

Maildir is the folder format invented by djb, as part of the qmail project;
native support is available in an increasing number of mail-handling tools
these days, including Postfix, mutt, and Real Soon Now procmail; patches are
available to add support for Maildir to a great many more tools, including
current production versions of procmail.

MH format stores messages in a simple directory; the messages are in files
named "1", "2", "3", etc. That alone is enough to guarantee that locking is
required; else two simultaneous writers could both come up with the same
filename to use for a new file. And a reader has no way to know when a writer
has finished adding a file. There's also a ".mh_sequences" file in there
somewhere; I don't know for sure what role it plays. I just played a bit
with an MH folder using mutt, and the result of deleting a message seems to be
to rename it ",#", i.e. prepending a comma to the filename.
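
To make the naming collision concrete, here is a minimal Python sketch of how
a naive writer might pick the next MH filename. It is only an illustration of
the race described above, not how the MH or nmh tools actually work; the
function names next_mh_name and unsafe_deliver are made up for this example.

```python
import os

def next_mh_name(folder):
    """Return the next free numeric filename, as a naive MH writer might."""
    # Only purely numeric names count as messages; ".mh_sequences" and
    # comma-prefixed (deleted) files are skipped.
    numbers = [int(n) for n in os.listdir(folder) if n.isdecimal()]
    return str(max(numbers, default=0) + 1)

def unsafe_deliver(folder, message_bytes):
    """Write a message under the next number, with NO locking (hypothetical)."""
    name = next_mh_name(folder)
    # Race window: a second writer calling next_mh_name() before this file
    # exists gets the same name, and one message overwrites the other.
    with open(os.path.join(folder, name), "wb") as f:
        f.write(message_bytes)
    return name
```

Two writers that both call next_mh_name() before either has created its file
get the same answer, which is why the whole pick-a-name-and-write step has to
happen under a lock.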

Maildir folders are way more complex to look at, but as long as you don't look
at 'em they are way simpler to use. A Maildir is a directory structure, with 3
subdirectories under it. So where an MH folder named "foo" might look like

        foo/.mh_sequences
        foo/1
        foo/2
        foo/3
        ...

the Maildir folder with similar contents might look like

        foo/tmp/
        foo/new/
        foo/cur/926561644.2332_17.ritz.mordor.net:2,S
        foo/cur/927128635.5810_129.ritz.mordor.net:2,S
        foo/cur/927129543.5810_132.ritz.mordor.net:2,S
        ...

When writing a Maildir, the message is first written to a file in tmp/, then
flushed and closed, and only then renamed into new/. Mail User Agents pick up
new traffic from new/ and rename it into cur/, appending a ":" and some
characters that indicate the status (e.g. replied-to). The files are named by
building up a string with the time, the pid of the writing process, a
sequence number used when the same process writes multiple files, and the
hostname; such a filename is universally unique. This design makes Maildir
safe with no locking required at all, even over NFS.
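
The delivery steps above can be sketched in Python roughly as follows. This is
a minimal sketch under assumptions, not qmail's actual code: the helper names
(unique_name, deliver, mark_seen) are invented for illustration, the filename
layout imitates the listing above, and the ":2,S" suffix is simply copied from
that listing.

```python
import os
import socket
import time

_counter = 0  # per-process sequence number (assumed scheme)

def unique_name():
    """Build a name from time, pid, sequence number, and hostname."""
    global _counter
    _counter += 1
    return "%d.%d_%d.%s" % (
        time.time(), os.getpid(), _counter, socket.gethostname())

def deliver(maildir, message_bytes):
    """Write to tmp/, flush and close, and only then rename into new/."""
    name = unique_name()
    tmp_path = os.path.join(maildir, "tmp", name)
    new_path = os.path.join(maildir, "new", name)
    # Mode "xb" fails if the file already exists instead of overwriting it.
    with open(tmp_path, "xb") as f:
        f.write(message_bytes)
        f.flush()
        os.fsync(f.fileno())
    # Readers only look in new/ and cur/, so they never see a partial file.
    os.rename(tmp_path, new_path)
    return new_path

def mark_seen(maildir, name):
    """What a mail user agent might do: move new/NAME to cur/NAME:2,S."""
    cur_path = os.path.join(maildir, "cur", name + ":2,S")
    os.rename(os.path.join(maildir, "new", name), cur_path)
    return cur_path
```

Because each writer works on its own uniquely named file and the rename is the
only step a reader can observe, no step here needs a lock.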

One caveat is that Maildir is a non-starter on OSes with strict limitations on
filenames, e.g. classic CP/M with its 8+3 character filenames. So the key is
to avoid using original versions of CP/M, or OSes just like 'em.

-Bennett