On 14/05/25, Kurt Hackenberg (k...@panix.com) wrote:
> On Wed, May 14, 2025 at 02:01:11PM +0100, Rory Campbell-Lange wrote:
>
> > I believe that mutt uses mboxcl2 format for writing new mailboxes. I'd
> > be grateful to know if that is correct.
>
> I think that's right, with a small addition: since version 1.9.5 (April 2018),
> Mutt has also written Lines: along with Content-Length:.
>
> I haven't read Mutt's code that writes mbox; I've only looked at its output,
> with a little testing.
>
> For completeness: mboxcl2 means that Mutt adds the header Content-Length:
> and does *not* do ">From " escaping of message lines. And these days Mutt
> also throws in the header Lines:.
>
> (The value of Content-Length: is the number of bytes in the body of the
> message; the value of Lines: is the number of lines in the body of the
> message. Mutt gets both those lengths right, by what I think they should be.
> (Not all software does.))
>
> Both headers are non-standard -- they're not RFC 822, they were invented to
> work around mbox's deficiencies. Effectively those headers are part of mbox
> file format.
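As a rough illustration of those two definitions, here is a minimal Go sketch that computes both values for one raw message. The function name bodyCounts is hypothetical. The sketch assumes LF line endings, that the body starts after the first blank line, and that the blank separator line before the next postmark has already been trimmed off. It is not taken from Mutt's code, so Mutt's own counting may differ at the edges.

```go
package main

import "bytes"

// bodyCounts returns the values that Content-Length: and Lines: would
// carry for one raw message, under the assumptions stated above:
// contentLength is the number of bytes in the body, and lines is the
// number of lines in the body.
func bodyCounts(raw []byte) (contentLength, lines int) {
	// The headers end at the first blank line.
	i := bytes.Index(raw, []byte("\n\n"))
	if i < 0 {
		return 0, 0 // no blank line, so no body
	}
	body := raw[i+2:]

	contentLength = len(body)
	lines = bytes.Count(body, []byte("\n"))
	// A final line without a trailing newline still counts as a line.
	if len(body) > 0 && body[len(body)-1] != '\n' {
		lines++
	}
	return contentLength, lines
}
```

Comparing these numbers with the headers actually stored in an old mbox would be one way to see whether the software that wrote it counted the same way.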
This is all really helpful information, which I didn't know. Thank you very
much for providing it.

> You probably know that Mutt can write new mailboxes in any of the four file
> formats that it knows. You can use Mutt to convert among those formats.

I didn't know that. I see on the man page that mutt can save as mbox, MMDF,
MH or Maildir using the -m flag. I couldn't see any docs about saving in the
different mbox formats, which I understand from
https://docs.aspose.com/email/net/email-storage-formats/ to be as follows:

* MBOXO: The original format where "From " lines in the email body are
  quoted with a > character.
* MBOXRD: A variant of MBOXO that further extends the quoting method of
  "From " lines.
* MBOXCL: Introduced by the "Classic" MBOX variant where each "From " line
  is quoted with an ffrom string.
* MBOXCL2: A variation of MBOXCL where "From " lines are doubled to
  distinguish them.

> > It would be helpful also to know how long that has been the case since
> > I've got some 20 year old mutt mboxes I'm keen to process with a golang
> > program.
>
> I don't know, haven't used Mutt for that long.

From the source in git it seems that Thomas Roessler committed the first man
pages for mutt in 1998 and the mbox file format in 2000. There is no mention
of MBOXCL format in the mbox man page until Urs Janßen's commit of April
2004, just over 20 years ago, so I assume I've been using that format by
default since a bit after that (given Debian's slowish release cycle).

I also realise that, while I may need to be aware of the mbox format when
parsing emails (or massaging the output of golang's net/mail module), I
should simply be looking to separate emails in mboxes by what the mbox man
page in docs calls "the postmark line". Curiously, none of the mbox parsers
I've been using take that approach. I'm going to start with the postmark
line. (I can't really read C, but is_from in from.c makes interesting
reading.)
I'll then work to decode the different mbox types, starting with
MBOXCL/MBOXCL2, as needed. I'm not going to attempt to take on MMDF and MH
format, but I've already got a basic Maildir format parser going, which is
thankfully pretty easy.

Thanks again for the very useful pointers,
Rory