On Fri, May 16, 2025 at 05:26:25PM -0400, Kurt Hackenberg wrote:
> Nope, sorry. RFC 4155 has a problem. Its default format, the only one it
> defines, defines the From_ line rigidly, forbids ">From " escaping, and does
> not use a length. It says messages should be found by recognizing the whole
> From_ line, with exact syntax.
>
> That fails when the message body includes such a From_ line, as it might
> when people use email to discuss mbox format, as here. Like this:
>
> From nobody@nowhere.invalid Thu Jan 1 00:00:00 1970
>
> An RFC 4155 reader would take that line above as the beginning of a new
> message, and would fail to read the rest of this message.

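To make the failure concrete, here is a minimal Python sketch of a reader
that finds messages purely by recognizing From_ lines. The regular
expression is only a loose approximation of the separator syntax (an
assumption for illustration, not the grammar from RFC 4155), and
split_mbox, SAMPLE, and the alice@example.invalid address are made-up
names, not taken from any real tool.

```python
import re

# Loose approximation of a From_ separator line: "From ", an address, and an
# asctime-style timestamp. Illustrative assumption, not the RFC 4155 grammar.
FROM_LINE = re.compile(
    r"^From \S+@\S+ "
    r"(?:Mon|Tue|Wed|Thu|Fri|Sat|Sun) "
    r"(?:Jan|Feb|Mar|Apr|May|Jun|Jul|Aug|Sep|Oct|Nov|Dec) +"
    r"\d{1,2} \d\d:\d\d:\d\d \d{4}$"
)


def split_mbox(text):
    """Start a new message at every line that looks like a From_ line."""
    messages = []
    current = None
    for line in text.splitlines():
        if FROM_LINE.match(line):
            # No escaping and no length: any matching line is a boundary.
            if current is not None:
                messages.append("\n".join(current))
            current = [line]
        elif current is not None:
            current.append(line)
    if current is not None:
        messages.append("\n".join(current))
    return messages


# A single message whose body happens to quote a From_ line.
SAMPLE = (
    "From alice@example.invalid Fri May 16 21:26:25 2025\n"
    "Subject: mbox formats\n"
    "\n"
    "A separator line looks like this:\n"
    "\n"
    "From nobody@nowhere.invalid Thu Jan 1 00:00:00 1970\n"
    "\n"
    "and the rest of the message follows.\n"
    "\n"
)

parts = split_mbox(SAMPLE)
print(len(parts))
```

SAMPLE holds one message, but the sketch reports two: the quoted line in
the body is taken as the start of a second message, and the first message
is cut short at that point.
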
I agree that this is a problem. But I don't agree with the (elided for
brevity) suggestion that we should collectively ignore the RFC, because
I don't think that serves the future well. There are, and will be,
projects that will rely on the documentation (which at this point, for
better or worse, is RFC 4155, an archived web page mentioned elsewhere
in this thread [1], and some scattered notes) long after mutt is gone.

(Why? Because there are enormous repositories of email in mbox format,
along with equally enormous repositories of Usenet news articles that
have been saved in mbox format. I'm directly aware of one project
tackling a corpus of ~400M messages, and indirectly aware of others
doing similar things. And more repositories are being created all the
time: e.g. this mailing list is run with Mailman, whose primary message
store is in mbox format.)

I think a better approach is to figure out what needs to be
changed/fixed/added to RFC 4155 so that it covers the variants that
have arisen, and to create a superseding RFC that updates it. This
won't fix all the problems that have arisen as a result of choices (or
mistakes) made along the way, but it should at least document those
problems so that folks have a fighting chance of dealing with them.

I think it'd be good to have a single definitive-as-possible reference
stored somewhere that's likely to be around for a while, and well, an
IETF standard is about as "permanent" as we're likely to get.

And yes, I realize that I'm about to be volunteered for this. This is
not my first day on the job. ;)

And then someone(s) will need to look at formail(1), grepmail(1), and
other mail tools to try to figure out what works/breaks with what. I'm
already staring at grepmail for other reasons, so I'll make a note to
circle back to this issue.

---rsk

[1] It's this page: "mbox" is a family of several mutually incompatible
mailbox formats.
https://web.archive.org/web/20160423115957/http://homepage.ntlworld.com./jonathan.deboynepollard/FGA/mail-mbox-formats.html