Steffen Daode Nurpmeso <sdao...@googlemail.com> added the comment:

Hello Valery Masiutsin,

I recently stumbled over this while searching for the link to the standard that I had stored in another issue. (Without being logged in, that is.) The de-facto standard (http://qmail.org/man/man5/mbox.html) says:

  HOW A MESSAGE IS READ
  A reader scans through an mbox file looking for From_ lines. Any From_ line marks the beginning of a message. The reader should not attempt to take advantage of the fact that every From_ line (past the beginning of the file) is preceded by a blank line.

This is, however, the recent version. The "mbox" manpage of my up-to-date Mac OS X 10.6.7, for example, does not state this. It's from 2002.

However, all known MBOX standards, i.e. MBOXO, MBOXRD and MBOXCL, require proper quoting of non-From_ "From " lines (by preceding them with '>'). So your example should not fail in Python. (But hey - are you sure *that* has been produced by Perl?)

You're right, however, that Python seems to support only the old MBOXO way of un-escaping, which converts only plain "From " to/from ">From ". That is not even mentioned anymore in the current standard, which describes only MBOXRD ("(>*From )" -> ">"+match.group(1)). (Lucky me: I own Mac OS X, otherwise I wouldn't even know.) Thus you're in trouble if the unescaping is performed before the split. This is another issue, though: "MBOX parser uses MBOXO algorithm". ;>

- Ciao, Steffen

----------
nosy: +sdaoden

_______________________________________
Python tracker <rep...@bugs.python.org>
<http://bugs.python.org/issue11728>
_______________________________________
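
For illustration only, the difference between the MBOXO and MBOXRD quoting rules mentioned in the comment above might be sketched as follows. The function names are hypothetical helpers invented for this example; they are not part of Python's mailbox module and make no claim about how that module is implemented. The sketch assumes the body is a str with "\n" line endings.

```python
import re

# Hypothetical helpers, not the mailbox module's API.

def mboxo_quote(body):
    # MBOXO: only lines starting with a plain "From " get a '>' prepended.
    return re.sub(r"^From ", ">From ", body, flags=re.MULTILINE)

def mboxo_unquote(body):
    # MBOXO: strip one '>' from lines starting with exactly ">From ".
    return re.sub(r"^>From ", "From ", body, flags=re.MULTILINE)

def mboxrd_quote(body):
    # MBOXRD: "(>*From )" -> ">" + match.group(1), as quoted in the comment.
    return re.sub(r"^(>*From )", r">\1", body, flags=re.MULTILINE)

def mboxrd_unquote(body):
    # MBOXRD: strip exactly one '>' from any ">...>From " line.
    return re.sub(r"^>(>*From )", r"\1", body, flags=re.MULTILINE)

body = "From me\n>From you\nplain text\n"

# MBOXRD round-trips the body unchanged.
rd_roundtrip = mboxrd_unquote(mboxrd_quote(body))

# MBOXO cannot tell an original ">From " line from a quoted "From " line,
# so the round trip alters the second line.
o_roundtrip = mboxo_unquote(mboxo_quote(body))
```

Under these assumptions the MBOXRD round trip is lossless, while the MBOXO round trip turns the original ">From you" line into "From you". A line that looks like an unquoted "From " line after unescaping is exactly what would confuse a reader that splits on From_ lines afterwards.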