Andrew Bernard <andrew.bern...@mailbox.org> writes:

> Well, you can dynamically increase CPU or RAM or both on DigitalOcean,
> which I use. You can do it on a temporary basis, but I'm not sure if
> you get charged for a month or on a strict time basis; it's hard to
> find out! It's not a matter of needing a separate system. My only
> issue is that I am very financially constrained and I can't afford the
> experiment.
>
> But the bigger fish to fry is the issue with the irregularities in the
> mbox archives. I need to study this in depth before trying a load. I
> did have the same problem with similar erratic mbox archives quite
> some years ago but I can't easily recall the solution. Probably just a
> more refined regex to pick up the 'From:' delimiters.

There isn't really much finesse involved.  Messages start at the pattern
"^From ".  Any "From " inside of a message that would end up at the
start of a line is changed to ">From ", so the pattern "^From " should
be foolproof regarding splitting into messages.
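
As a rough illustration of the splitting rule described above, here is a
minimal Python sketch. The helper name split_mbox is made up for this
example, and it assumes the whole archive fits in memory as one string;
it is not taken from any existing tool.

```python
import re

# A line that begins with "From " (note the trailing space, no colon)
# marks the start of a message.
_FROM_LINE = re.compile(r"^From ", re.MULTILINE)

def split_mbox(text):
    """Split raw mbox text into messages at each line starting with 'From '.

    Hypothetical helper for illustration only. Each returned chunk still
    includes its leading "From " separator line.
    """
    starts = [m.start() for m in _FROM_LINE.finditer(text)]
    ends = starts[1:] + [len(text)]
    return [text[a:b] for a, b in zip(starts, ends)]
```

A body line that was stored as ">From " does not match the pattern, so
it stays inside its message. If an archive contains unquoted "From "
lines in message bodies, this sketch will split in the wrong place,
which may be the kind of irregularity mentioned above.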

I don't remember what happens to "^>From " but consider it most likely
that any "^>*From " inside of a message gets one ">" prepended when put
into an mbox file, and one taken out again when displayed/processed.
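
The quoting scheme guessed at above (prepend one ">" to any "^>*From "
line on writing, strip one on reading) corresponds to what is commonly
called the "mboxrd" variant; the simpler "mboxo" variant only quotes
bare "From " lines. Which one a given archive uses is something to
check, not assume. A minimal sketch of the reversible scheme, with
hypothetical helper names:

```python
import re

# Matches a body line consisting of zero or more ">" followed by "From ".
_QUOTE = re.compile(r"^(>*From )", re.MULTILINE)
# Matches the same kind of line with at least one leading ">".
_UNQUOTE = re.compile(r"^>(>*From )", re.MULTILINE)

def quote_body(body):
    """Prepend one '>' to every '^>*From ' line (applied when writing)."""
    return _QUOTE.sub(r">\1", body)

def unquote_body(body):
    """Remove one '>' from every '^>+From ' line (applied when reading)."""
    return _UNQUOTE.sub(r"\1", body)
```

Because every matching line gains exactly one ">" on the way in and
loses exactly one on the way out, the round trip is lossless under this
assumption. These functions are meant for message bodies only, not for
the "From " separator line itself.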

-- 
David Kastrup
