MySQL Student wrote:
> Hi,
>
>> Do you just want to re-scan the whole mbox and see what rules hit now
>> for research reasons?
>
> That's a good start, but I'd like to see if I can break out the ham to
> train bayes.
>
>> There's no way to (directly) get SA to modify email that's already in an
>> mbox file. The mass-check and sa-learn tools can read them, but nothing
>> in SA can write to that. However, there might be a utility out there to
>> do this (although I'm not aware of any)..
>
> Yeah, that's kind of what I thought. Maybe a program that can split
> each message back into an individual file? Would procmail even help
> here? Or even a simple shell script that looks for '^From ', redirects
> it to a file, runs spamassassin -d on it, then re-runs SA on each
> file? I could then concatenate each of them back together and pass it
> through sa-learn.
>

That sounds like a good plan. If you google around for "mbox split" or "mbox splitter" you can find some sample code out there that does it. It's all just simple code looking for the "^From " boundary.
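
For what it's worth, here is a rough sketch of that kind of splitter in Python. It is only an illustration, not code from SA or from any particular tool: the function name split_mbox and the msg-NNNNN.eml file names are made up for the example. It also assumes the mbox is well-formed, i.e. that any body line starting with "From " has been escaped (usually as ">From "); otherwise such a line will be mistaken for a message boundary.

```python
import os


def split_mbox(mbox_path, out_dir):
    """Split an mbox file into one file per message.

    A new message is assumed to start at every line beginning with
    "From " (the mbox separator line). The separator line is kept at
    the top of each output file so the files can be concatenated back
    into an mbox later. Returns the list of files written, in order.
    """
    os.makedirs(out_dir, exist_ok=True)
    written = []
    out = None
    # Read in binary mode so odd encodings in old mail pass through untouched.
    with open(mbox_path, "rb") as mbox:
        for line in mbox:
            if line.startswith(b"From "):
                # Boundary found: close the previous message, start a new file.
                if out is not None:
                    out.close()
                # Hypothetical naming scheme: msg-00001.eml, msg-00002.eml, ...
                path = os.path.join(out_dir, "msg-%05d.eml" % (len(written) + 1))
                out = open(path, "wb")
                written.append(path)
            # Anything before the first "From " line is skipped.
            if out is not None:
                out.write(line)
    if out is not None:
        out.close()
    return written
```

Calling split_mbox("old.mbox", "split") would leave one file per message under split/, and each of those could then go through the spamassassin -d and re-scan steps you describe before being concatenated and handed to sa-learn.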