Tassilo Von Parseval has a module that I am using called Mail::MboxParser. He actually looks at this list, and even helped me when I had problems with installation (see old thread, around 8/7/03, "Sick of installing CPAN modules"). In fact, I bet he will answer this post as well.
It's very easy to use and might be worth a look for your situation. In my experience, the current version of the module is quite fast. I have a thread open right now called "beginner trying to parse a piece of mail" that might be useful for checking basic usage. The man page also has a brief, straightforward usage example.

"K Old" <[EMAIL PROTECTED]> wrote in message news:[EMAIL PROTECTED]
> Hello everyone,
>
> I am using SpamAssassin to determine what is SPAM, and what isn't on my
> server. Everything works great and two files are written/appended based
> on if the mail is spam - almost-certainly-spam and probably-spam. Given
> that the majority of the mail that makes it in these files is spam,
> every now and then a valid message will be tagged as spam and I need to
> restore it. So, I look through these files to verify that all of the
> emails are indeed spam. Thing is, it's time consuming.
>
> I'm writing a script that will strip out the From, Subject and
> X-Spam-Status and all that is fine. The kicker is that when
> SpamAssassin writes the messages (in mbox format) to the file, it writes
> two. One containing the SpamAssassin flags, etc. and the other is the
> original message which is left untouched so that restoring it is easy.
>
> With this said the file has the SpamAssassin message first, then the
> original message, so in my script trying to grep for ^From: I end up
> getting duplicate lines. I'd like to be able to "remove" the duplicates
> so that I only get the something like the following:
>
> From: [EMAIL PROTECTED]
> Subject: an AWESOME!!!! DEAL!!!!
> X-Spam-Status: *************************
>
> I've looked at a few modules on CPAN, but haven't parsed mbox files
> before, and would like suggestions. From what I understand if I can
> just get every other message I'll get what I need.
>
> Any advice/suggestions?
>
> Thanks,
> Kevin
>
> --
> K Old <[EMAIL PROTECTED]>
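
To make the "every other message" idea from the quoted question concrete, here is a rough sketch in plain Perl with no CPAN modules. It is only an illustration, not the way Mail::MboxParser works internally. The sub names (split_mbox, parse_headers, spam_summaries) are made up for this example. It assumes the file strictly alternates between the SpamAssassin-flagged copy and the untouched original, flagged copy first, and that any body line starting with "From " has been escaped as ">From ", as mbox writers normally do.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Split raw mbox text into messages. Each message starts at a line
# beginning with "From " (the mbox separator line, not the "From:" header).
sub split_mbox {
    my ($text) = @_;
    return grep { length } split /^(?=From )/m, $text;
}

# Return a hash ref of header name (lower-cased) => value for one message.
# Only the first occurrence of each header is kept.
sub parse_headers {
    my ($message) = @_;
    my ($head) = split /\n\n/, $message, 2;   # headers end at the first blank line
    $head =~ s/\n[ \t]+/ /g;                  # unfold continuation lines
    my %headers;
    for my $line (split /\n/, $head) {
        next unless $line =~ /^([\w-]+):\s*(.*)$/;
        my ($name, $value) = (lc $1, $2);
        $headers{$name} = $value unless exists $headers{$name};
    }
    return \%headers;
}

# Build one From/Subject/X-Spam-Status summary per message pair,
# reading only the first (flagged) message of each pair.
sub spam_summaries {
    my ($text) = @_;
    my @messages = split_mbox($text);
    my @summaries;
    for (my $i = 0; $i < @messages; $i += 2) {
        my $headers = parse_headers($messages[$i]);
        my $summary = '';
        for my $name ('From', 'Subject', 'X-Spam-Status') {
            my $value = $headers->{lc $name};
            $value = '' unless defined $value;
            $summary .= "$name: $value\n";
        }
        push @summaries, $summary;
    }
    return @summaries;
}

# Usage: perl spam_summary.pl probably-spam
if (@ARGV) {
    open my $fh, '<', $ARGV[0] or die "Cannot open $ARGV[0]: $!";
    my $text = do { local $/; <$fh> };        # slurp the whole file
    close $fh;
    print "$_\n" for spam_summaries($text);
}
```

If the pairing assumption ever breaks (say, one copy is missing), every summary after that point will be off by one, so it may be safer to pick messages by the presence of an X-Spam-Status header than by position.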