On Fri, 2003-11-21 at 14:05, John Manko wrote:
> Hi,
>     What would be the best library/module for parsing a mail file?
>   I want to be able to extract the values for header tags as well as 
> the body (including 'received by'). 

John,

Here's a script I wrote to pull spammers' email addresses out of mbox
files.  I then take the results and build SpamAssassin rules from them.

Read the POD for each module to get a feel for everything you can do
with them.  This should get you started...


#!/usr/bin/perl
#
use warnings;
use strict;
use Mail::MboxParser;
use Email::Find;

my $parseropts = {
        enable_cache => 1,
        enable_grep => 1,
        cache_file_name => 'cache-file',
};

my $mb = Mail::MboxParser->new('probably-spam',
                                decode  => 'ALL',
                                parseropts => $parseropts);

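while( my $msg = $mb->next_message) {
        #print $msg->header->{from}, "\n";
        #print $msg->header->{to}, "\n";
        #print $msg->header->{subject}, "\n";

        # find_emails modifies its first argument in place, so use a copy
        my $tmpeml = $msg->header->{from};
        next unless defined $tmpeml;    # skip messages without a From: header

        # the callback receives (Mail::Address object, original text);
        # print the address as found and return it unchanged
        find_emails($tmpeml, sub {
                        my $email = pop @_;
                        print $email, "\n";
                        $email;
                });
}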
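
Since you also asked about the Received lines and the body, here is a
rough sketch along the same lines.  It assumes the trace, find_body and
body methods behave as described in the Mail::MboxParser::Mail POD, so
please check that against your installed version.  message_summary is
just a helper name made up for this example, and the mbox path comes
from the command line.

```perl
#!/usr/bin/perl
use warnings;
use strict;

# message_summary() is a hypothetical helper, not part of Mail::MboxParser.
# It expects a Mail::MboxParser::Mail object and returns a hash ref.
sub message_summary {
    my ($msg) = @_;

    # header() returns a hash ref keyed by lower-cased field names; a field
    # that occurs more than once comes back as an array ref.
    my $subject = $msg->header->{subject};
    $subject = $subject->[0] if ref($subject) eq 'ARRAY';

    # trace() is documented to return the "Received:" lines as a list.
    my @received = $msg->trace;

    # find_body() gives the index of what the module considers the main
    # body; body($n) returns a body object that has an as_string method.
    my $body = $msg->body($msg->find_body)->as_string;

    return {
        subject  => defined $subject ? $subject : '',
        received => \@received,
        body     => $body,
    };
}

# Only open a mailbox when a path is given on the command line.
if (my $file = shift @ARGV) {
    require Mail::MboxParser;    # loaded here so the helper can be reused alone
    my $mb = Mail::MboxParser->new($file, decode => 'ALL');

    while (my $msg = $mb->next_message) {
        my $info = message_summary($msg);
        print "Subject: $info->{subject}\n";
        print "Received lines:\n";
        print "  $_\n" for @{ $info->{received} };
        print "\n$info->{body}\n";
    }
}
```

Run it as "perl script.pl probably-spam".  Keeping the extraction in one
sub makes it easy to add more header fields later.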


HTH,
Kevin

-- 
Kevin Old <[EMAIL PROTECTED]>

