Re: Spamassassin not parsing email messages

2012-12-29 Thread RW
On Fri, 28 Dec 2012 21:48:25 -0800 (PST) Sean Tout wrote: > Hi Martin, > > You certainly did not miss anything, but I did! Being new to > spamassassin, I was only familiar with the spamassassin command, which > was awfully slow for a large number of emails. But now that I used > spamc, I'm getting

Re: Spamassassin not parsing email messages

2012-12-29 Thread Martin Gregorie
On Fri, 2012-12-28 at 21:48 -0800, Sean Tout wrote: > I have practically given up on the original > perl code since I'm unable to find out the issue. With spamc, I can get > decent performance. > IMO, unless you need the extra facilities of amavisd-new or one of the other smart wrappers for SA a

Re: Spamassassin not parsing email messages

2012-12-28 Thread Sean Tout
Hi Martin, You certainly did not miss anything, but I did! Being new to spamassassin, I was only familiar with the spamassassin command, which was awfully slow for a large number of emails. But now that I used spamc, I'm getting 5+ messages per second. Thank you much for the advice. I have practica
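
The speed-up described above comes from spamc handing each message to an already running spamd instead of starting a fresh spamassassin process per email. A minimal sketch of a batch run follows, assuming spamd is running and that each message has already been saved to its own file; the function name scan_dir and the directory layout are hypothetical, not taken from the thread.

```shell
#!/bin/sh
# Score every message file in a directory with spamc (needs a running spamd).
# Assumption: one message per file; "scan_dir" is an illustrative name.
scan_dir() {
    for msg in "$1"/*; do
        [ -f "$msg" ] || continue
        # "spamc -c" prints "score/threshold" and exits 1 when the message
        # is spam, so the exit status is deliberately ignored here.
        result=$(spamc -c < "$msg") || true
        printf '%s\t%s\n' "$msg" "$result"
    done
}
```

Calling `scan_dir /path/to/messages` prints one tab-separated line per file, which is easy to sort or filter afterwards.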

Re: Spamassassin not parsing email messages

2012-12-28 Thread Martin Gregorie
On Fri, 2012-12-28 at 16:51 -0800, Sean Tout wrote: > Hi John, > > Thank you much for the help. I have been trying to avoid executing > spamassassin shell commands from perl since it takes a significant amount of > time (about 12 seconds) for each email. I have tried the below script, which works > but o

Re: Spamassassin not parsing email messages

2012-12-28 Thread Sean Tout
Hi John, Thank you much for the help. I have been trying to avoid executing spamassassin shell commands from perl since it takes a significant amount of time (about 12 seconds) for each email. I have tried the below script, which works but of course not in a favorable way, especially for processing 20,000+ em

Re: Spamassassin not parsing email messages

2012-12-28 Thread John Hardin
On Fri, 28 Dec 2012, Sean Tout wrote: Hi John, Per your response below, here is what I did to confirm it's not a content problem.

    open (RFILE, $reportfile_name);
    while(!$folder_reader->end_of_file()) {
        $email = $folder_reader->read_next_email();
        chomp($email);
        $mail = $spamtest->parse

Re: Spamassassin not parsing email messages

2012-12-28 Thread Sean Tout
Hi John, Per your response below, here is what I did to confirm it's not a content problem.

    open (RFILE, $reportfile_name);
    while(!$folder_reader->end_of_file()) {
        $email = $folder_reader->read_next_email();
        chomp($email);
        $mail = $spamtest->parse($email);
        $status = $spamtest->c

Re: Spamassassin not parsing email messages

2012-12-28 Thread John Hardin
On Fri, 28 Dec 2012, Sean Tout wrote: Hi John, I wrote every email read to an output file. The output file is identical to the input file I'm reading the emails from according to diff! The concern is the format of the single mail object being sent to SpamAssassin for scanning. Having the ver

Re: Spamassassin not parsing email messages

2012-12-28 Thread Sean Tout
Hi John, I wrote every email read to an output file. The output file is identical to the input file I'm reading the emails from according to diff! Regards, -Sean.

Re: Spamassassin not parsing email messages

2012-12-28 Thread John Hardin
On Fri, 28 Dec 2012, Sean Tout wrote: That's most likely the case. But I'm not sure what's going on in there and how to get rid of it. I tried with and without chomp() but got the same results. Below is a snippet with chomp, which I applied before parsing the email with spamassassin. my $spamtest

Re: Spamassassin not parsing email messages

2012-12-28 Thread Sean Tout
Hi Dave, That's most likely the case. But I'm not sure what's going on in there and how to get rid of it. I tried with and without chomp() but got the same results. Below is a snippet with chomp, which I applied before parsing the email with spamassassin. my $spamtest = Mail::SpamAssassin->new();

Re: Spamassassin not parsing email messages

2012-12-28 Thread Dave Funk
That implies that whatever mechanism you're using in the original process is adding a blank line (or bare 'nl' or 'cr') to the beginning of the message that you're then handing to SA. Idiot question, are you doing (or not) a "chomp" in the initial read process? On Fri, 28 Dec 2012, Sean Tout
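
A blank line ahead of the first header matters because the header block of a mail message ends at the first empty line, so a message that starts with one appears to have no headers at all. The sketch below is one way to rule that out from the command line, assuming the message arrives on standard input; the function name strip_leading_blank is hypothetical and this is not the fix the posters settled on.

```shell
#!/bin/sh
# Drop blank (or CR-only) lines that come before the first header line and
# pass everything after that through unchanged, including the blank line
# that separates headers from body.
# Assumption: "strip_leading_blank" is an illustrative name.
strip_leading_blank() {
    awk 'started || !/^\r?$/ { started = 1; print }'
}
```

For a single saved message this could be combined with the test-mode check used elsewhere in the thread, for example `strip_leading_blank < message.eml | spamassassin --test-mode`, where message.eml is a placeholder file name.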

Re: Spamassassin not parsing email messages

2012-12-28 Thread Sean Tout
Hi Henrik & Jeff, One more input that might shed more light. I copied one of the emails from the above 3 emails into its own file and ran spamassassin from the command line in test mode against it and it worked fine. The command is spamassassin --test-mode < /spamemails/singleemail.spam where si

Re: Spamassassin not parsing email messages

2012-12-28 Thread Sean Tout
Hi Jeff, You are correct. It's clear SpamAssassin is unable to parse the email, so something in the email is preventing SpamAssassin from parsing it, and I'm trying to find out what it is and why! I have tried multiple sources of emails, many of which are from known spam corpus

Re: Spamassassin not parsing email messages

2012-12-28 Thread Jeff Mincy
From: Sean Tout Date: Fri, 28 Dec 2012 01:10:02 -0800 (PST) Hi Henrik, Thank you much for the prompt response and points. I ran the Perl script with the code you pasted below, but still got the same report scores for all emails! By the way, when I also tried to print cont

Re: Spamassassin not parsing email messages

2012-12-28 Thread Sean Tout
Hi Henrik, Thank you much for the prompt response and points. I ran the Perl script with the code you pasted below, but still got the same report scores for all emails! By the way, when I also tried to print the contents of the emails using $status->get_content_preview(), I got [...] I'm unable to pri

Re: Spamassassin not parsing email messages

2012-12-28 Thread Henrik K
On Fri, Dec 28, 2012 at 12:45:03AM -0800, Sean Tout wrote: > Hello, > > I wrote a short Perl program that reads email from an existing mbox > formatted file, passes each individual email to Spamassassin for parse and > score, then prints a report for each email. The strange thing is that I keep >