Alex wrote:
>>> Hi Kris,
>>>
>>> I'm trying to get your extract-data script running, and having some
>>> difficulties. It's dying at the $spamtest->check($mail) call. It just
>>> never returns. What does that function do?

As best I can recall it runs some trailing bits of what you might
reasonably call "message parsing", and at least the first stages of
running rule checks.  I couldn't find a middle ground that only did the
real minimum necessary for extracting the relay IPs and URIs from the
message.

> Yes, I'm sorry, I should have made it more clear that I understood
> that. It correctly instantiates the new SA instance:
> 
> my $spamtest = Mail::SpamAssassin->new();
> 
> then logs in, opens the mbox folder, reads it in,

Just to confirm: you've changed the IMAP user/password/server to a
suitable account?  The type of storage under the IMAP server (mbox,
maildir, maildir++, database, stone tablets, etc.) should be irrelevant;
it's not exposed to the IMAP client.

> then just sits there
> with the check($mail) function, which must be deep enough in the SA
> code that I'm not familiar with it and hoped someone had some ideas.

I'm afraid it took me some trial and error to work out the minimal
pieces of SA that I actually had to call to get this working - it was
still easier than rebuilding relay parsing and trust-path tracking, and
URI extraction from the message body myself.  I've since forgotten most
of what I learned about the SA internals.  :/

If you set a debug flag on the top-level SA object, you should get a
spew of standard SA debug output that will at least tell you where it
stalls.

A quick eyeball through the spamassassin command-line script shows you
should be able to change that line to:

my $spamtest = Mail::SpamAssassin->new({debug => 1});

to do this.
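
If the debug output alone doesn't pin it down, one option is to wrap the
call in a timeout so the script reports the stall instead of hanging
forever.  This is only a rough sketch using core Perl's alarm();
run_with_timeout is a hypothetical helper name, not part of SA or of the
extract-data script, and alarm() may not interrupt every kind of blocking
call on every platform.

```perl
use strict;
use warnings;

# Hypothetical helper (not part of SpamAssassin): run a code ref and give
# up after $seconds.  Returns (1, $result) on success, or (0, $error) if
# the code died or the timeout fired.
sub run_with_timeout {
    my ($seconds, $code) = @_;
    my $result;
    my $ok = eval {
        # die() from the handler unwinds out of the eval block
        local $SIG{ALRM} = sub { die "timed out after ${seconds}s\n" };
        alarm $seconds;
        $result = $code->();
        alarm 0;
        1;
    };
    my $err = $@;
    alarm 0;    # make sure no alarm is left pending if the code died
    return $ok ? (1, $result) : (0, $err);
}
```

In the script you would then call something like
run_with_timeout(60, sub { $spamtest->check($mail) }) and print the
message identifier whenever the first return value is false.  The 60
second limit is an arbitrary choice for illustration.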

I wasn't sure before, but this script *does* run your live ruleset
against each message.  I don't know if there's a way to separate the
message parse/deconstruction further from actually running the full
ruleset;  for this script it would be better *not* to have to run the
actual rules.

> This is for v3.3.2. I thought Kris was using this script regularly,
> and I have a production implementation of SA on this box, so I don't
> understand why it would be failing here.

Nor do I;  I just diff'ed the live script against SVN and there are no
differences anywhere that might affect overall function.  (Mostly in the
list of IP blocks used to trim the IP list output, plus a handful of
cosmetic output fiddles.)

-kgd
