The setup:
Ubuntu 9.04 Jaunty system running postfix, amavis, clamav, and
spamassassin.
These two systems serve as relay and filtering gateways into a
proprietary custom mail system.

Problem 1)
        SpamAssassin won't implement local config changes.

I have tried making changes to /etc/spamassassin/local.cf and creating
/etc/spamassassin/custom_rule.cf but new rules in either are ignored.
Configuration settings in both also appear to be ignored.

I have run:
        sudo spamassassin -D --lint 2>&1 |less
And the output contains:
        [29713] dbg: config: using "/etc/spamassassin" for site rules dir
        [29713] dbg: config: read file /etc/spamassassin/65_debian.cf
        [29713] dbg: config: read file /etc/spamassassin/custom_rule.cf
        [29713] dbg: config: read file /etc/spamassassin/local.cf

I HAVE reloaded spamassassin afterwards:
        > sudo /etc/init.d/spamassassin reload
        Reloading SpamAssassin Mail Filter Daemon: spamd.
        
Changes I've made to /etc/spamassassin/local.cf include:

# To exclude my local networks (see below)
        internal_networks !0/0
# To turn off auto learning
        bayes_auto_learn 0
# To ignore soft whitelisting of trusted hosts
        score ALL_TRUSTED 0.0
# To give a high rating to messages with this subject, a test
        header LOCAL_EVERYTHINGYOUNEED_RULE Subject=~ /Everything you need, you can find here/i
        score LOCAL_EVERYTHINGYOUNEED_RULE 3.0
        describe LOCAL_EVERYTHINGYOUNEED_RULE Everything you need

My /etc/spamassassin/custom_rule.cf contains the same "Everything you
need" lines as above.

Results --  Messages containing "Everything you need, you can find here"
get through with these headers added:
        X-Spam-Flag: NO
        X-Spam-Score: -1.44
        X-Spam-Score: -3.037
        X-Spam-Level:
        X-Spam-Status: No, score=-3.037 tagged_above=-9999 required=4
tests=[ALL_TRUSTED=-1.8, AWL=1.362, BAYES_00=-2.599] autolearn=ham

(Please ignore the fact that ALL_TRUSTED and BAYES_00 are subtracting
from the score for now; that is explained in Problem 2.  Here, the
LOCAL_EVERYTHINGYOUNEED_RULE doesn't even show up.)
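
A quick way to check a saved header block for a given rule name is
sketched below.  It is only a sketch: rule_fired is a hypothetical
helper name, and it assumes the header uses the
tests=[NAME=score, ...] layout shown above, possibly folded across
lines.

```shell
# rule_fired RULE_NAME < headers
# Exit 0 if RULE_NAME appears in the tests=[...] list of an
# X-Spam-Status header read from stdin, non-zero otherwise.
rule_fired() {
    # Unfold continuation lines, isolate the tests=[...] list,
    # split it on commas, and look for an exact NAME= match.
    tr -d '\r' | tr '\n' ' ' |
        sed -n 's/.*tests=\[\([^]]*\)\].*/\1/p' |
        tr ',' '\n' |
        sed 's/^ *//' |
        grep -q "^$1="
}
```

Usage would be something like
"rule_fired LOCAL_EVERYTHINGYOUNEED_RULE < part.0 && echo hit", where
part.0 stands in for wherever the delivered headers end up.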

Note that ONE of the X-Spam-Score headers is generated by a second
instance of SpamAssassin running on the proprietary custom mail system,
BUT the values in the X-Spam-Status ARE being generated by my gateway.


*  Does anyone have any ideas why my local.cf and custom_rule.cf appear
to be ignored, despite showing up when --linting?


----------

Problem 2)
Setup is above.

In front of these mail gateways is a load balancer.  Because of this,
ALL incoming messages appear to be coming from the load balancer, and
therefore a server on my inside (and trusted) network.
I have removed my trusted network from the amavis "mynetworks" config,
but SpamAssassin still thinks it's trusted, hence the changes attempted
above.

Because a large number of messages were tagged as trusted and let
through, and because autolearning is turned on, Bayes is learning these
messages incorrectly as ham!  (Poisoned ham.)

I AM able to run 
        sa-learn --clear
to clear the database, but both BEFORE that and now I get
        sa-learn --dump
        config: path "/home/blee/.spamassassin/user_prefs" is inaccessible: Permission denied
        ERROR: Bayes dump returned an error, please re-run with -D for more information

With -D
        sa-learn --dump -D 
We get the following lines that I think are of interest:
                ...
        [30577] dbg: plugin: loading Mail::SpamAssassin::Plugin::Bayes from @INC
                ...
        [30577] dbg: conf: finish parsing
        [30577] dbg: plugin: Mail::SpamAssassin::Plugin::ReplaceTags=HASH(0x90fc5f8) implements 'finish_parsing_end', priority 0
        [30577] dbg: replacetags: replacing tags
        [30577] dbg: replacetags: done replacing tags
        [30577] dbg: Bayes: no dbs present, cannot tie DB R/O: /home/blee/.spamassassin/bayes_toks
        [30577] dbg: config: score set 1 chosen.
        [30577] dbg: Bayes: no dbs present, cannot tie DB R/O: /home/blee/.spamassassin/bayes_toks
        ERROR: Bayes dump returned an error, please re-run with -D for more information


*  How do I find out what is currently in my Bayes database or if it
even exists?
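
One way to check for existence is to look for bayes_toks under each
candidate home directory.  A minimal sketch, assuming the file-based
Bayes store that the debug output above refers to; find_bayes_dbs is a
hypothetical helper, and any directory passed to it other than
/home/blee is a guess about where another account might keep its data.

```shell
# find_bayes_dbs DIR...
# Print the path of every DIR/.spamassassin/bayes_toks that exists.
# Exit non-zero if none of the directories holds one.
find_bayes_dbs() {
    fbd_status=1
    for fbd_home in "$@"; do
        fbd_db="$fbd_home/.spamassassin/bayes_toks"
        if [ -f "$fbd_db" ]; then
            printf '%s\n' "$fbd_db"
            fbd_status=0
        fi
    done
    return "$fbd_status"
}
```

For example: find_bayes_dbs /home/blee /root /var/lib/amavis (the last
two paths are assumptions, and unreadable directories would need
root).  If a database turns up under another account, running
"sa-learn --dump magic" as that user, e.g. via "sudo -u USER -H",
should read that copy rather than the one under /home/blee.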



To make this even more complicated, mail headers are added after
SpamAssassin and then the proprietary custom mail system chops messages
up into parts.  Part .0 is the headers, part .1 is the main body, and
parts .2 - .999 are attachments.  So I don't even have access to the
original email in its entirety!

*  Is there any way to make Bayes relearn or even just unlearn a message
based on a Message-ID or something else?
*  Will it be useful to just feed the message bodies to sa-learn as
--spam without the original headers and MIME separators?
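
For that last question, a rough sketch of what feeding the parts back
might look like.  Everything about the file layout is an assumption:
join_parts is a hypothetical helper, and BASE.0 / BASE.1 stand in for
however the header and body parts are actually named.  It glues the
header part and the main body back together with one blank line in
between; attachments are left out because their MIME boundaries are
gone.

```shell
# join_parts BASE
# Write BASE.0 (headers), one blank line, then BASE.1 (body) to stdout.
# Return non-zero if either part is missing.
join_parts() {
    [ -f "$1.0" ] && [ -f "$1.1" ] || return 1
    # Headers end at the first empty line; stop there so that exactly
    # one blank line separates headers from body.
    awk '/^$/ { exit } { print }' "$1.0"
    printf '\n'
    cat "$1.1"
}
```

The output could be saved to a file and that file passed to
"sa-learn --spam".  Bayes draws tokens from headers as well as the
body, so a body-only message would presumably not match what was
originally learned very closely.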


I've been banging my head on the wall about this for 2 days, so any help
will be greatly appreciated.

--Bryan
