Re: Which DB is actually used?

jdow Fri, 08 Sep 2006 12:35:11 -0700

From: "Logan Shaw" <[EMAIL PROTECTED]>

On Fri, 8 Sep 2006, Bo Mellberg wrote:
It seems like the exim-users database is being touched regularly, so I'm guessing that it has been set up by apt-get in some "auto-learning" state.
Yes, you might want to check whatever's running SpamAssassin and
see what user it's running as and also check the configuration
files (probably in /etc/mail/spamassassin) to see where it's
storing the database.
I have earlier trained spam and ham as user "bosse", which is why there is a working db there as well.
As I am the only user on my system, it really doesn't matter if I use site-wide or not, but rather how I invoke sa-learn.
Let's say I remove the databases for "bosse" and "root". Is this the proper way to invoke sa-learn:
1. Log on as user "bosse"
2. sa-learn --showdots --sync --dbpath /var/spool/exim4/.spamassassin --spam /home/bosse/Maildir/.MissedSpam/cur
Probably not, or at least not the best way.


Absolutely not. The database under "bosse" is quite apparently not
being used except for his misplaced training. He needs to "su -l exim4"
and then run sa-learn.

(Were it me I'd rip out amavisd-new and put in something that
(IMAO {^,-}) works like procmail. I'm not sure I'd use Exim, either,
unless it can explicitly run spamc as the user "bosse". At the VERY
least I'd read whatever manual existed for amavisd-new and Exim such
that "spamc -u bosse" will work and have spamd access the "bosse"
database. Of course, if spamd is running in a sandbox it can't make
that reach without some skullduggery. So the entire installation needs
to be examined and manipulated so that per user BAYES can work "fer
shure". That's a LOT of RTFM and examine your system configuration,
to be sure. But learning only hurts a little and having learned is
a nice feeling.)

First of all, you need to run sa-learn as the same user that
runs the filtering.  Since you haven't said what user that
is (whether it's "bosse" or some other user), it's impossible
to say whether that's the correct user to run sa-learn as.


Exactly - and he's not doing that.
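
A minimal sketch of that rule as a guard at the top of a training script. It assumes the filtering user is "exim4" as suggested earlier in this message, which you should verify on your own system. The name require_user is a hypothetical helper, not part of SpamAssassin.

```shell
#!/bin/sh
# require_user: hypothetical helper, not part of SpamAssassin.
# Succeeds only when the script is running as the expected account.
require_user() {
    expected=$1
    current=$(id -un)
    if [ "$current" != "$expected" ]; then
        echo "must run as '$expected' (currently '$current')" >&2
        return 1
    fi
}

# Example use (user name and path are assumptions taken from this thread):
#   require_user exim4 && sa-learn --showdots --spam /home/bosse/Maildir/.MissedSpam/cur
```

The filtering user also needs read permission on the folder being learned from, which a Maildir under /home/bosse may not grant by default.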

If I set up a cron job to do the above I could just toss missed spam into the "MissedSpam" folder, right?


Yeah, but for efficiency reasons, you'd probably not want
messages in that folder to keep accumulating forever, so you'd
probably want a way to purge them after some period of time.
sa-learn can cope with a situation where you feed it the same
message repeatedly with no harm, but it's still a waste of
CPU cycles.
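
One way to do the purge, sketched under assumptions: the folder is a Maildir "cur" directory on disk, and a message's file modification time is a good enough proxy for its age. The function name purge_old_messages is hypothetical.

```shell
#!/bin/sh
# purge_old_messages: hypothetical helper that deletes regular files
# directly inside DIR that are older than DAYS days (default 7).
# find rounds file age down to whole days, so "+7" means 8 days or more.
purge_old_messages() {
    dir=$1
    days=${2:-7}
    find "$dir" -maxdepth 1 -type f -mtime +"$days" -exec rm -f {} +
}

# Example nightly job (path assumed from this thread; run as the filtering user):
#   sa-learn --spam /home/bosse/Maildir/.MissedSpam/cur
#   purge_old_messages /home/bosse/Maildir/.MissedSpam/cur 14
```

Learning first and purging second means a message is always seen by sa-learn at least once before it can be deleted.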


I have ham, spam, oldham, and oldspam entries for my learning process
done via IMAP folders. Once a night spam is learned. When the spam or
ham folder gets more than, say, a dozen entries I move them over to oldspam
or oldham respectively. That way I keep my old learn database around so
I can rebuild it. Of course, I manually train ONLY. There's none of that
silly autolearn happening here. It's too prone to going off in wild
strange new directions orthogonal (at the very least) to good sense.
Again, YMMV and IMAO liberally apply to the above statements.
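
The move to oldspam/oldham described above is done by hand through IMAP. If the folders happen to be Maildir directories on disk (an assumption, as is the helper name rotate_if_full), the same rotation might be sketched like this:

```shell
#!/bin/sh
# rotate_if_full: hypothetical helper. When SRC holds more than LIMIT
# regular files (default 12), move all of them to DST so they remain
# available for rebuilding the Bayes database later.
rotate_if_full() {
    src=$1
    dst=$2
    limit=${3:-12}
    # Arithmetic expansion strips any padding wc may add around the count.
    count=$(( $(find "$src" -maxdepth 1 -type f | wc -l) ))
    if [ "$count" -gt "$limit" ]; then
        find "$src" -maxdepth 1 -type f -exec mv {} "$dst"/ \;
    fi
}

# Example (directory names are placeholders, not real paths from this thread):
#   rotate_if_full "$MAILDIR/.spam/cur" "$MAILDIR/.oldspam/cur" 12
```

Run it after the nightly learn so that nothing is rotated out before it has been trained on.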

{^_^}
