Quoting Bryan Hoover <[EMAIL PROTECTED]>:

> The spamassassin run won't be able to use Bayes for testing a mail, as
> the debug output says, until there's 200 each of spam, ham. And though
> I've only used sa-learn for Bayes training, I assume the linked spamtrap
> outline is sound, Bayes learning as expected :) - handy to know
> spamassassin options provide for this.
>
> Ham/spam limits are a factor in terms of using the Bayes analysis in
> detection, not training.
>
> And obviously, the sa-learn run will learn what's sent it.

Taking a closer look, the number of spams in the database is growing from
general filtering (it's now at 30, as opposed to 19 earlier). However, I
don't believe the 'learning' is adding to the database, only regular spam
filtering. When I send a spam message to the spamtrap, I note the following
in the log:

1. The database gets opened in R/O mode and never gets opened in R/W mode.
2. SA says it can't use the Bayes database because it has fewer than 200
   messages.
3. The database files are not touched after the message has been "learned",
   i.e. the datestamps on all the files in the .spamassassin directory
   haven't changed.
4. When I fire 2 or 3 spams at the spamtrap one after another, the number
   of spams in the log doesn't increment. For example, I bounced a spam to
   the spamtrap and the log said "only 30 spam(s) in bayes DB". I sent
   another one moments later, and the next log entry still says 30.

The logs do indeed indicate that it's using the correct database. I would
have assumed that, regardless of whether it's using the database during
the learn process, it would need to add the information about the spam
that it's "learning" into the database if it's gonna do any good...?

*shrug*

regards,

Paul

Quoting Bryan Hoover <[EMAIL PROTECTED]>:

> Paul Fielding wrote:
> >
> > Quoting Bryan Hoover <[EMAIL PROTECTED]>:
> >
> > > You could set these scripts' spamassassin, sa-learn commands with -D,
> > > and use standard error redirection to a text file. The output will tell
> > > you which Bayes database it's using. You'd see such like:
> >
> > I did this and learned a few things.
> > The following went to the log:
> >
> > debug: using "/home/spamtrap/.spamassassin" for user state dir
> > debug: using "/home/spamtrap/.spamassassin/user_prefs" for user prefs file
> > debug: bayes: 2391 tie-ing to DB file R/O /home/sharedspam/.spamassassin/bayes_toks
> > debug: bayes: 2391 tie-ing to DB file R/O /home/sharedspam/.spamassassin/bayes_seen
> > debug: bayes: found bayes db version 2
> > debug: bayes: Not available for scanning, only 19 spam(s) in Bayes DB < 200
> > debug: bayes: 2391 untie-ing
> > debug: bayes: 2391 untie-ing db_toks
> > debug: bayes: 2391 untie-ing db_seen
> > debug: Score set 1 chosen.
> > debug: Initialising learner
> >
> > The good thing is that it appears I'm hitting the correct bayes
> > database. The bit I don't really understand is the part about not being
> > available for scanning. I do understand from reading that the bayes
> > database is most effective when it's learned a large volume of
> > messages. But how can I have it learn messages if it ignores the spam
> > I'm trying to give it to learn? It looks to me like it's opening the
> > database in Read-Only mode (R/O?), decides the db is too small, and
> > releases the database. Nothing gets written to the database, so I
> > assume nothing gets learned.
>
> The spamassassin run won't be able to use Bayes for testing a mail, as
> the debug output says, until there's 200 each of spam, ham. And though
> I've only used sa-learn for Bayes training, I assume the linked spamtrap
> outline is sound, Bayes learning as expected :) - handy to know
> spamassassin options provide for this.
>
> Ham/spam limits are a factor in terms of using the Bayes analysis in
> detection, not training.
>
> And obviously, the sa-learn run will learn what's sent it.
>
> As a follow-up check, you should see the corresponding ham/spam database
> counts increase with each test you run.
>
> Bryan
>
> > Is it perhaps that the 19 spams it's referring to are spams that the
> > regular rules have caught since I set up the database? If so, then is
> > it fair for me to assume that once it's caught 200+ spams via regular
> > rules then it will start actually using the bayes database and allow
> > me to teach it?
> >
> > Thanks muchly for the help...
> >
> > regards,
> >
> > Paul
> >
> > Quoting Bryan Hoover <[EMAIL PROTECTED]>:
> >
> > > Paul Fielding wrote:
> > > >
> > > > I recently set up a shared database with spamtrap and hamtrap
> > > > accounts, as per:
> > > >
> > > > http://www.stearns.org/doc/spamassassin-setup.current.html#autoreporting
> > > >
> > > > You can see the details of the procmail and local.cf files at the
> > > > link above, but the short story is that the database is in
> > > > /home/sharedspam/.spamassassin, and accounts spamtrap and hamtrap
> > > > have their .spamassassin dir linked back to it.
> > > >
> > > > /etc/procmail points everyone to the shared database, and the
> > > > .procmail scripts for spamtrap and hamtrap take the incoming mail,
> > > > process it through spamassassin and sa-learn to teach the spam and
> > > > ham, and then dump the messages into another folder for me.
> > >
> > > You could set these scripts' spamassassin, sa-learn commands with -D,
> > > and use standard error redirection to a text file. The output will tell
> > > you which Bayes database it's using. You'd see such like:
> > >
> > > debug: bayes: 1890297 tie-ing to DB file R/O /home/Bryan/.spamassassin/bayes_toks
> > > debug: bayes: 1890297 tie-ing to DB file R/O /home/Bryan/.spamassassin/bayes_seen
> > >
> > > Bryan
> > >
> > > -------------------------------------------------------
> > > The SF.Net email is sponsored by EclipseCon 2004
> > > Premiere Conference on Open Tools Development and Integration
> > > See the breadth of Eclipse activity. February 3-5 in Anaheim, CA.
> > > http://www.eclipsecon.org/osdn
> > > _______________________________________________
> > > Spamassassin-talk mailing list
> > > [EMAIL PROTECTED]
> > > https://lists.sourceforge.net/lists/listinfo/spamassassin-talk
> >
> > ----------
> > [EMAIL PROTECTED]
> > http://www.fielding.ca

-------------------------------------------------
This mail sent through IMP: http://horde.org/imp/

-------------------------------------------------------
The SF.Net email is sponsored by EclipseCon 2004
Premiere Conference on Open Tools Development and Integration
See the breadth of Eclipse activity. February 3-5 in Anaheim, CA.
http://www.eclipsecon.org/osdn
_______________________________________________
Spamassassin-talk mailing list
[EMAIL PROTECTED]
https://lists.sourceforge.net/lists/listinfo/spamassassin-talk
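
The log checks described in this message (which mode the Bayes DB is tied in, and whether the reported spam count moves between two runs) can be partly automated. What follows is only a minimal sketch in Python, assuming the -D output was redirected to a text file as suggested in the thread. The function names are hypothetical, and the R/W line format is an assumption: it is taken to mirror the R/O lines quoted in the thread, which never show an R/W tie.

```python
import re

# Patterns modelled on the debug lines quoted in this thread.
# ASSUMPTION: a read/write tie is logged in the same shape as the R/O lines.
TIE_RE = re.compile(r"tie-ing to DB file\s+(R/O|R/W)")
COUNT_RE = re.compile(r"only (\d+) (spam|ham)\(s\) in Bayes DB")


def summarize_debug(text):
    """Return the tie modes seen and the last reported spam/ham counts."""
    modes = set(TIE_RE.findall(text))
    counts = {}
    for number, kind in COUNT_RE.findall(text):
        counts[kind] = int(number)
    return {"modes": modes, "counts": counts}


def learned_between(before, after, kind="spam"):
    """True if the reported count grew between two debug captures."""
    old = summarize_debug(before)["counts"].get(kind)
    new = summarize_debug(after)["counts"].get(kind)
    return old is not None and new is not None and new > old


# Sample input copied from the debug output quoted in the thread.
SAMPLE = """\
debug: bayes: 2391 tie-ing to DB file R/O /home/sharedspam/.spamassassin/bayes_toks
debug: bayes: 2391 tie-ing to DB file R/O /home/sharedspam/.spamassassin/bayes_seen
debug: bayes: Not available for scanning, only 19 spam(s) in Bayes DB < 200
"""

if __name__ == "__main__":
    print(summarize_debug(SAMPLE))
```

Note that the count is read from the "Not available for scanning" line, which the quoted log shows while the database holds fewer than 200 messages, so this comparison is only meaningful during that early phase.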