Re: Training spamassassin past 5,000 emails

2021-03-09 Thread Kris Deugau
RW wrote: On Tue, 09 Mar 2021 08:52:28 -0500 Steve Dondley wrote: I will also be allowing users to flag their own spam using the roundcube webmail client. If you do that you should review the submissions. This. SO much this. ALL THE THIS. If you're using the "Mark as Junk" or "Mark as Jun

Re: Training spamassassin past 5,000 emails

2021-03-09 Thread RW
On Tue, 09 Mar 2021 08:52:28 -0500 Steve Dondley wrote: > On 2021-03-09 08:42 AM, RW wrote: > > > > If you keep a full archive of what's been trained. I think it makes > > sense to trim out old mail occasionally and recreate the database - > > particularly if it's a single user Bayes. > > I

Re: Training spamassassin past 5,000 emails

2021-03-09 Thread Bill Cole
On 9 Mar 2021, at 7:49, Steve Dondley wrote: I've read through https://spamassassin.apache.org/full/3.1.x/doc/sa-learn.html which states that "anything over about 5000 messages does not improve accuracy significantly in our tests." Did you read the section on expiration? https://spamassassi

Re: Training spamassassin past 5,000 emails

2021-03-09 Thread Steve Dondley
On 2021-03-09 08:28 AM, Greg Troxel wrote: Steve Dondley writes: I've read through https://spamassassin.apache.org/full/3.1.x/doc/sa-learn.html which states that "anything over about 5000 messages does not improve accuracy significantly in our tests." I would take that with a grain of salt.

Re: Training spamassassin past 5,000 emails

2021-03-09 Thread RW
On Tue, 09 Mar 2021 07:49:38 -0500 Steve Dondley wrote: > I've read through > https://spamassassin.apache.org/full/3.1.x/doc/sa-learn.html which > states that "anything over about 5000 messages does not improve > accuracy significantly in our tests." > > So once I hit 5,000, what do I do? Do I run -

Re: Training spamassassin past 5,000 emails

2021-03-09 Thread Greg Troxel
Steve Dondley writes: > I've read through > https://spamassassin.apache.org/full/3.1.x/doc/sa-learn.html which > states that "anything over about 5000 messages does not improve > accuracy significantly in our tests." I would take that with a grain of salt. Based on my experience running SA fo

Training spamassassin past 5,000 emails

2021-03-09 Thread Steve Dondley
I've read through https://spamassassin.apache.org/full/3.1.x/doc/sa-learn.html which states that "anything over about 5000 messages does not improve accuracy significantly in our tests." So once I hit 5,000, what do I do? Do I run --forget on, say, the 500 oldest emails, delete those from my ham/spa

Re: Training Spamassassin

2006-08-24 Thread Edward Diener
John D. Hardin wrote: On Thu, 24 Aug 2006, Edward Diener wrote: Is this true ? Am I supposed to be putting copies of messages which Spamassassin has not marked as spam and which are not spam into my 'ham-to-learn' folder, as opposed to messages which Spamassassin has erroneously marked as spam

Re: Training Spamassassin

2006-08-24 Thread John D. Hardin
On Thu, 24 Aug 2006, Edward Diener wrote: > Is this true ? Am I supposed to be putting copies of messages > which Spamassassin has not marked as spam and which are not spam > into my 'ham-to-learn' folder, as opposed to messages which > Spamassassin has erroneously marked as spam ? That is true.

Training Spamassassin

2006-08-24 Thread Edward Diener
For my IMAP mail account my e-mail host has set up Spamassassin to be automatically trained by using 'spam-to-learn' and 'ham-to-learn' IMAP folders for my mailbox on the server. I had assiduously been moving messages not already marked as [SPAM] by Spamassassin into the 'spam-to-learn' folder

RE: more questions on training spamassassin

2006-04-25 Thread Webmaster
> -Original Message- > From: Matt Kettler [mailto:[EMAIL PROTECTED] > Sent: April 25, 2006 11:30 AM > To: [EMAIL PROTECTED] > Cc: users@spamassassin.apache.org > Subject: Re: more questions on training spamassassin > > Webmaster wrote: > > In my setup, t

RE: more questions on training spamassassin

2006-04-25 Thread Webmaster
> -Original Message- > From: Theo Van Dinter [mailto:[EMAIL PROTECTED] > Sent: April 25, 2006 10:36 AM > To: Spamassassin Users List > Subject: Re: more questions on training spamassassin > > On Tue, Apr 25, 2006 at 10:30:47AM -0700, Webmaster wrote: >

Re: more questions on training spamassassin

2006-04-25 Thread Matt Kettler
Webmaster wrote: > In my setup, the server running spamassassin is different than the server > delivering the final e-mail. This means a few extra headers will be added > by the time the clients see the e-mail. If I were to take this e-mail and > train spamassassin, it is no longer in the form th

Re: more questions on training spamassassin

2006-04-25 Thread Theo Van Dinter
On Tue, Apr 25, 2006 at 10:30:47AM -0700, Webmaster wrote: > In my setup, the server running spamassassin is different than the server > delivering the final e-mail. This means a few extra headers will be added > by the time the clients see the e-mail. > So my question is, is it even worthwhile to

more questions on training spamassassin

2006-04-25 Thread Webmaster
In my setup, the server running spamassassin is different than the server delivering the final e-mail. This means a few extra headers will be added by the time the clients see the e-mail. If I were to take this e-mail and train spamassassin, it is no longer in the form that spamassassin sees orig

Re: training SpamAssassin without updating bayes*

2006-03-05 Thread Theo Van Dinter
On Sat, Mar 04, 2006 at 09:56:14PM -0500, Gabriel Wachman wrote: > During training I run: > sa-learn --dbpath $WORKDIR --ham $DATADIR/$message_dir > (likewise for spam) > > During testing I run: > spamassassin -t -p $PREFSPATH $DATADIR/$message_dir You may want to look into mass-check. It's much

Re: training SpamAssassin without updating bayes*

2006-03-05 Thread mouss
Gabriel Wachman a écrit : > > Yes. I know it may sound strange from some people's perspective, but > there are good reasons we need to do it this way. We are comparing > several spam filters; in order to make claims about the performance of > any of the filters we need to evaluate a _fixed_ classi

Re: training SpamAssassin without updating bayes*

2006-03-05 Thread Theo Van Dinter
On Sat, Mar 04, 2006 at 10:50:19PM -0500, Daryl C. W. O'Shea wrote: > Even with bayes_auto_learn disabled, the tokens' atimes are still > updated. That's the way SpamAssassin works. That's what helps > SpamAssassin's bayes implementation in being effective. Well, sort of. The atime updates ar

Re: training SpamAssassin without updating bayes*

2006-03-05 Thread Gabriel Wachman
Daryl C. W. O'Shea wrote: On 04/03/06 09:56 PM, Gabriel Wachman wrote: A colleague and I are writing a paper about a spam filter he developed. We'd like to compare it against various open source filters, including SpamAssassin. The methodology we are using is to train the filter on a set of mes

Re: training SpamAssassin without updating bayes*

2006-03-05 Thread jdow
From: "mouss" <[EMAIL PROTECTED]> Gabriel Wachman a écrit : A colleague and I are writing a paper about a spam filter he developed. We'd like to compare it against various open source filters, including SpamAssassin. The methodology we are using is to train the filter on a set of messages, and

Re: training SpamAssassin without updating bayes*

2006-03-05 Thread mouss
Gabriel Wachman a écrit : > A colleague and I are writing a paper about a spam filter he developed. > We'd like to compare it against various open source filters, including > SpamAssassin. The methodology we are using is to train the filter on a > set of messages, and then test it on an independent

Re: training SpamAssassin without updating bayes*

2006-03-04 Thread Daryl C. W. O'Shea
On 04/03/06 09:56 PM, Gabriel Wachman wrote: A colleague and I are writing a paper about a spam filter he developed. We'd like to compare it against various open source filters, including SpamAssassin. The methodology we are using is to train the filter on a set of messages, and then test it on a

Re: training SpamAssassin without updating bayes*

2006-03-04 Thread Loren Wilton
> During testing, I can see spamassassin create a "bayes_journal" file and > write to it continuously. I understand this is spamassassin's way of If the journal is only growing it isn't being learned from. Typically at some point if auto-learn were enabled one of the spam mail runs would take som

training SpamAssassin without updating bayes*

2006-03-04 Thread Gabriel Wachman
A colleague and I are writing a paper about a spam filter he developed. We'd like to compare it against various open source filters, including SpamAssassin. The methodology we are using is to train the filter on a set of messages, and then test it on an independent set of messages. The key is that

RE: question on training spamassassin

2006-03-02 Thread Webmaster
> -Original Message- > From: Matt Kettler [mailto:[EMAIL PROTECTED] > Sent: March 2, 2006 8:53 AM > To: [EMAIL PROTECTED] > Cc: users@spamassassin.apache.org > Subject: Re: question on training spamassassin > > Webmaster wrote: > > >> Also if your

Re: question on training spamassassin

2006-03-02 Thread Matt Kettler
Webmaster wrote: >> Also if your users are only or mostly forwarding spam, SA's >> bayes is going to have a bayes bias that all messages >> forwarded by your mail clients are spam, regardless of content. >> >> > > Does this also mean that it is almost useless to share bayes from > one server t

RE: question on training spamassassin

2006-03-02 Thread Bowie Bailey
Webmaster wrote: > From: Matt Kettler [mailto:[EMAIL PROTECTED] > > > > Jeff Portwine wrote: > > > Hmm.. I don't quite understand this. At my company, we forward > > > any spam that gets through to [EMAIL PROTECTED] and any ham marked > > > as spam to [EMAIL PROTECTED] ... this was set up long

RE: question on training spamassassin

2006-03-01 Thread Webmaster
> -Original Message- > From: Matt Kettler [mailto:[EMAIL PROTECTED] > Sent: February 27, 2006 5:18 PM > To: Jeff Portwine > Cc: [EMAIL PROTECTED]; users@spamassassin.apache.org > Subject: Re: question on training spamassassin > > Jeff Portwine wrote: > >

Re: question on training spamassassin

2006-02-28 Thread jdow
- Original Message - From: "Loren Wilton" <[EMAIL PROTECTED]> To: ; "'Theo Van Dinter'" <[EMAIL PROTECTED]> Cc: Sent: Tuesday, February 28, 2006 1:09 AM Subject: Re: question on training spamassassin Thanks. I read the wiki. This is unfortunate because

Re: question on training spamassassin

2006-02-28 Thread Kris Deugau
Webmaster wrote: This is unfortunate because many clients are still using this client: "Microsoft Outlook Express: It does not appear to have a redirect option" Hmm. OE is actually one of the better clients for retrieving a "true" copy of the original message that was downloaded via POP3. A

Re: question on training spamassassin

2006-02-28 Thread Jeff Portwine
- Original Message - From: "Loren Wilton" <[EMAIL PROTECTED]> To: ; "'Theo Van Dinter'" <[EMAIL PROTECTED]> Cc: Sent: Tuesday, February 28, 2006 1:09 AM Subject: Re: question on training spamassassin Thanks. I read the wiki. This is unf

Re: question on training spamassassin

2006-02-27 Thread Loren Wilton
> Thanks. I read the wiki. > This is unfortunate because many clients are still using this client: > "Microsoft Outlook Express: It does not appear to have a redirect option" True statement, but not necessarily important. I'm running OE, and I have spam and ham mb's set up on the Linux box as im

Re: question on training spamassassin

2006-02-27 Thread Matt Kettler
Jeff Portwine wrote: > Hmm.. I don't quite understand this. At my company, we forward any > spam that gets through to [EMAIL PROTECTED] and any ham marked as spam to > [EMAIL PROTECTED] ... this was set up long ago before I even started > working here and the spam filter worked really well. Re

RE: question on training spamassassin

2006-02-27 Thread Webmaster
> -Original Message- > From: Theo Van Dinter [mailto:[EMAIL PROTECTED] > Sent: February 27, 2006 11:18 AM > To: users@spamassassin.apache.org > Subject: Re: question on training spamassassin > > On Mon, Feb 27, 2006 at 02:14:22PM -0500, Jeff Portwine wrote: >

RE: question on training spamassassin

2006-02-27 Thread Webmaster
> -Original Message- > From: Matt Kettler [mailto:[EMAIL PROTECTED] > Sent: February 27, 2006 11:30 AM > To: [EMAIL PROTECTED] > Cc: users@spamassassin.apache.org > Subject: Re: question on training spamassassin > > Webmaster wrote: > > A large numbe

Re: question on training spamassassin

2006-02-27 Thread Jeff Portwine
e it does work for us, but you're saying it shouldn't ? - Original Message - From: "Matt Kettler" <[EMAIL PROTECTED]> To: <[EMAIL PROTECTED]> Cc: Sent: Monday, February 27, 2006 2:29 PM Subject: Re: question on training spamassassin Webmaster wrote: A

Re: question on training spamassassin

2006-02-27 Thread Matt Kettler
Webmaster wrote: > A large number of our clients are using POP. > If I were to ask them to send false negatives to [EMAIL PROTECTED] > and false positives to [EMAIL PROTECTED] so I can place them in > a folder and train, does that hinder the training process in > any way knowing that the header

Re: question on training spamassassin

2006-02-27 Thread Theo Van Dinter
On Mon, Feb 27, 2006 at 02:14:22PM -0500, Jeff Portwine wrote: > I'm a SA newbie myself, but I believe I've read that all the headers, etc, > are stripped before the learning takes place, so it should work fine for > you to have your users go ahead and do that for training. > > Somebody here wi

Re: question on training spamassassin

2006-02-27 Thread Jeff Portwine
- Original Message - From: "Webmaster" <[EMAIL PROTECTED]> To: Sent: Monday, February 27, 2006 1:45 PM Subject: question on training spamassassin A large number of our clients are using POP. If I were to ask them to send false negatives to [EMAIL PROTECTED] and false positives to [EMAIL

question on training spamassassin

2006-02-27 Thread Webmaster
A large number of our clients are using POP. If I were to ask them to send false negatives to [EMAIL PROTECTED] and false positives to [EMAIL PROTECTED] so I can place them in a folder and train, does that hinder the training process in any way knowing that the header info is changed with the f

RE: Manually training SpamAssassin by forwarding mail

2005-02-04 Thread Joe Polk
" <[EMAIL PROTECTED]> Sent: Fri, 4 Feb 2005 19:47:40 +0100 Subject: RE: Manually training SpamAssassin by forwarding mail > > -Original Message- > > From: Stuart Johnston [mailto:[EMAIL PROTECTED] > > Sent: Friday, February 04, 2005 7:35 PM > > To: Peter Ma

RE: Manually training SpamAssassin by forwarding mail

2005-02-04 Thread Sander Holthaus - Orange XL
> -Original Message- > From: Stuart Johnston [mailto:[EMAIL PROTECTED] > Sent: Friday, February 04, 2005 7:35 PM > To: Peter Marshall; SpamAssassin Users > Subject: Re: Manually training SpamAssassin by forwarding mail > > Peter Marshall wrote: >

Re: Manually training SpamAssassin by forwarding mail

2005-02-04 Thread Stuart Johnston
Peter Marshall wrote: Stuart Johnston wrote: Peter Marshall wrote: Kevin Sullivan wrote: --On 02/03/05 01:59:21 +0100 Sander Holthaus - Orange XL wrote: I've been interested in offering customers to manually train the SpamAssassin Bayes filter for ham and spam (to reduce false positives and

Re: Manually training SpamAssassin by forwarding mail

2005-02-04 Thread Peter Marshall
Stuart Johnston wrote: Peter Marshall wrote: Kevin Sullivan wrote: --On 02/03/05 01:59:21 +0100 Sander Holthaus - Orange XL wrote: I've been interested in offering customers to manually train the SpamAssassin Bayes filter for ham and spam (to reduce false positives and negatives). However, I

RE: Manually training SpamAssassin by forwarding mail

2005-02-04 Thread Sander Holthaus - Orange XL
> -Original Message- > From: Stuart Johnston [mailto:[EMAIL PROTECTED] > Sent: Friday, February 04, 2005 5:20 PM > To: users@spamassassin.apache.org > Cc: Peter Marshall > Subject: Re: Manually training SpamAssassin by forwarding mail > > Peter Marshall wrote:

Re: Manually training SpamAssassin by forwarding mail

2005-02-04 Thread Stuart Johnston
Peter Marshall wrote: Kevin Sullivan wrote: --On 02/03/05 01:59:21 +0100 Sander Holthaus - Orange XL wrote: I've been interested in offering customers to manually train the SpamAssassin Bayes filter for ham and spam (to reduce false positives and negatives). However, I can only find document

RE: Manually training SpamAssassin by forwarding mail

2005-02-04 Thread Sander Holthaus - Orange XL
> --On 02/04/05 16:08:53 +0100 Sander Holthaus - Orange XL wrote: > > Basically, I've got two options. All mail that is received > is backed up > > on the mailserver before adding any headers. I could match > those with > > mail received in the spam-learn and ham-learn accounts. > However, mail

RE: Manually training SpamAssassin by forwarding mail

2005-02-04 Thread Kevin Sullivan
--On 02/04/05 16:08:53 +0100 Sander Holthaus - Orange XL wrote: Basically, I've got two options. All mail that is received is backed up on the mailserver before adding any headers. I could match those with mail received in the spam-learn and ham-learn accounts. However, mail is backed up only for a

RE: Manually training SpamAssassin by forwarding mail

2005-02-04 Thread Sander Holthaus - Orange XL
> --On 02/04/05 09:17:55 -0400 Peter Marshall wrote: > > My question is the same as Henrik, I have a bunch of email that is > > spam (either tagged by spam assassin or not tagged at all). > I forwarded > > it as an attachment to a "spam" mail box. What do I have to do now > > before I can get b

Re: Manually training SpamAssassin by forwarding mail

2005-02-04 Thread Kevin Sullivan
--On 02/04/05 09:17:55 -0400 Peter Marshall wrote: My question is the same as Henrik, I have a bunch of email that is spam (either tagged by spam assassin or not tagged at all). I forwarded it as an attachment to a "spam" mail box. What do I have to do now before I can get bayes to learn the messag

Re: Manually training SpamAssassin by forwarding mail

2005-02-04 Thread Peter Marshall
Kevin Sullivan wrote: --On 02/03/05 01:59:21 +0100 Sander Holthaus - Orange XL wrote: I've been interested in offering customers to manually train the SpamAssassin Bayes filter for ham and spam (to reduce false positives and negatives). However, I can only find documentation to this for loca

Re: Manually training SpamAssassin by forwarding mail

2005-02-04 Thread Kevin Sullivan
--On 02/03/05 01:59:21 +0100 Sander Holthaus - Orange XL wrote: I've been interested in offering customers to manually train the SpamAssassin Bayes filter for ham and spam (to reduce false positives and negatives). However, I can only find documentation to this for local mailboxes and IMAP. M

RE: Manually training SpamAssassin by forwarding mail

2005-02-03 Thread Sander Holthaus - Orange XL
> At 07:59 PM 2/2/2005, Sander Holthaus - Orange XL wrote: > >I've been interested in offering customers to manually > train the > >SpamAssassin Bayes filter for ham and spam (to reduce false > positives > >and negatives). However, I can only find documentation to this for > >local mailb

Re: Manually training SpamAssassin by forwarding mail

2005-02-03 Thread Matt Kettler
At 07:59 PM 2/2/2005, Sander Holthaus - Orange XL wrote: I've been interested in offering customers to manually train the SpamAssassin Bayes filter for ham and spam (to reduce false positives and negatives). However, I can only find documentation to this for local mailboxes and IMAP. Most

Re: Manually training SpamAssassin by forwarding mail

2005-02-03 Thread Will Yardley
On Thu, Feb 03, 2005 at 01:59:21AM +0100, Sander Holthaus - Orange XL wrote: > I've been interested in offering customers to manually train the > SpamAssassin Bayes filter for ham and spam (to reduce false positives and > negatives). However, I can only find documentation to this for local >

Manually training SpamAssassin by forwarding mail

2005-02-03 Thread Sander Holthaus - Orange XL
I've been interested in offering customers to manually train the SpamAssassin Bayes filter for ham and spam (to reduce false positives and negatives). However, I can only find documentation to this for local mailboxes and IMAP. Most users, however, retrieve their mail through POP and us