Re: shifting the midpoint between the average spam and average ham

2004-09-05 Thread Justin Mason
Joe Flowers writes: > > You make a valid point in that, if graphed separately, ham and spam should show up as two separate curves on a graph. > > However, there *is* overlap, > Yes, I expect overlap or SA would be perfect with no FPs or FNs

Re: shifting the midpoint between the average spam and average ham

2004-09-04 Thread Joe Flowers
> You make a valid point in that, if graphed separately, ham and spam should show up as two separate curves on a graph. > However, there *is* overlap, Yes, I expect overlap or SA would be perfect with no FPs or FNs. > and spam and ham (separately, or together) scores are *not* normally distribut

Re: shifting the midpoint between the average spam and average ham

2004-09-04 Thread Joe Flowers
My anti-spam system design went something like this (I integrated NetMail running on Novell NetWare with SpamAssassin running on SuSE or Red Hat Linux): 1. To me, it seems like most of the "action" in SpamAssassin (by default) occurs around the Mail::SpamAssassin::PerMsgStatus::get_hits = 5.0

Re: shifting the midpoint between the average spam and average ham

2004-09-04 Thread Joe Emenaker
Justin Mason wrote: that sounds pretty cool. suggestion: get it to record what rules hit and what those rules' scores were. Actually, I'm already doing it for Bayes. When I turned off autolearning and went solely with manual-training of the Bayes db, I was interested to see if Bayes, alone, w

Re: shifting the midpoint between the average spam and average ham

2004-09-03 Thread Justin Mason
Joe Emenaker writes: > Joe Flowers wrote: > >> If your "spread" is good and it's just the threshold that needs adjusting, it would be trivial to make a rule that fires on every message and give it a score equal to the desired differen

Re: shifting the midpoint between the average spam and average ham

2004-09-03 Thread Joe Emenaker
Joe Flowers wrote: If your "spread" is good and it's just the threshold that needs adjusting, it would be trivial to make a rule that fires on every message and give it a score equal to the desired difference... Thanks Pierre. That may be what I have to do, if no one has a better idea. Actually,

Re[2]: shifting the midpoint between the average spam and average ham

2004-09-03 Thread Robert Menschel
Hello Joe, Friday, September 3, 2004, 7:01:12 AM, you wrote: >> why do you need to alter the average scores of ham/spam? JF> What a horrible horrible mess if we can't! Sorry, I don't understand. JF> One example: All of my users have set their "optimal" spam thresholds to some number b

Re: shifting the midpoint between the average spam and average ham

2004-09-03 Thread Nix
On Fri, 03 Sep 2004, Joe Flowers yowled: > When I say "ham and spam curves", I'm envisioning 2 bell curves on the same graph, significantly separated - I hope, and SA automatically/continually keeping "5.0" sitting right in the middle between their peaks. The GA (in 2.x) or perceptron (in

RE: shifting the midpoint between the average spam and average ham

2004-09-03 Thread Bret Miller
> I'm fairly new to using SA (I was using a pure Razor2 setup until recently), and this is the first mention I've heard of a GA to adjust the scores on the rules. Can you point me to any documentation of this? I've checked the website, and I don't see anything there. Actually, it's part

Re: shifting the midpoint between the average spam and average ham scores back to 5.0

2004-09-03 Thread Ryan Thompson
Joe Flowers wrote to users@spamassassin.apache.org: Help please! If the average spam score of all of my ham messages is 1.0 and the average spam score of all of my spam messages is 3.0, then what is the best way to move the average_of_these_two_averages (2.0) back up to 5.0? The result being th

Re: shifting the midpoint between the average spam and average ham

2004-09-03 Thread Ulysses Cruz
On Fri, Sep 03, 2004 at 10:29:32AM -0400, Matt Kettler whispered: > SA's scores are assigned by a genetic algorithm that evolves out the best scores for all the rules as one gigantic simultaneous equation. It tunes this equation to get the most email correctly placed into the spam and ham

Re: shifting the midpoint between the average spam and average ham

2004-09-03 Thread Joe Flowers
If your "spread" is good and it's just the threshold that needs adjusting, it would be trivial to make a rule that fires on every message and give it a score equal to the desired difference... Thanks Pierre. That may be what I have to do, if no one has a better idea. BUT, that does imply that I

Re: shifting the midpoint between the average spam and average ham

2004-09-03 Thread Matt Kettler
At 10:01 AM 9/3/2004 -0400, Joe Flowers wrote: One example: All of my users have set their "optimal" spam thresholds to some number between 0.0 and 10.0. If the SA developers correctly shift around test scores, add new and/or improved algorithms, etc., and I need to take advantage of the latest,

Re: shifting the midpoint between the average spam and average ham

2004-09-03 Thread Duncan Findlay
On Fri, Sep 03, 2004 at 10:09:38AM -0400, Pierre Thomson wrote: > If your "spread" is good and it's just the threshold that needs adjusting, it would be trivial to make a rule that fires on every message and give it a score equal to the desired difference... Or multiply all the scores by a c
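
The two suggestions quoted here, adding a constant to every message's score versus multiplying every score by a constant, behave differently. A rough sketch in Python, assuming the averages from the original question (ham 1.0, spam 3.0) and a target midpoint of 5.0; `shift_scores` and `scale_scores` are hypothetical helper names for illustration, not SpamAssassin functions:

```python
def shift_scores(scores, offset):
    """Add the same constant to every score (like a rule that always fires)."""
    return [s + offset for s in scores]


def scale_scores(scores, factor):
    """Multiply every score by the same constant."""
    return [s * factor for s in scores]


# Assumed inputs: average ham score and average spam score from the question.
averages = [1.0, 3.0]
target = 5.0
midpoint = sum(averages) / len(averages)  # 2.0

shifted = shift_scores(averages, target - midpoint)  # offset of 3.0
scaled = scale_scores(averages, target / midpoint)   # factor of 2.5

print(shifted)  # [4.0, 6.0]
print(scaled)   # [2.5, 7.5]
```

Both put the midpoint at 5.0. Shifting keeps the gap between the two averages at 2.0, while scaling by 2.5 widens it to 5.0.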

Re: shifting the midpoint between the average spam and average ham

2004-09-03 Thread Joe Flowers
Hey Steve, I was hoping not to do it that way because besides putting the human mistake-prone factor back in, it skews and warps the heck out of the spam and ham curves that the SA developers have worked so hard to get near perfect and trumps their priceless knowledge and experience. When I say

RE: shifting the midpoint between the average spam and average ham

2004-09-03 Thread Pierre Thomson
Sent: Friday, September 03, 2004 10:01 AM To: users@spamassassin.apache.org Subject: Re: shifting the midpoint between the average spam and average ham > why do you need to alter the average scores of ham/spam? What a horrible horrible mess if we can't! One example: All of

Re: shifting the midpoint between the average spam and average ham scores back to 5.0

2004-09-03 Thread Steve Bertrand
> Help please! > If the average spam score of all of my ham messages is 1.0 and the average spam score of all of my spam messages is 3.0, then what is the best way to move the average_of_these_two_averages (2.0) back up to 5.0? > The result being that I need my current average score for

Re: shifting the midpoint between the average spam and average ham

2004-09-03 Thread Joe Flowers
> why do you need to alter the average scores of ham/spam? What a horrible horrible mess if we can't! One example: All of my users have set their "optimal" spam thresholds to some number between 0.0 and 10.0. If the SA developers correctly shift around test scores, add new and/or improved algorit

Re: shifting the midpoint between the average spam and average ham scores back to 5.0

2004-09-03 Thread Adam Lanier
Joe Flowers wrote: | Help please! | If the average spam score of all of my ham messages is 1.0 and the average spam score of all of my spam messages is 3.0, then what is the best way to move the average_of_these_two_averages (2.0) back up to 5.0?

shifting the midpoint between the average spam and average ham scores back to 5.0

2004-09-03 Thread Joe Flowers
Help please! If the average spam score of all of my ham messages is 1.0 and the average spam score of all of my spam messages is 3.0, then what is the best way to move the average_of_these_two_averages (2.0) back up to 5.0? The result being that I need my current average score for ham messages
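
For the numbers in this question, the constant needed is simply the target minus the current midpoint. A small sketch in Python; `midpoint_offset` is a hypothetical name used only for illustration and is not part of SpamAssassin:

```python
def midpoint_offset(avg_ham, avg_spam, target=5.0):
    """Return the constant to add to every score so that the midpoint
    between the two class averages lands on the target."""
    current_midpoint = (avg_ham + avg_spam) / 2.0
    return target - current_midpoint


# Averages taken from the question: ham 1.0, spam 3.0.
offset = midpoint_offset(1.0, 3.0)
print(offset)                      # 3.0
print(1.0 + offset, 3.0 + offset)  # 4.0 6.0
```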