Re: shifting the midpoint between the average spam and average ham

Joe Emenaker 3 Sep 2004 22:56:16 -0000

Joe Flowers wrote:

If your "spread" is good and it's just the threshold that needs adjusting, it would be trivial to make a rule that fires on every message and give it a score equal to the desired difference...

Thanks Pierre. That may be what I have to do, if no one has a better idea.

Actually, what this discussion has inspired me to do is to investigate the idea of having a script auto-adjust each user's spam_threshold.

Currently, I've got a setup where users have two trash folders: one for spam, one for ham. Every hour, a cron job runs sa-learn on the contents of those folders. However, something *else* it does is record each message to a "spamlog", which holds the SA spam score and whether the user felt that it was spam.

Originally, I did it so that I could give users personalized values in a page which would look like this (http://fruitpie.blastpoint.com/~jemenake/spamreport.cgi). However, after reading this thread, I think I'm deciding that this isn't necessary. The user can just indicate what their desired level of false-positives or false-negatives is. Then my hourly script, after it runs sa-learn and updates the spamlog, could run some stats on the updated spamlog and figure out the best spam_threshold to achieve the user's desired FP or FN rate.
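
To make the "run some stats" step concrete, here is a rough sketch in Python. It assumes the spamlog can be read as a list of (score, is_spam) pairs and that a message is flagged as spam when its score is at or above the threshold. The function names and the log layout are made up for illustration; the real spamlog format may well differ.

```python
from typing import List, Tuple

# Hypothetical in-memory form of the spamlog: (SA score, user said it was spam).
SpamLog = List[Tuple[float, bool]]


def rates(log: SpamLog, threshold: float) -> Tuple[float, float]:
    """Return (false-positive rate, false-negative rate) at a threshold.

    Assumption: a message is flagged as spam when score >= threshold.
    FP rate is the share of ham that gets flagged; FN rate is the share
    of spam that slips through.
    """
    ham = [score for score, is_spam in log if not is_spam]
    spam = [score for score, is_spam in log if is_spam]
    fp = sum(1 for score in ham if score >= threshold)
    fn = sum(1 for score in spam if score < threshold)
    fp_rate = fp / len(ham) if ham else 0.0
    fn_rate = fn / len(spam) if spam else 0.0
    return fp_rate, fn_rate


def threshold_for_fp_rate(log: SpamLog, max_fp_rate: float) -> float:
    """Return the lowest threshold whose FP rate on the log is <= max_fp_rate.

    Choosing the lowest qualifying threshold keeps false negatives as low
    as the FP limit allows. Only scores seen in the log are tried, plus
    one value just above the highest score (which flags nothing).
    """
    if not log:
        raise ValueError("spamlog is empty")
    if max_fp_rate < 0:
        raise ValueError("max_fp_rate must not be negative")
    scores = sorted({score for score, _ in log})
    candidates = scores + [scores[-1] + 0.1]
    for threshold in candidates:
        fp_rate, _ = rates(log, threshold)
        if fp_rate <= max_fp_rate:
            return threshold
    return candidates[-1]  # not reached: the last candidate has zero FPs


if __name__ == "__main__":
    # Made-up data, only to show the call.
    sample_log = [(1.0, False), (2.5, False), (4.0, False), (6.0, False),
                  (3.5, True), (5.5, True), (8.0, True), (12.0, True)]
    best = threshold_for_fp_rate(sample_log, max_fp_rate=0.25)
    print(best, rates(sample_log, best))
```

The same loop could target an FN rate instead by walking the candidates from highest to lowest and checking the second value returned by rates(). With only a handful of logged messages per user the result will jump around a lot, so it is probably worth requiring a minimum number of entries before changing anyone's threshold.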

I'd suggest that you pursue something like that... (or you can wait for me to write mine)

- Joe

--
When freedom gives way to tyranny, it is not because tyranny comes
dressed as a wolf. Rather, it comes dressed as a shepherd,
pointing out other wolves. Go *read* the Patriot Act.

