On Sun, Feb 19, 2006 at 02:20:05AM -0500, Matt Kettler wrote:
> >> How can we keep the spam tagged, and try to mitigate the FPs by keeping
> >> additive scores for multiple URIBLs more moderate? +20 worth of URIBL
> >> hits is fine on spam, but astronomically high scores don't really help
> >> SA when the tagging threshold is +5. However, they do hurt SA when
> >> overlapping mistakes happen.
> 
> Yes.. which is exactly whom I was primarily trying to reach by posting
> here on the spamassassin list, before this turned into a large
> misunderstanding between the URIBL operators and myself.
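For concreteness, the kind of moderation being asked about above could look like a cap on the combined URIBL contribution, so overlapping list hits can't push a borderline mail far past the +5 tagging threshold on their own.  This is purely a hypothetical sketch (the cap value, rule names, and the cap mechanism itself are made up; nothing like this exists as an SA option):

```python
# Hypothetical post-processing cap (NOT an actual SpamAssassin feature):
# limit the combined contribution of URIBL-family rules while leaving
# other rules' scores untouched.

URIBL_CAP = 8.0  # illustrative value, not a real SA setting

def capped_total(rule_scores, uribl_rules):
    """Sum all rule scores, but cap the URIBL-family portion."""
    uribl_part = sum(s for r, s in rule_scores.items() if r in uribl_rules)
    other_part = sum(s for r, s in rule_scores.items() if r not in uribl_rules)
    return other_part + min(uribl_part, URIBL_CAP)

# Three overlapping URIBL hits plus one Bayes hit (all names/scores invented):
hits = {"URIBL_A": 4.0, "URIBL_B": 4.0, "URIBL_C": 4.0, "BAYES_99": 3.5}
total = capped_total(hits, {"URIBL_A", "URIBL_B", "URIBL_C"})
# the 12.0 of URIBL hits is capped at 8.0, so total = 8.0 + 3.5 = 11.5
```

The mail still tags well above threshold, but an overlapping mistake across several lists costs a bounded amount rather than an astronomical one.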

I have two things related to this:

1- if the lists are indeed separate (ie: different sources, etc.),
   then having multiple rules makes sense.

2- the end result when generating scores is only as good as the input
   provided.  so if your specific flow of mail has a lot of FPs for various
   rules, you ought to get involved in at least the score generation
   mass-check runs, and preferably the nightly runs (so we'd be able to
   deal with FPs and such during development instead of just by adjusting
   the score).  this philosophy hasn't changed much in the time I've been
   working with SA.

   during score generation, rules that commonly hit the same mails
   (ie: high overlap) tend to have lower individual scores.  this is
   especially true if the ham hit rate is non-zero on the rules.
   however, if you look at the STATISTICS* files, the SURBL rules all
   have a fairly low ham rate which led the perceptron to give the rules
   higher-than-average scores.  so my guess is that the perceptron didn't
   see the issue that has been discussed, or didn't see it enough to have a
   large impact on scores (attempts are made to lower the FP rate, but a >0
   rate is still likely.)
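A toy illustration of the overlap effect (rule names and scores are invented, and this is a simplification of what the perceptron actually does): when two rules always co-fire, only the *sum* of their scores affects any message, so score generation is free to split one effective score between them, and each rule individually ends up lower than a single equivalent rule would.

```python
# Toy model: a message's score is the sum of scores of the rules that hit.
def message_score(hits, scores):
    """Total score for a message given the set of rules that hit it."""
    return sum(scores[r] for r in hits)

# SURBL_A and SURBL_B (hypothetical names) always hit the same mails.
spam_hits = [{"SURBL_A", "SURBL_B"}, {"SURBL_A", "SURBL_B", "OTHER"}]

# Two different ways to distribute +4.0 across the overlapping pair:
split_evenly   = {"SURBL_A": 2.0, "SURBL_B": 2.0, "OTHER": 1.5}
split_lopsided = {"SURBL_A": 3.5, "SURBL_B": 0.5, "OTHER": 1.5}

# Every message gets the identical total either way, so the trainer
# cannot distinguish the splits -- individual scores stay moderate.
for hits in spam_hits:
    assert message_score(hits, split_evenly) == message_score(hits, split_lopsided)
```

With a near-zero ham hit rate, though, there's little pressure to keep the combined contribution down, which is consistent with the higher-than-average SURBL scores in the STATISTICS* files.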

   related to this, I mentioned earlier in the thread a bug I found
   in the reuse section of mass-check while generating some statistics.
   we used the reuse code to generate the 3.1 scores.  however, due
   to the bug, rule hits were lost.  so it's hard to say exactly what
   occurred because of it, but the scores generated for network tests
   (those that enabled reuse anyway) are almost certainly miscalculated,
   and potentially very miscalculated (see the same previous post about
   the "way different" SURBL WS rule hits that I found).


We're trying to get updates going for 3.1, and I'm hoping to get scores
generated more frequently after that's set up.  Perhaps the next set of
scores will address your issue more directly?  Is the problem that there
weren't a large number of FPs in the past, and now there are?

-- 
Randomly Generated Tagline:
"My job is like an airplane pilot's -- when I'm doing it well, you might
 not even notice me, but my mistakes are often quite spectacular."
                                 - Unknown
