On 24 Jul 2003, Raul Dias wrote:

> Something I found interesting was using Razor + Pyzor + Dcc.
> Then I create meta rules that match:
> Razor + Pyzor
> Razor + Dcc
> Pyzor + Dcc
> Razor + Pyzor + Dcc (this will also, of course, match all three before).
This is definitely an interesting idea! But is it a good one? Perhaps (indeed probably), but one has to answer the following question before one can be sure. Can anyone here answer it? (I've made the question a bit wordy but the idea is simple.)

The idea above basically says "make it much more likely to kill the email if it's on more than one spam blacklist site". We have already established in this thread that it is probably *not* a good idea in general to say "if razor says it's spam with probability > 90% then it's spam", because razor can make mistakes, or perhaps be tricked into making mistakes. Indeed the _point_ of spamassassin is that it gives you a whole host of other tests on top of razor. Similarly we should not say "if pyzor says it's spam then it's spam", and so on.

So we are aware that each of Razor, Pyzor and Dcc is capable of making mistakes. [By "mistake" I mean here "saying it's spam when it's not"; I'm not getting into the issue of saying it's not spam when it is.]

So it seems to me that the _key_ issue here is: are razor, pyzor and dcc making mistakes with the _same_ emails? Does anyone know enough about the decision processes in more than one of these systems to be able to state confidently that their errors are essentially independent? If they are independent then maybe the idea above is good. But if they are not, then the idea above might make things worse.

Here's a concrete example. Let's take an email that razor mistakenly says is spam. *Given that this has happened*, what are the chances that pyzor mistakenly says it's spam? If the chances are much, much higher than the usual chance that pyzor mistakenly calls a ham email spam, then you probably do _not_ want to give "razor + pyzor" a high score at all, because it will lead to more false positives.
But if the chance that pyzor makes a mistake is roughly the same whether or not razor has made one, then giving razor+pyzor a high score is a terrific idea.

I raise this question here because I have no idea what algorithms razor, pyzor and dcc use. I'm basically asking "are they the same?" E.g. are they all using spamassassin with razor, pyzor and dcc turned off? :-) That would be catastrophic!

Kevin

PS Sorry to go on for so long. I'm just making a simple observation about conditional probabilities, but I think the answer is important for deciding whether the suggestion above is "valid".

_______________________________________________
Spamassassin-talk mailing list
[EMAIL PROTECTED]
https://lists.sourceforge.net/lists/listinfo/spamassassin-talk
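To make the conditional-probability point concrete, here is a toy calculation. It is only a sketch under made-up assumptions and says nothing about how razor, pyzor or dcc actually decide: the function name `conditional_fp`, the "shared cause" model and all the numbers are hypothetical. In the model, a service wrongly flags a ham message either because of a cause common to both services (say, a legitimate bulk mailing that the same users report everywhere) or because of its own independent error.

```python
def conditional_fp(p_shared, p_own):
    """Toy model of two checksum services A and B looking at a ham message.

    p_shared: probability of a shared cause that makes BOTH services
              flag the message (hypothetical parameter).
    p_own:    probability that a service flags the message through its
              own independent error when no shared cause is present.

    Returns (P(A flags), P(A and B flag), P(B flags | A flags)).
    """
    # A flags if the shared cause fires, or else through its own error.
    p_single = p_shared + (1 - p_shared) * p_own
    # Both flag if the shared cause fires, or else both err independently.
    p_both = p_shared + (1 - p_shared) * p_own * p_own
    return p_single, p_both, p_both / p_single


# Case 1: errors are independent (no shared cause at all).
independent = conditional_fp(p_shared=0.0, p_own=0.01)

# Case 2: nearly the same single-service error rate, but most errors
# come from a cause that both services share.
correlated = conditional_fp(p_shared=0.009, p_own=0.001)

for label, (single, both, cond) in (("independent", independent),
                                    ("correlated", correlated)):
    print(f"{label:12s} P(A)={single:.4f}  P(A and B)={both:.6f}  "
          f"P(B | A)={cond:.3f}")
```

In both cases each service alone is wrong on about 1% of ham. With independent errors, the chance that the second service is also wrong stays at about 0.01, so a "razor + pyzor" hit is strong evidence. With the shared cause, that chance rises to about 0.9, so the combined rule adds almost nothing beyond a single hit and a high score for it would mostly add false positives.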