John Andersen wrote:
On Thursday 12 October 2006 14:54, John Rudd wrote:
That rule has a 3.2 value because the 3.2 value is
accurate at differentiating spam vs ham in the corpus. Therefore, the
score is appropriate.
No, it's not accurate.
The rule is indiscriminate as to content. It flags ham with the same score
as spam. Therefore by definition it is indiscriminate, and thus useless
for predicting ham vs spam.
Zero that rule's score, and your false positives will fall, but your false
negatives will not increase. The rule unfairly targets ham.
That's completely untrue, comparing my ham and spam. Not just mostly
untrue, but absolutely and completely untrue. I got two ham messages
that hit this rule (but only this rule, which is not enough to filter on),
and nearly every spam either hit this rule or was picked up by SPF or DKIM
rules (it was a forged mail from a domain which had a postmaster).
And John, there are metrics used to test this. Implement the testing
environment for yourself, and come up with real metrics before saying
this kind of absolute-statement-no-caveat nonsense.
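
As a rough illustration of the kind of metrics meant here, the sketch below computes a rule's hit rate on spam and on ham, a spam/overall ratio, and the false positive and false negative counts with the rule scored and with it zeroed. It is not SpamAssassin's own tooling: the rule name RULE_X, the OTHER rule, the Message type and the seven-message corpus are all made up for the example, and the 5.0 threshold is an assumed required score.

```python
from dataclasses import dataclass, field


@dataclass
class Message:
    is_spam: bool                              # hand-verified label
    hits: dict = field(default_factory=dict)   # rule name -> score


def rule_stats(corpus, rule):
    """Return (spam hit rate, ham hit rate, spam/overall ratio) for one rule."""
    spam = [m for m in corpus if m.is_spam]
    ham = [m for m in corpus if not m.is_spam]
    spam_rate = sum(rule in m.hits for m in spam) / len(spam)
    ham_rate = sum(rule in m.hits for m in ham) / len(ham)
    total = spam_rate + ham_rate
    # Near 1.0: hits mostly spam. Near 0.5: cannot tell spam from ham.
    ratio = spam_rate / total if total else None
    return spam_rate, ham_rate, ratio


def error_counts(corpus, threshold=5.0, zeroed=None):
    """Count (false positives, false negatives), optionally zeroing one rule."""
    fp = fn = 0
    for m in corpus:
        score = sum(s for r, s in m.hits.items() if r != zeroed)
        flagged = score >= threshold
        if flagged and not m.is_spam:
            fp += 1
        elif not flagged and m.is_spam:
            fn += 1
    return fp, fn


# Made-up toy corpus; RULE_X and OTHER are hypothetical rule names.
corpus = [
    Message(True, {"RULE_X": 3.2, "OTHER": 2.0}),
    Message(True, {"RULE_X": 3.2, "OTHER": 4.0}),
    Message(True, {"OTHER": 6.0}),
    Message(False, {"RULE_X": 3.2}),
    Message(False, {}),
    Message(False, {"OTHER": 1.0}),
    Message(False, {}),
]

spam_rate, ham_rate, ratio = rule_stats(corpus, "RULE_X")
with_rule = error_counts(corpus)
without_rule = error_counts(corpus, zeroed="RULE_X")
print(f"spam {spam_rate:.2f}  ham {ham_rate:.2f}  ratio {ratio:.2f}")
print("FP/FN with rule:", with_rule, " zeroed:", without_rule)
```

In this toy data the ham message that hits RULE_X stays under the threshold, and zeroing the rule turns two caught spam into false negatives. Only a run over a real, hand-classified corpus says which side of the argument the numbers support.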
--
Jo Rhett
Network/Software Engineer
Net Consonance