Daniel Quinlan <[EMAIL PROTECTED]> [2002-09-27 21:58:16 -0700]:
> You need to GA score the RBL rules to achieve a good FP:FN ratio.
> Without GA scoring of the RBLs, you will raise your FPs too much
> because the rest of the GA scores are tuned to achieve a good FP:FN
> ratio.

I disagree that manually weighting the RBL data at a small value will cause false positives. RBL data is effectively _extra_ information: not strictly required, but helpful. I believe the GA should therefore be trained in "offline" mode, without RBL information, so that it reaches the best FP:FN ratio possible on the content of the message alone.

RBLs are really too nonlinear in discrimination space. They are either dead on or dead wrong, as the complaints of collateral damage have shown. When they are wrong, they do not make the resulting messages any more linearly separable; they only add noise to the GA scores.

> Optimally, we would have different scores for when SA is running
> local-only versus local+network, but that improvement is much further
> off (if ever).

Too much work. Don't waste too much time there.

Bob
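
To make the argument above concrete, here is a minimal sketch of what I mean by a small manual RBL weight on top of content-only scores. The rule names, the score values, the 0.5 weight, and the 5.0 threshold are all made up for illustration; none of them are real SA rules or GA output.

```python
# Hypothetical content-rule scores, standing in for values the GA
# would produce when trained "offline" (no RBL tests at all).
CONTENT_SCORES = {
    "SUBJ_ALL_CAPS": 1.2,
    "FREE_MONEY": 3.5,
    "HTML_ONLY": 1.1,
}

# Small hand-assigned RBL weight (an assumed value, not tuned).
RBL_WEIGHT = 0.5

# Assumed spam threshold.
THRESHOLD = 5.0


def score(hits, rbl_listed):
    """Sum the content scores, then add the small RBL bonus if listed."""
    total = sum(CONTENT_SCORES[name] for name in hits)
    if rbl_listed:
        total += RBL_WEIGHT
    return total


def is_spam(hits, rbl_listed):
    """Classify a message from its content hits and RBL status."""
    return score(hits, rbl_listed) >= THRESHOLD
```

With these numbers, an RBL hit can only flip a message whose content score is already within 0.5 of the threshold. A message that looks clean on content stays clean even when the RBL is dead wrong, which is the whole point of keeping the weight small.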