Daniel Quinlan <[EMAIL PROTECTED]> [2002-09-27 21:58:16 -0700]:
> You need to GA score the RBL rules to achieve a good FP:FN ratio.
> Without GA scoring of the RBLs, you will raise your FPs too much
> because the rest of the GA scores are tuned to achieve a good FP:FN
> ratio.

I disagree that manually weighting the RBL data at a small value will
cause false positives.  Effectively RBL data is _extra_ information,
not strictly required but helpful.

Then I believe the GA should be trained in "offline" mode without RBL
information.  Then it will have the best FP:FN ratio possible on the
content of the message.  RBLs are really too nonlinear in
discrimination space: they are either dead on or dead wrong, as has
been noted in complaints of collateral damage.  When they are wrong
they do not make the resulting messages more linearly separable, and
they add noise to the GA scores.
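
To make the idea concrete, here is a minimal sketch in Python of what I
mean.  The rule names, the content scores, and the 0.5 RBL weight are all
made up for illustration; they are not real SpamAssassin rules or real GA
output.  The content scores stand in for what a GA run without RBL data
might produce, and the RBL weight is a small value assigned by hand on top.

```python
# Illustrative sketch only: rule names, scores, and weights are hypothetical.

THRESHOLD = 5.0    # assumed spam threshold
RBL_WEIGHT = 0.5   # small, hand-assigned score per RBL hit (assumption)

# Stand-in for scores a GA might produce when trained on content rules only.
CONTENT_SCORES = {
    "SUBJ_ALL_CAPS": 1.2,
    "HTML_ONLY": 0.8,
    "MONEY_BACK": 2.4,
    "REMOVE_ME_LINK": 1.1,
}


def score(content_hits, rbl_hits=0):
    """Sum the GA-tuned content scores, then add the manual RBL weight."""
    content = sum(CONTENT_SCORES[rule] for rule in content_hits)
    return content + RBL_WEIGHT * rbl_hits


def is_spam(content_hits, rbl_hits=0):
    """A message is flagged when its total score reaches the threshold."""
    return score(content_hits, rbl_hits) >= THRESHOLD


if __name__ == "__main__":
    # A mostly clean message from a wrongly listed relay stays below threshold.
    print(is_spam(["HTML_ONLY"], rbl_hits=1))
    # A borderline message is only tipped over by the extra RBL evidence.
    borderline = ["MONEY_BACK", "SUBJ_ALL_CAPS", "REMOVE_ME_LINK"]
    print(is_spam(borderline), is_spam(borderline, rbl_hits=1))
```

With numbers like these, a wrong RBL listing cannot flag a message by
itself; it only matters when the content score is already near the
threshold.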

> Optimally, we would have different scores for when SA is running
> local-only versus local+network, but that improvement is much further
> off (if ever).

Too much work.  Don't waste too much time there.


