-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

Rune Kristian Viken writes:
> On Wednesday 14 September 2005 18:34, Bret Miller wrote:
> >> We're in need of checking parts of our outgoing email for
> >> spam (read: we've got unknown webmail users.. huge lots of them,
> >> actually.. and some of them have this annoying habit of sending
> >> Nigeria spam)
> >>
> >> [considering network tests useless, Bayes excellent, but feels the
> >> default weighting may be useless]
> >>
> >> How do we re-weight the rules, and does anyone have any good
> >> suggestions on which checks to use? Also, checking for certain
> >> blacklisted URLs in the messages will probably help (Someone recommended
> >> SURBL for this) .. but I think a re-weighting will still be in order.
> >
> > I'd be inclined to try the SARE fraud rules (see www.rulesemporium.com)
> > in addition to the SA internal and bayes tests.
>
> Excellent suggestion! I think we'll try those.
>
> > If you find that doesn't give you a high enough score, pushing the
> > BAYES_99 score a little higher might be in order.
>
> That was what I was thinking about. Others have mentioned local.cf, which
> of course is a good thing (and we've already looked at that, it's covered
> quite well in the docs). What I was thinking was using the
> 'masses/corpus'-things to generate our own weightings, trying to tune
> SpamAssassin for our particular use-case. Not sure if they're meant for
> that, though - and very unsure on how to do that. I've not been able to dig
> that up through the docs. If it's a bad idea - please do not hesitate to
> point it out.
>
> Also, David B Funk suggested using -L, indicating "No network tests". As
> mentioned, I'm considering using SURBL. Is it possible to still use SURBL
> with -L? The docs say this is "Use local tests only (no DNS)" and that
> seems to be off the mark.

I think you *do* want to use SURBL, in which case -L would not be
recommended, since -L turns off all network tests, including the DNS
lookups that SURBL relies on.

One possible approach is to collect some data, namely:

  - a selection of "good" nonspam outgoing mail
  - a selection of "bad" outgoing spam attempts

If you can do this, you can then build a corpus of mails to test against
and manually tweak scores.  I don't think you need to go to the bother of
generating an entirely new score-set; it should be possible to do this
with just a little manual tweaking.

Bayes will definitely be helpful, too, and that corpus will provide
training data.

- --j.
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.2.5 (GNU/Linux)
Comment: Exmh CVS

iD8DBQFDKajqMJF5cimLx9ARAsS8AKCFU7W92G6S7yd0oLpAa1GCggl6LwCdFLnf
pS/Rt0JvWYKPO3ExKLrfWAE=
=w2kB
-----END PGP SIGNATURE-----
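
P.S. To make the "test against a corpus and manually tweak scores" idea a bit
more concrete, here is a rough sketch in Python. It is not part of
SpamAssassin and does not call it. It assumes you have already recorded,
for each message in the two collections, which rules fired. The rule names
are used only as labels (LOCAL_FRAUD_PHRASE is a hypothetical local rule),
and the hit lists, the scores and the 5.0 threshold are made-up
illustrative values, not recommended settings.

```python
# Hypothetical sketch: compare two hand-edited score tables against a tiny
# hand-labelled corpus. All data below is invented for illustration.

def total_score(hits, scores):
    """Sum the scores of the rules that fired on one message."""
    return sum(scores.get(rule, 0.0) for rule in hits)

def evaluate(ham, spam, scores, threshold=5.0):
    """Return (false_positives, false_negatives) for a score table."""
    fp = sum(1 for hits in ham if total_score(hits, scores) >= threshold)
    fn = sum(1 for hits in spam if total_score(hits, scores) < threshold)
    return fp, fn

# Each inner list holds the rules that fired on one message (assumed data).
ham = [["BAYES_00"], ["HTML_MESSAGE"], ["BAYES_50", "HTML_MESSAGE"]]
spam = [
    ["BAYES_99"],
    ["BAYES_99", "HTML_MESSAGE"],
    ["BAYES_99", "LOCAL_FRAUD_PHRASE"],
]

# Starting score table (assumed values) and a copy with BAYES_99 pushed up.
base = {
    "BAYES_00": -2.0,
    "BAYES_50": 0.5,
    "BAYES_99": 3.5,
    "HTML_MESSAGE": 0.1,
    "LOCAL_FRAUD_PHRASE": 2.0,
}
tweaked = dict(base, BAYES_99=5.0)

for name, table in (("base", base), ("tweaked", tweaked)):
    fp, fn = evaluate(ham, spam, table)
    print(f"{name}: false positives={fp}, false negatives={fn}")
```

With this toy data, raising the one score stops the missed spam without
flagging any of the good mail. On a real corpus the point is the same:
change one score at a time and watch both counts before keeping the change.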