-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

Hello maarten,

Friday, November 7, 2003, 3:09:20 PM, you wrote:

mvdB> Following up to myself, since I want to clarify something here...

mvdB> Another aspect that is relevant to me (but arguably not to most
mvdB> users of SA and I'm aware of that...) is that for me, english is
mvdB> not my native language, neither am I a resident of an
mvdB> english-speaking country.

Reaaly? You communicaate well in this laanguage. (Sorry, :-) )

mvdB> And because of this, my email is mixed; one part is dutch, one part
mvdB> is all the mailinglists I try to follow (which are in english).
mvdB> But not being a resident, the fact is that for all of my customers
mvdB> and myself, ANY mail mentioning mortgages, loans, ejaculation et al
mvdB> is a surefire sign of spam. If not it would have mentioned
mvdB> hypotheken, leningen and klaarkomen, which are the dutch
mvdB> translations. :-)

Knowing I'll sound like a broken record (I wonder if my granddaughter
will ever hear a broken record...),

a) this is what Bayes is perfect for. Teach it that messages with these
   English words are spam, and it'll do the job for you.

b) the scores are built on a general corpus, not a Dutch corpus.

Here where we speak and type Californian, I have made some changes to
the default scoring rules. Example:

> score BAD_CREDIT 3.00 # increased 06/04/03, 06/16, decr 07/09

I increased this score in June because it wasn't catching enough spam
for me (my threshold is 9, so I do need to increase some rule scores
significantly). I increased it again two weeks later, apparently too
much, and so three weeks later I decreased it. I've been happy with the
3.0 score for four months now.

You can do the same with any rule which scores English-specific spam too
low. Change the score for your system. That flexibility is one of SA's
strengths.

I've got SA flagging 99.8% and more of all spam, thanks to Bayes,
network tests, modified scores, Stearns' blacklists, and personalized
rules.
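
The score and rule adjustments described above might look something like the following local.cf fragment. This is only a sketch under stated assumptions: BAD_CREDIT, its 3.00 score, and the threshold of 9 are the values quoted in this message, while the LOCAL_EN_* rule names, their patterns, and their scores are hypothetical and made up purely for illustration. The `required_hits` directive is the SA 2.x spelling; later releases call it `required_score`.

```
# local.cf sketch: site-specific overrides (all values are illustrative)

# Spam threshold. 9 is the value quoted in this message, not a recommendation.
required_hits 9

# Raise a stock rule that scores too low for local mail.
score BAD_CREDIT 3.00

# Hypothetical local rules for English spam vocabulary that never shows up
# in legitimate mail at a Dutch-language site. Names and scores are assumed.
body     LOCAL_EN_MORTGAGE  /\bmortgages?\b/i
describe LOCAL_EN_MORTGAGE  English "mortgage" in body (local rule)
score    LOCAL_EN_MORTGAGE  2.5

body     LOCAL_EN_LOAN      /\bloans?\b/i
describe LOCAL_EN_LOAN      English "loan" in body (local rule)
score    LOCAL_EN_LOAN      2.0
```

Running `spamassassin --lint` after an edit like this should flag any syntax errors before the new scores go live. Starting with modest scores and adjusting them over a few weeks, as with BAD_CREDIT above, keeps a single keyword from pushing legitimate list mail over the threshold.
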
Most individual users see no more than one spam email a week, and many
go a month or two without seeing any spam. You can do the same, and it
won't take too long for you to get there once you start making progress.

mvdB> Well, none of this is your concern of course. But I would really
mvdB> really really like if there was a way to have those typical english
mvdB> spam-words score way higher than they do now.

You've looked at the various rules. You know which rules frustrate you
the most. Create your own scoreset, and modify accordingly.

mvdB> Could we maybe envision two rulesets, one for english-speaking
mvdB> residents and one for non-english speaking residents...?

See http://www.exit0.us/index.php/GermanRules and
http://www.exit0.us/index.php/BrazilianRules -- join the party.

mvdB> In other words, a lot of us get bitten by the fact that "mortgage" in some
mvdB> countries, in some contexts can be non-spam but for the rest of us it is a
mvdB> surefire sign to be spam. ...

Perhaps you can work together with Wolfram and the others, and compose a
reasonable alternative scoreset for non-English-language destinations?

Bob Menschel

-----BEGIN PGP SIGNATURE-----
Version: PGP 8.0

iQA/AwUBP6xns5ebK8E4qh1HEQLsswCgwyHlAezFn6/cqmTDTgpVb9nYowkAnj2J
Au48DZyNf+WbDOQ7SIW/9kM3
=rV3i
-----END PGP SIGNATURE-----

_______________________________________________
Spamassassin-talk mailing list
[EMAIL PROTECTED]
https://lists.sourceforge.net/lists/listinfo/spamassassin-talk