-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

Hello maarten,

Friday, November 7, 2003, 3:09:20 PM, you wrote:

mvdB> Following up to myself, since I want to clarify something here...

mvdB> Another aspect that is relevant to me (but arguably not to most
mvdB> users of SA and I'm aware of that...) is that for me, english is
mvdB> not my native language, neither am I a resident of an
mvdB> english-speaking country.

Reaaly?  You communicaate well in this laanguage.  (Sorry, :-)  )

mvdB> And because of this, my email is mixed; one part is dutch, one part
mvdB> is all the mailinglists I try to follow (which are in english).
mvdB> But not being a resident, the fact is that for all of my customers
mvdB> and myself, ANY mail mentioning mortgages, loans, ejaculation et al
mvdB> is a surefire sign of spam.  If not it would have mentioned
mvdB> hypotheken, leningen and klaarkomen, which are the dutch
mvdB> translations. :-)

Knowing I'll sound like a broken record (I wonder if my granddaughter
will ever hear a broken record...),

a) this is what Bayes is perfect for. Teach it that messages with these
English words are spam, and it'll do the job for you.

b) the scores are built on a general corpus, not a Dutch corpus. Here
where we speak and type Californian, I have made some changes to the 
default scoring rules. Example:
> score   BAD_CREDIT  3.00  # increased 06/04/03, 06/16, decr 07/09
I increased this score in June because it wasn't catching enough spam for
me (my threshold is 9, so I do need to increase some rule scores
significantly). I increased it again two weeks later, apparently too
much, and so three weeks later I decreased it. I've been happy with the
3.0 score for four months now.

You can do the same with any rule which scores English-specific spam too
low. Change the score for your system. That flexibility is one of SA's
strengths.
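
A minimal sketch of how such overrides might be kept in a site-wide
local.cf (or a per-user user_prefs). BAD_CREDIT and its 3.00 score come
from the example above; the second rule name is a hypothetical
placeholder, not a real rule, and the file location is an assumption that
varies by installation.

```
# local.cf (often /etc/mail/spamassassin/local.cf; location is an assumption)

# From the example above: raised twice in June, lowered again in July.
score   BAD_CREDIT  3.00  # increased 06/04/03, 06/16, decr 07/09

# Hypothetical placeholder: substitute a rule that actually fires on the
# English-language spam you receive, and pick a score that suits your
# own threshold.
score   SOME_ENGLISH_ONLY_RULE  4.00  # raised <date>, <reason>
```

Keeping a dated comment next to each override, as in the BAD_CREDIT line,
makes it easy to see later which changes went too far. Running
"spamassassin --lint" after editing should report any syntax problems.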

I've got SA flagging 99.8% or more of all spam, thanks to Bayes, network
tests, modified scores, Stearns' blacklists, and personalized rules. Most
individual users see no more than one email of spam a week, and many go a
month or two without seeing any spam.

You can do the same, and it won't take too long for you to get there once
you start making progress.

mvdB> Well, none of this is your concern of course. But I would really
mvdB> really really like if there was a way to have those typical english
mvdB> spam-words score way higher than they do now.

You've looked at the various rules. You know which rules frustrate you
the most. Create your own scoreset, and modify accordingly.
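
Where no existing rule covers a word you care about, a local rule is
another option. The following is only a sketch: the rule names
(LOCAL_EN_MORTGAGE, LOCAL_EN_LOANS) and the scores are hypothetical, chosen
for illustration, and would need tuning against your own mail.

```
# Local rules for a site where English financial terms are almost never
# legitimate. Names and scores are illustrative assumptions.

body      LOCAL_EN_MORTGAGE  /\bmortgages?\b/i
describe  LOCAL_EN_MORTGAGE  Body contains the English word "mortgage"
score     LOCAL_EN_MORTGAGE  2.5

body      LOCAL_EN_LOANS     /\bloans?\b/i
describe  LOCAL_EN_LOANS     Body contains the English word "loan"
score     LOCAL_EN_LOANS     1.5
```

Because these rules would also fire on English mailing-list traffic that
happens to mention such words, it is probably safer to start with modest
scores and raise them gradually. If the rules go in a per-user user_prefs
file behind spamd, they are normally ignored unless the administrator has
set allow_user_rules.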

mvdB> Could we maybe envision two rulesets, one for english-speaking
mvdB> residents and one for non-english speaking residents...?

See http://www.exit0.us/index.php/GermanRules and
http://www.exit0.us/index.php/BrazilianRules -- join the party.

mvdB> In other words, a lot of us get bitten by the fact that "mortgage"
mvdB> in some countries, in some contexts can be non-spam but for the
mvdB> rest of us it is a surefire sign to be spam.  ...

Perhaps you can work together with Wolfram and the others, and compose a
reasonable alternative scoreset for non-English-language destinations?

Bob Menschel

-----BEGIN PGP SIGNATURE-----
Version: PGP 8.0

iQA/AwUBP6xns5ebK8E4qh1HEQLsswCgwyHlAezFn6/cqmTDTgpVb9nYowkAnj2J
Au48DZyNf+WbDOQ7SIW/9kM3
=rV3i
-----END PGP SIGNATURE-----




_______________________________________________
Spamassassin-talk mailing list
[EMAIL PROTECTED]
https://lists.sourceforge.net/lists/listinfo/spamassassin-talk
