maarten van den Berg said:
>
> Following up to myself, since I want to clarify something here...
>
> Another aspect that is relevant to me (but arguably not to most users of
> SA, and I'm aware of that...) is that English is not my native language,
> and I am not a resident of an English-speaking country. Because of this,
> my email is mixed: one part is Dutch, the other part is all the mailing
> lists I try to follow (which are in English).  But since I am not a
> resident, the fact is that for all of my customers and myself, ANY mail
> mentioning mortgages, loans, ejaculation et al. is a surefire sign of
> spam.  If it were not spam, it would have mentioned hypotheken, leningen
> and klaarkomen, which are the Dutch translations. :-)
>
> Now I don't expect SA to know Dutch; that would be unfair. But what I
> would like is some way to score those English terms way higher than an
> American would or could.  For an American, mortgage does not spell spam
> per se. But for ME it does, and I can practically guarantee I will never
> get an email that mentions "mortgage" together with "you have been
> approved" which won't be spam.

At the risk of being repetitive, this is precisely the sort of thing Bayes
excels at.  Give it a shot (hopefully you have some ham'n'spam saved up
already); I think you will be pleased.
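
To make the idea concrete, here is a toy sketch of why per-user Bayes
training fits this case.  It is NOT SpamAssassin's actual code: SA's
tokenizer and the way it combines token probabilities differ, and the
sample messages and function names below are made up for illustration.
It only shows that a classifier trained on your own mail learns that
"mortgage" is spammy for you, whatever it means for an American user.

```python
import math
from collections import Counter

def tokens(text):
    """Lowercase whitespace tokens, counted once per message."""
    return set(text.lower().split())

def train(messages):
    """Return (per-token message counts, number of messages)."""
    counts = Counter()
    for message in messages:
        counts.update(tokens(message))
    return counts, len(messages)

def token_spam_prob(token, spam, ham):
    """P(spam | token) with add-one smoothing and equal priors."""
    spam_counts, n_spam = spam
    ham_counts, n_ham = ham
    p_in_spam = (spam_counts[token] + 1) / (n_spam + 2)
    p_in_ham = (ham_counts[token] + 1) / (n_ham + 2)
    return p_in_spam / (p_in_spam + p_in_ham)

def spam_probability(text, spam, ham):
    """Naive Bayes combination of the token probabilities via log-odds."""
    log_odds = 0.0
    for token in tokens(text):
        p = token_spam_prob(token, spam, ham)
        log_odds += math.log(p) - math.log(1.0 - p)
    return 1.0 / (1.0 + math.exp(-log_odds))

# Hypothetical training mail for a Dutch user: English spam, Dutch or list ham.
spam_model = train([
    "you have been approved for a mortgage",
    "low rate mortgage loans approved today",
    "cheap online pharmacy",
])
ham_model = train([
    "gratis hypotheekadvies van uw bank",
    "de vergadering is morgen",
    "spamassassin list digest about bayes rules",
])

if __name__ == "__main__":
    print(token_spam_prob("mortgage", spam_model, ham_model))
    print(spam_probability("you have been approved for a mortgage",
                           spam_model, ham_model))
    print(spam_probability("de vergadering is morgen", spam_model, ham_model))
```

With this made-up corpus, "mortgage" ends up leaning clearly toward spam
and the Dutch message leans clearly toward ham, without anyone having to
hand-edit a score file.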



> Well, none of this is your concern of course. But I would really really

Perhaps it's true that your success is not directly anyone's concern but
your own.  However, the regulars on this list are basically a buncha SA
users who are trying to improve their results and help others do the same
along the way.

> really like it if there was a way to have those typical English spam
> words score way higher than they do now.  Could we maybe envision two
> rulesets, one for English-speaking residents and one for residents of
> non-English-speaking countries...?
> I edited the score file myself, but not only is it a hard, long and
> error-prone task, but by editing it I throw away much of the valuable
> know-how that assembled that score list in the first place.  But I am
> faced with the fact that over 95% of my spam is in English and that I
> cannot sit back while the online pharmacies fly around me, so to speak.
> Put yourself in my (our, if I were speaking for all non-English
> countries) place and ask yourself this question: Would you accept a
> score of only 0.5 for a rule that says "gratis hypotheekadvies" or
> "vijf miljoen emailadressen"??  No, of course you wouldn't, because
> you'd know that a company that pretends to sell you a mortgage from
> 12000 miles away will never ever be making a genuine offer...


Knowing that there are regulars on this list whose primary language is NOT
English, anyone care to share how their setup handles English and
non-English spam?



>
> In other words, a lot of us get bitten by the fact that "mortgage" in
> some countries, in some contexts, can be non-spam, but for the rest of us
> it is a surefire sign of spam.  And again, that is not anyone's fault,
> but we should try to make SA flexible enough to accommodate this fact by
> changing the scoring.  I know you can teach SA to recognize spam in one's
> own language, but what is missing right now is a simple way to make SA
> much more immune to the abundant English spam, which arguably is by FAR
> the bulk of all spam...
>
> Kind regards,
> Maarten
>
>
> On Friday 07 November 2003 22:21, maarten van den Berg wrote:
>> On Friday 07 November 2003 18:43, Matt Kettler wrote:
>> > At 10:29 AM 11/7/2003, Maarten J H van den Berg wrote:
>> > >Sorry if this has been discussed in the past...
>> >
>> > It's been discussed many times.. It's very common for people to have a
>> > very deep misunderstanding of how SA scoring works. Most people fall
>> > into the trap of over-simplifying the problem, and simply assuming
>> > that some rule or another "must" be a good spam rule, when in fact
>> > it's not.
>
> <snip>

--
Chris Thielen

Easily generate SpamAssassin rules to catch obfuscated spam phrases:
http://www.sandgnat.com/cmos/


_______________________________________________
Spamassassin-talk mailing list
[EMAIL PROTECTED]
https://lists.sourceforge.net/lists/listinfo/spamassassin-talk