>BTW, are there any german rules? How could I help out writing some?

I don't know if there are any official rules.  It looks like German-language
spam is becoming more common, so having some rules would be a good thing.

Can you help?  Sure, if you are willing to.  There are several possible
ways.  SA is of course a volunteer project like all open source stuff.  The
SA Devs are working on setting up a rules project, although I don't think
they have it quite ready yet.

SARE is more than happy to accept contributions, and I'm sure we would be
quite happy to add a 70_german.cf or some such with language-specific rules.

Also, you can write rules and either post them here, or contribute them in
an attachment to a BZ enhancement ticket against SA Rules.  Or you could
send them directly to either me or Bob or some of the other SARE folk and we
would see that they get collected.

Now, the one problem that both SARE and the SA project have with German (or
any non-English rules) is that we need a way to test rules and see if they
are effective.  "Effective" means two things here: they hit a reasonable
amount of spam, and they don't hit very much ham.  The way you do this is
with a mass-check.
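
To make "effective" a bit more concrete, here is a rough Python sketch of
the measurement.  It is not the real mass-check tool, just the idea behind
it.  The function name, the sample phrase and the tiny in-memory corpora are
all made up for illustration; a real run would use thousands of
hand-verified messages.

```python
import re

def hit_rates(pattern, spam_msgs, ham_msgs):
    """Return (spam_hit_rate, ham_hit_rate) for a candidate regex."""
    rx = re.compile(pattern, re.IGNORECASE)
    spam_hits = sum(1 for m in spam_msgs if rx.search(m))
    ham_hits = sum(1 for m in ham_msgs if rx.search(m))
    # Guard against empty corpora so the sketch never divides by zero.
    spam_rate = spam_hits / len(spam_msgs) if spam_msgs else 0.0
    ham_rate = ham_hits / len(ham_msgs) if ham_msgs else 0.0
    return spam_rate, ham_rate

# Hypothetical toy corpora; real ones must be hand-verified.
spam = [
    "Jetzt kostenlos bestellen und sparen!",
    "Sie haben gewonnen, jetzt kostenlos bestellen",
    "Billige Uhren",
]
ham = [
    "Treffen wir uns morgen um zehn?",
    "Anbei die Rechnung vom letzten Monat.",
]

spam_rate, ham_rate = hit_rates(r"jetzt\s+kostenlos\s+bestellen", spam, ham)
print(f"spam: {spam_rate:.0%}  ham: {ham_rate:.0%}")
```

A good rule has a high first number and a second number as close to zero as
you can get it.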

But to do a useful mass-check, you need a corpus of hand-verified ham, and
of hand-verified spam.  Lots of us have this sort of thing for English rules,
because we get English-language mail and spam.  But very few of us in either
SARE or the SA devs get any quantity of valid foreign-language mail of any
sort.

So another thing you could do, or anyone else that speaks German could do,
would be to build up a corpus of verified ham and spam that can be used to
check rules.

But simply making some relatively good rules would be a good start, and
isn't that hard to do if you have your own SA setup to play with.  The only
real 'trick' is to be able to look at a message and make a fairly good guess
about what phrases or constructions will show up in spam but not in ham.
Then you write a rule for that, add it to local.cf (or user_prefs, if
running per-user rules), make sure the rule really hits on a test spam, and
then watch results for a day or so to see if it is working like it should.
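
As a rough sketch of that workflow, the Python below checks a candidate
phrase against a test message and, if it hits, prints a rule stanza you
could paste into local.cf.  The rule name LOCAL_DE_FREE_ORDER, the phrase,
the score and the helper names are all hypothetical.  The matching is also
much simpler than what SA really does (no decoding of base64 or
quoted-printable, no HTML rendering), so treat it as a quick sanity check
only.

```python
import re
from email import message_from_string

def body_hits(raw_message, pattern):
    """True if the regex matches the plain-text body of the message."""
    msg = message_from_string(raw_message)
    # Collect only text/plain parts; encoded payloads are NOT decoded here.
    parts = [p for p in msg.walk() if p.get_content_type() == "text/plain"]
    text = "\n".join(str(p.get_payload()) for p in parts)
    return re.search(pattern, text, re.IGNORECASE) is not None

def rule_stanza(name, pattern, description, score):
    """Format a body rule with its description and score as config text."""
    return "\n".join([
        f"body      {name}  /{pattern}/i",
        f"describe  {name}  {description}",
        f"score     {name}  {score}",
    ])

# Hypothetical test spam and candidate phrase.
test_spam = "Subject: Angebot\n\nJetzt kostenlos bestellen und sparen!\n"
pattern = r"jetzt\s+kostenlos\s+bestellen"

if body_hits(test_spam, pattern):
    print(rule_stanza("LOCAL_DE_FREE_ORDER", pattern,
                      "German 'order now for free' phrase", 0.1))
```

Starting with a very low score like this lets you watch what the rule hits
without it affecting delivery.  After adding the stanza, running
"spamassassin --lint" should catch syntax mistakes, and piping the test spam
through "spamassassin -t" should show whether the rule really fires.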

        Loren
