RE: Rule for Russian character sets

Michael Hutchinson Thu, 14 Feb 2008 15:20:37 -0800

> -----Original Message-----
> > > We're suddenly getting a ton of spam with koi8-r encoding...I tried
> > > to do a custom rule for it like this:
> > >
> > > header SUBJ_RUSS_CHAR           Subject =~/koi8-r/i
> > > describe SUBJ_RUSS_CHAR         has Russian char encoding
> > > score SUBJ_RUSS_CHAR            3.5
> > >
> > > The short headers for these spams look like this:
> > >
> > > Subject: [koi8-r] ??? ????
> > >
> > > The "raw" Subject header, like this:
> > >
> > > Subject: =?koi8-r?B?9/zkINDSxcTQ0snR1MnKINPFzcnOwdI=?=
> > >
> > > I would think the rule would catch it either way...what am I
> > > missing?
> >
> > I think this should work:
> >
> > header SUBJ_RUSS_CHAR           Subject:raw =~ /koi8-r/i
> 
> That did it, thanks!
>


Are we not meant to escape characters like a minus sign?

Ex:
header SUBJ_RUSS_CHAR                   Subject:raw =~ /koi8\-r/i

I would really like to trap the question marks too, just in case someone
sends a legitimate email with koi8-r in the subject (e.g. "why does email
with the koi8-r character set get tagged as spam?").
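The concern above can be checked offline. The following is only a minimal sketch: it uses Python's re module as a stand-in for the Perl regex engine (an assumption, though for patterns this simple the two should behave the same), and the variable names are made up for illustration. The two subjects are the ones quoted in this thread.

```python
import re

# Subjects taken from this thread: the raw encoded header and a
# plain-text subject that merely mentions the character set.
raw_spam = "=?koi8-r?B?9/zkINDSxcTQ0snR1MnKINPFzcnOwdI=?="
legit = "why does email with the koi8-r character set get tagged as spam?"

# The working rule's pattern, with and without the escaped hyphen.
plain = re.compile(r"koi8-r", re.IGNORECASE)
escaped = re.compile(r"koi8\-r", re.IGNORECASE)

def hits(pattern, subjects):
    """Return one True/False per subject: does the pattern match anywhere?"""
    return [bool(pattern.search(s)) for s in subjects]

plain_hits = hits(plain, [raw_spam, legit])
escaped_hits = hits(escaped, [raw_spam, legit])

# Both lists are identical, and both flag the legitimate subject too.
print(plain_hits, escaped_hits)
```

Outside a character class a hyphen is an ordinary character, so escaping it is harmless but makes no difference. Either form also hits the legitimate subject, which is the false positive worth guarding against.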

In other words, the following rule (if it worked) would be nice to use
instead:

Ex:

header SUBJ_RUSS_CHAR                   Subject:raw =~ /\=\?koi8\-r\?/

Where we could trap the equals sign and the two question marks. I have not
employed this rule because I think it's dodgy: the regexp expander over
at SARE reports a scary number of matches (2000+) for that rule, so I'm
presuming that the matching for the equals sign and the question marks
is not working properly, and that they will have to be escaped some
other way, for example using the \x1B-style hex notation, but I've had
no luck with this.

Does anyone have suggestions for matching question marks and equals
signs in one line? I would like to match everything exactly between the
double quotes:

"=?koi8-r?"

Going by the perldoc docs, I'd be using "\=\?koi8\-r\?", but I don't
want to test it on my live server because of the output of the regexp
expander utility.
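One way to try the pattern without touching the live server is a small offline check. This is a sketch only: Python's re module stands in for Perl here (an assumption), the names are hypothetical, and it says nothing about how the SARE expander counts matches.

```python
import re

# Subjects taken from this thread.
raw_spam = "=?koi8-r?B?9/zkINDSxcTQ0snR1MnKINPFzcnOwdI=?="
legit = "why does email with the koi8-r character set get tagged as spam?"

# Only the question marks are regex metacharacters here; the equals sign
# and the hyphen are literal whether or not they carry a backslash.
minimal = re.compile(r"=\?koi8-r\?", re.IGNORECASE)
over_escaped = re.compile(r"\=\?koi8\-r\?", re.IGNORECASE)

results = {
    name: (bool(pat.search(raw_spam)), bool(pat.search(legit)))
    for name, pat in [("minimal", minimal), ("over_escaped", over_escaped)]
}

# Each tuple is (matches raw encoded header, matches legitimate subject).
print(results)
```

Both spellings match the encoded header and leave the plain-text subject alone, so the extra backslashes on the equals sign and hyphen are optional. The question marks are the only characters that must be escaped, since an unescaped ? is a quantifier.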

Help anyone?

Cheers,
Mike
