Jason Baker <[EMAIL PROTECTED]> writes:
> My company is both in Korea and in Canada, so we tend to get a lot of
> collateral spam from Korean spamhouses AND legitimate mail.
>
> One point I haven't seen yet in the ruleset is that there's a law in
> Korea that UCE (or perhaps even UBE) must have a subject header
> denoting it. I don't read/speak Korean, so I have no idea what
> exactly it is, but the characters are: 광고
>
> (hope that comes through)
>
> It may be a good basis for a very focused spam rule. I've seen it
> inside both () and [], but always at the front of the line.
Here are the strings (bytes shown in hex) I found more than once inside
matching parens, square/angle brackets, and braces. A "*" means zero or
more of the preceding byte.
INDEX  COUNT  STRING                   NOTES
-----  -----  -----------------------  -------------------------------------
1      26     b1 a4 20* b0 ed          the 20 is a space
2      3      c8 ab ba b8              variant of #5 ???
3      3      bc ba c0 ce b1 a4 b0 ed  similar to #1
4      3      b1 a4 2e b0 ed           similar to #1 (2e or '.' replaces 20)
5      2      c1 a4 ba b8              variant of #2 ???
As best I can tell, your string was "SPACE ea b4 91 ea b3 a0", which
bears zero resemblance to any of the above, so I hope yours got
corrupted on the way here. (Dude, don't send unquoted binary!)
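One possible explanation, offered as an assumption rather than anything
confirmed in this thread: the bytes that arrived could be the UTF-8
encoding of those two characters, while the corpus subjects could use a
legacy Korean encoding such as EUC-KR. A minimal Python sketch to see
what each encoding produces (the choice of encodings to compare is mine):

```python
# Compare the byte sequences two common encodings produce for the two
# characters in the quoted message. Which encoding the corpus messages
# actually use is an assumption here, not something established above.
text = "\uad11\uace0"  # the two Hangul characters from the quoted message

def hex_bytes(s, encoding):
    """Return the encoded bytes of s as space-separated lowercase hex."""
    return " ".join(f"{b:02x}" for b in s.encode(encoding))

utf8_hex = hex_bytes(text, "utf-8")
euckr_hex = hex_bytes(text, "euc_kr")

print("utf-8 :", utf8_hex)
print("euc-kr:", euckr_hex)
```

If the EUC-KR line agrees with string #1 (ignoring the optional space),
then the posted characters and the corpus string would be the same word
in two different encodings rather than a corrupted message.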
Given that I can't display Korean, it's hard to know what means what.
Strings #2 and #5 look like they could be related (c1 + 7 = c8,
a4 + 7 = ab). These could all be variations in capitalization or
something like that.

Here's my first pass at a regular expression (lightly tested),
combining strings #1, #3, and #4:
header   KOREAN_UCE_SUBJECT  Subject =~ /[({[<] *(\xbc\xba\xc0\xce)?\xb1\xa4( *|\x2e)\xb0\xed *[)}\]>]/
describe KOREAN_UCE_SUBJECT  Subject has Korean unsolicited email denotation
score    KOREAN_UCE_SUBJECT  2.0
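For anyone who wants to poke at the pattern outside SpamAssassin, here
is a rough Python sketch that runs the same expression against raw
bytes. The sample subjects are made up from the table entries, not taken
from real mail, and the helper name `matches` is only for illustration.

```python
import re

# Byte-level translation of the rule's pattern. The "[" inside the first
# character class is escaped for Python; the rest is copied unchanged.
KOREAN_UCE = re.compile(
    rb"[({\[<] *(\xbc\xba\xc0\xce)?\xb1\xa4( *|\x2e)\xb0\xed *[)}\]>]"
)

def matches(subject: bytes) -> bool:
    """True if the raw Subject bytes contain the bracketed marker."""
    return KOREAN_UCE.search(subject) is not None

# Hypothetical subjects mapped to the result I would expect.
samples = {
    b"(\xb1\xa4\xb0\xed) hello": True,               # string #1, no space
    b"[\xb1\xa4 \xb0\xed] hello": True,              # string #1, one space
    b"[\xbc\xba\xc0\xce\xb1\xa4\xb0\xed] x": True,   # string #3
    b"<\xb1\xa4.\xb0\xed> hello": True,              # string #4
    b"\xb1\xa4\xb0\xed hello": False,                # no brackets at all
    b"(\xc8\xab\xba\xb8) hello": False,              # string #2, not covered
}

results = {subject: matches(subject) for subject in samples}
for subject, expected in samples.items():
    print(subject, "->", results[subject], "(expected", expected, ")")
```

Two things this makes visible: the pattern is not anchored to the start
of the subject, and it does not require the opening and closing
delimiters to be the same kind, so "(" followed by "]" would also match.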
Is this right? I don't have that many Korean messages in my spam corpus
and none in my nonspam corpus, so someone must check the meaning of this
string before I check it in. I'd like to know more about the other two
strings as well.
Dan
_______________________________________________
Spamassassin-talk mailing list
[EMAIL PROTECTED]
https://lists.sourceforge.net/lists/listinfo/spamassassin-talk