> This looks like the bunch mgm was talking about.  I have *no* spam from
> these guys, unfortunately.

I keep being surprised by how much spam differs from person to person.
These guys account for nearly 10% of mine...

>   - catching the weird "letters-numbers-letters-numbers.com" format they
>     use for their domains

They're not too consistent, but a "domain name contains n-digit number" rule
would catch most of them. Here is a sampling of their addresses from a
couple of weeks' spam:

easyclstroffrzdaily19876339.com
e-clstroffrzdaily19876339.com
easylistdpoffrs.com
l129876-daliypromo.com
thelst40090hspeedm.com
webhsm2282jende119283000send.com
webl129876-daliypromo.com
teledailypromotionslist1090009.com
findhsm-list-cluster-182-643.com
gotospeedoffrslist873009118273.com
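
A rough way to try the "n-digit number" idea against the list above, written
as standalone Python rather than as a SpamAssassin rule. The helper name
has_digit_run and the default threshold of 3 digits are assumptions made for
illustration; the right value of n would need testing against real ham.

```python
import re

# Hypothetical helper: True if the domain contains a run of at least
# `min_digits` consecutive digits. The default of 3 is an assumption.
def has_digit_run(domain: str, min_digits: int = 3) -> bool:
    return re.search(r"\d{%d,}" % min_digits, domain) is not None

# The sample addresses quoted above.
SAMPLES = [
    "easyclstroffrzdaily19876339.com",
    "e-clstroffrzdaily19876339.com",
    "easylistdpoffrs.com",
    "l129876-daliypromo.com",
    "thelst40090hspeedm.com",
    "webhsm2282jende119283000send.com",
    "webl129876-daliypromo.com",
    "teledailypromotionslist1090009.com",
    "findhsm-list-cluster-182-643.com",
    "gotospeedoffrslist873009118273.com",
]

flagged = [d for d in SAMPLES if has_digit_run(d)]
missed = [d for d in SAMPLES if not has_digit_run(d)]
print(len(flagged), "of", len(SAMPLES), "flagged; missed:", missed)
```

With a threshold of 3, only easylistdpoffrs.com slips through; raising it to 4
also misses findhsm-list-cluster-182-643.com, whose digit runs are only three
long.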

>   - the use of <x-html>, a non-std HTML tag.

I only found a couple of these in my corpus.

>   - this message-id format:
>
>   > Message-Id: <1pv4ec$[EMAIL PROTECTED]>

I don't know much about 'normal' message IDs, but here are a few samples:
  <1pv003$[EMAIL PROTECTED]>
  <1ppsu4$[EMAIL PROTECTED]>
  <1pv402$[EMAIL PROTECTED]>
  <1ppevn$[EMAIL PROTECTED]>
  <1pq1uj$[EMAIL PROTECTED]>
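
For what it's worth, the visible part of these samples shares one shape: "1p",
then four lowercase letters or digits, then "$". A minimal Python sketch of
that check follows. The pattern is inferred from these few samples only, the
part after "$" is redacted here and so is not examined, and I can't say
whether legitimate mailers produce the same shape.

```python
import re

# Assumption: "<1p" + four lowercase letters/digits + "$", as seen in the
# samples. Everything after "$" is redacted in the samples and is ignored.
MSGID_PREFIX = re.compile(r"^<1p[0-9a-z]{4}\$")

def looks_like_sample_msgid(value: str) -> bool:
    """True if the Message-Id value starts like the samples above."""
    return MSGID_PREFIX.match(value.strip()) is not None

SAMPLE_IDS = [
    "<1pv4ec$[EMAIL PROTECTED]>",
    "<1pv003$[EMAIL PROTECTED]>",
    "<1ppsu4$[EMAIL PROTECTED]>",
    "<1pv402$[EMAIL PROTECTED]>",
    "<1ppevn$[EMAIL PROTECTED]>",
    "<1pq1uj$[EMAIL PROTECTED]>",
]

print(all(looks_like_sample_msgid(s) for s in SAMPLE_IDS))
```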

> Anyone getting these care to make some rules?

I'll see what I can come up with. Here are a few more consistent features of
their spam, some of which are already covered by the rules I posted earlier:

1. URLs of this format:
http://pxe.x.com/logic/xx.pl?x=xxxxxxxxx
The 'pxe' and the '/logic/' are pretty consistent, not much else is.

2. The X-Mailer-Version header. Its value is always 'v' followed by a
number. Nobody else seems to use this in my tests.
X-Mailer-Version: v 202057756

3. The text "To discontinue the receipt of emails, visit the following link"
(they'll change this, of course.)

4. For possible use in meta rules, their messages always match
CTYPE_JUST_HTML and WEB_BUGS, usually SUPERLONG_LINE and JAVASCRIPT, and not
much else.
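
Items 1, 2 and 4 can be sketched as standalone Python checks, mainly to pin
down what each pattern would match before turning it into real rules. All
function names here are made up for illustration, the header dict is assumed
to use the exact key "X-Mailer-Version", and the way the checks are combined
in matches_profile is only one guess at a meta rule.

```python
import re

# Item 1: host starting with "pxe." and a path starting with "/logic/".
URL_RE = re.compile(r"http://pxe\.[^/\s]+/logic/", re.IGNORECASE)

# Item 2: header value is "v" followed by a number, e.g. "v 202057756".
XMAILER_VERSION_RE = re.compile(r"^v\s*\d+$")

# Item 4: rule names their messages are said to always match.
ALWAYS_HIT = {"CTYPE_JUST_HTML", "WEB_BUGS"}

def has_logic_url(body: str) -> bool:
    return URL_RE.search(body) is not None

def has_versioned_mailer(headers: dict) -> bool:
    value = headers.get("X-Mailer-Version", "")
    return XMAILER_VERSION_RE.match(value.strip()) is not None

def matches_profile(body: str, headers: dict, rules_hit) -> bool:
    # Assumed combination: both always-hit rules, plus the URL or the header.
    return ALWAYS_HIT <= set(rules_hit) and (
        has_logic_url(body) or has_versioned_mailer(headers)
    )

print(matches_profile(
    "click http://pxe.x.com/logic/xx.pl?x=xxxxxxxxx",
    {"X-Mailer-Version": "v 202057756"},
    ["CTYPE_JUST_HTML", "WEB_BUGS", "JAVASCRIPT"],
))
```

The phrase in item 3 is left out on purpose, since it is the easiest thing for
them to change.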

I ran one of these through 2.50CVS and it does a bit better with the new
HTML percentage test, among other things:

SPAM: ---- Start SpamAssassin results
SPAM: 8.40 hits, 5 required;
SPAM: *  1.8 -- BODY: Message is 90-100% HTML tags
SPAM: *  1.2 -- BODY: Javascript to open a new window
SPAM: *  1.0 -- BODY: HTML has unbalanced "html" tags
SPAM: *  0.8 -- BODY: Javascript to move windows around
SPAM: *  0.2 -- BODY: Image tag with an ID code to identify you
SPAM: *  0.2 -- BODY: JavaScript code
SPAM: *  0.0 -- BODY: T_HTML_P2_90_100
SPAM: *  0.0 -- BODY: T_HTML_IMAGE_AREA01
SPAM: *  0.0 -- BODY: T_HTML_MESSAGE
SPAM: *  0.0 -- BODY: T_HTML_P1_80_100
SPAM: *  0.0 -- BODY: T_HTML_TAG_EXISTS_CENTER
SPAM: *  0.0 -- BODY: T_HTML_SHOUTING1
SPAM: *  0.0 -- BODY: T_HTML_WIN_FOCUS
SPAM: *  0.0 -- BODY: T_HTML_CONSEC_IMGS04
SPAM: *  0.0 -- BODY: T_HTML_NUM_IMGS07
SPAM: *  2.7 -- Listed in DCC, see http://rhyolite.com/anti-spam/dcc/
SPAM: *  0.5 -- HTML-only mail, with no text version
SPAM:
SPAM: ---- End of SpamAssassin results
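
As a sanity check on the report above, the per-rule scores can be added up and
compared with the "hits" line. This is a small Python sketch that only
understands the layout of this one pasted report (it is not a general parser
for SpamAssassin output); the names SCORE_LINE, SUMMARY_LINE and summarize are
made up here.

```python
import re
from decimal import Decimal

# The report as pasted above, minus the start/end marker lines.
REPORT = """\
SPAM: 8.40 hits, 5 required;
SPAM: *  1.8 -- BODY: Message is 90-100% HTML tags
SPAM: *  1.2 -- BODY: Javascript to open a new window
SPAM: *  1.0 -- BODY: HTML has unbalanced "html" tags
SPAM: *  0.8 -- BODY: Javascript to move windows around
SPAM: *  0.2 -- BODY: Image tag with an ID code to identify you
SPAM: *  0.2 -- BODY: JavaScript code
SPAM: *  0.0 -- BODY: T_HTML_P2_90_100
SPAM: *  0.0 -- BODY: T_HTML_IMAGE_AREA01
SPAM: *  0.0 -- BODY: T_HTML_MESSAGE
SPAM: *  0.0 -- BODY: T_HTML_P1_80_100
SPAM: *  0.0 -- BODY: T_HTML_TAG_EXISTS_CENTER
SPAM: *  0.0 -- BODY: T_HTML_SHOUTING1
SPAM: *  0.0 -- BODY: T_HTML_WIN_FOCUS
SPAM: *  0.0 -- BODY: T_HTML_CONSEC_IMGS04
SPAM: *  0.0 -- BODY: T_HTML_NUM_IMGS07
SPAM: *  2.7 -- Listed in DCC, see http://rhyolite.com/anti-spam/dcc/
SPAM: *  0.5 -- HTML-only mail, with no text version
"""

SCORE_LINE = re.compile(r"^SPAM: \*\s+(-?\d+(?:\.\d+)?) -- ", re.MULTILINE)
SUMMARY_LINE = re.compile(
    r"^SPAM: (\d+(?:\.\d+)?) hits, (\d+(?:\.\d+)?) required", re.MULTILINE
)

def summarize(report: str):
    """Return (sum of per-rule scores, reported hits, required threshold)."""
    scores = [Decimal(s) for s in SCORE_LINE.findall(report)]
    summary = SUMMARY_LINE.search(report)
    return sum(scores, Decimal(0)), Decimal(summary.group(1)), Decimal(summary.group(2))

total, reported, required = summarize(REPORT)
print(total, reported, required, total > required)
```

The nine T_ test rules score 0.0 each, so the total comes entirely from the
other eight lines.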

--
Michael Moncur  mgm at starlingtech.com  http://www.starlingtech.com/
"Only the shallow know themselves." --Oscar Wilde



_______________________________________________
Spamassassin-talk mailing list
[EMAIL PROTECTED]
https://lists.sourceforge.net/lists/listinfo/spamassassin-talk
