> This looks like the bunch mgm was talking about. I have *no* spam from
> these guys, unfortunately.

I keep being surprised by the difference between different people's spam.
They account for nearly 10% of my spam...

> - catching the weird "letters-numbers-letters-numbers.com" format they
>   use for their domains

They're not too consistent, but a "domain name contains n-digit number"
rule would catch most of them. Here is a sampling of their addresses from a
couple of weeks' spam:

  easyclstroffrzdaily19876339.com
  e-clstroffrzdaily19876339.com
  easylistdpoffrs.com
  l129876-daliypromo.com
  thelst40090hspeedm.com
  webhsm2282jende119283000send.com
  webl129876-daliypromo.com
  teledailypromotionslist1090009.com
  findhsm-list-cluster-182-643.com
  gotospeedoffrslist873009118273.com

> - the use of <x-html>, a non-std HTML tag.

I only found a couple of these in my corpus.

> - this message-id format:
>
>   Message-Id: <1pv4ec$[EMAIL PROTECTED]>

I don't know much about 'normal' message IDs, but here are a few samples:

  <1pv003$[EMAIL PROTECTED]>
  <1ppsu4$[EMAIL PROTECTED]>
  <1pv402$[EMAIL PROTECTED]>
  <1ppevn$[EMAIL PROTECTED]>
  <1pq1uj$[EMAIL PROTECTED]>

> Anyone getting these care to make some rules?

I'll see what I can come up with. Here are a few more consistent things in
their spam, some of which the rules I previously posted already cover:

1. URLs of this format:

     http://pxe.x.com/logic/xx.pl?x=xxxxxxxxx

   The 'pxe' and the '/logic/' are pretty consistent; not much else is.

2. The X-Mailer-Version header. Its value is always 'v' followed by a
   number. Nobody else seems to use this header in my tests.

     X-Mailer-Version: v 202057756

3. The text "To discontinue the receipt of emails, visit the following
   link" (they'll change this, of course).

4. For possible use in meta rules: their messages always match
   CTYPE_JUST_HTML and WEB_BUGS, usually SUPERLONG_LINE and JAVASCRIPT,
   and not much else.
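
To make the domain, URL, and X-Mailer-Version observations a bit more concrete, here is a rough sketch in Python (not SpamAssassin rule syntax). The function names, the choice of n = 3 digits, and the exact regexes are my own assumptions for illustration; none of this has been tested against a real corpus.

```python
import re

# The sample sender domains listed above.
SAMPLE_DOMAINS = [
    "easyclstroffrzdaily19876339.com",
    "e-clstroffrzdaily19876339.com",
    "easylistdpoffrs.com",
    "l129876-daliypromo.com",
    "thelst40090hspeedm.com",
    "webhsm2282jende119283000send.com",
    "webl129876-daliypromo.com",
    "teledailypromotionslist1090009.com",
    "findhsm-list-cluster-182-643.com",
    "gotospeedoffrslist873009118273.com",
]

# Assumed pattern: host starts with "pxe." and the path starts with "/logic/".
URL_RE = re.compile(r"^http://pxe\.[^/\s]+/logic/", re.IGNORECASE)

# Assumed pattern: header value is 'v', optional whitespace, then digits.
MAILER_VERSION_RE = re.compile(r"^v\s*\d+$")


def has_digit_run(domain: str, n: int = 3) -> bool:
    """True if the domain contains at least n consecutive digits."""
    return re.search(r"\d{%d,}" % n, domain) is not None


def looks_like_their_url(url: str) -> bool:
    """True if the URL has the 'pxe' host prefix and '/logic/' path."""
    return URL_RE.match(url) is not None


def looks_like_their_mailer_version(value: str) -> bool:
    """True if an X-Mailer-Version value is 'v' followed by a number."""
    return MAILER_VERSION_RE.match(value.strip()) is not None


if __name__ == "__main__":
    caught = [d for d in SAMPLE_DOMAINS if has_digit_run(d)]
    print(f"{len(caught)} of {len(SAMPLE_DOMAINS)} sample domains flagged")
```

With n = 3, the digit check flags nine of the ten sample domains; easylistdpoffrs.com is the one it misses. A larger n would start missing domains like findhsm-list-cluster-182-643.com, and a smaller one would presumably hit more legitimate domains, so the threshold would need checking against non-spam mail.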

I ran one of these through 2.50CVS and it does a bit better with the new
HTML percentage test, among other things:

SPAM: ---- Start SpamAssassin results
SPAM: 8.40 hits, 5 required;
SPAM: * 1.8 -- BODY: Message is 90-100% HTML tags
SPAM: * 1.2 -- BODY: Javascript to open a new window
SPAM: * 1.0 -- BODY: HTML has unbalanced "html" tags
SPAM: * 0.8 -- BODY: Javascript to move windows around
SPAM: * 0.2 -- BODY: Image tag with an ID code to identify you
SPAM: * 0.2 -- BODY: JavaScript code
SPAM: * 0.0 -- BODY: T_HTML_P2_90_100
SPAM: * 0.0 -- BODY: T_HTML_IMAGE_AREA01
SPAM: * 0.0 -- BODY: T_HTML_MESSAGE
SPAM: * 0.0 -- BODY: T_HTML_P1_80_100
SPAM: * 0.0 -- BODY: T_HTML_TAG_EXISTS_CENTER
SPAM: * 0.0 -- BODY: T_HTML_SHOUTING1
SPAM: * 0.0 -- BODY: T_HTML_WIN_FOCUS
SPAM: * 0.0 -- BODY: T_HTML_CONSEC_IMGS04
SPAM: * 0.0 -- BODY: T_HTML_NUM_IMGS07
SPAM: * 2.7 -- Listed in DCC, see http://rhyolite.com/anti-spam/dcc/
SPAM: * 0.5 -- HTML-only mail, with no text version
SPAM:
SPAM: ---- End of SpamAssassin results

--
Michael Moncur  mgm at starlingtech.com  http://www.starlingtech.com/
"Only the shallow know themselves." --Oscar Wilde

_______________________________________________
Spamassassin-talk mailing list
[EMAIL PROTECTED]
https://lists.sourceforge.net/lists/listinfo/spamassassin-talk