Two partial emails are shown below. They are the same message, received from two different senders. One encodes its characters as decimal references and the other as hex. A few punctuation characters are encoded too (for example "%", which is &#37; in decimal and &#x25; in hex), but not many.
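To make the decimal/hex point concrete, here is a small sketch in Python (chosen only for illustration; it is not part of the ruleset). It uses the standard library's html.unescape to show that decimal and hex numeric references, with or without leading zeros, all decode to the same letter, which is why a rule has to allow for every spelling.

```python
import html

# Four spellings of the letter "A" as an HTML numeric character reference:
# decimal, decimal with leading zeros, hex, hex with leading zeros.
variants = ["&#65;", "&#0065;", "&#x41;", "&#x0041;"]

# html.unescape resolves each reference to the character it stands for.
decoded = [html.unescape(v) for v in variants]
print(decoded)
```

A browser or mail client renders all four the same way, so the recipient sees plain text while a naive pattern sees four different strings.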
I like the idea of cleaning things up with the new format /\&\#0*(?:65|97);/, but this does not account for the hex values, or for the "x" that marks them. The "x" comes before the zeros, which is tripping me up at the moment. I'll noodle on this part and post if I come up with anything.

Below are snippets of two different emails, from two senders. The same email encoded differently...

Decimal encoded
------------------------------------------
Its The Most Advanced Pnis Enlargment Solution!<br>
It's 100.% Guaranted To Enlarg Your Pnis</big><br>
<br>- No Pills Or Capsules<br>
- No Lotions Or Cremes<br>
- No Pumps, Weights, Or Exercises<br>
- No Prescription Necessary<br>
- Doctor Designed & Endorsed<br>

Hex encoded
------------------------------------------
Its The Most Advanced Pnis Enlargment Solution!<br>
It's 100.% Guaranted To Enlarg Your Pnis</big><br>
<br>- No Pills Or Capsules<br>
- No Lotions Or Cremes<br>
- No Pumps, Weights, Or Exercises<br>
- No Prescription Necessary<br>
- Doctor Designed & Endorsed<br>

-----Original Message-----
From: jennifer [mailto:[EMAIL PROTECTED]
Sent: Monday, November 03, 2003 10:18 AM
To: 'Scott Sprunger'; [EMAIL PROTECTED]
Subject: RE: [SAtalk] [RD] Weeds changes

Hi Scott,

Thanks for the heads up. You wouldn't happen to have a sample of one of those spams, would you? I'm curious about something.

I'm wondering if they were using decimal code for punctuation rather than hex code for letters?? "&#61;" (or an equivalent spelling of it) is actually "=", not "a". So maybe you were seeing punctuation mixed in? "&#33;" being "!". If that is the case, we just need to tag on all the punctuation.

However, I didn't know about the zeros, and you're right. Thanks!

Here is a cleaner way to write these (thanks to a very nice person for pointing that out, A.L.!):

/\&\#(?:65|97);/

so adding the zeros it would be

/\&\#0*(?:65|97);/

I'll make this change on the page, but I'll wait a bit to see if I'm 'out in left' with my thinking.

I'm realizing spammers are indirectly helping me out in my education, maybe I should say "Thank You!" to them as well. ...nah.

Jennifer

> -----Original Message-----
> From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of Scott Sprunger
> Sent: Monday, November 03, 2003 8:51 AM
> To: [EMAIL PROTECTED]
> Subject: [SAtalk] [RD] Weeds changes
>
> This past weekend a flood of new spam arrived which
> circumvented the weeds rules by using leading zeros and hex
> values (both legal from an HTML perspective). I've updated
> my local rules as below. Hope this is useful. BTW, Jennifer
> thanks for an incredible set of rules!
>
> -- Scott
>
> describe J_WEEDS_A Decimal or Hex character encoding [Aa]
> full J_WEEDS_A /(\&\#0*65\;|\&\#0*97\;|\&\#x0*41;|\&\#x0*61;)/i
> score J_WEEDS_A 0.5
>
> describe J_WEEDS_B Decimal or Hex character encoding [Bb]
> full J_WEEDS_B /(\&\#0*66\;|\&\#0*98\;|\&\#x0*42;|\&\#x0*62;)/i
> score J_WEEDS_B 0.5
>
> describe J_WEEDS_C Decimal or Hex character encoding [Cc]
> full J_WEEDS_C /(\&\#0*67\;|\&\#0*99\;|\&\#x0*43;|\&\#x0*63;)/i
> score J_WEEDS_C 0.5
>
> describe J_WEEDS_D Decimal or Hex character encoding [Dd]
> full J_WEEDS_D /(\&\#0*68\;|\&\#0*100\;|\&\#x0*44;|\&\#x0*64;)/i
> score J_WEEDS_D 0.5
>
> describe J_WEEDS_E Decimal or Hex character encoding [Ee]
> full J_WEEDS_E /(\&\#0*69\;|\&\#0*101\;|\&\#x0*45;|\&\#x0*65;)/i
> score J_WEEDS_E 0.5
>
> describe J_WEEDS_F Decimal or Hex character encoding [Ff]
> full J_WEEDS_F /(\&\#0*70\;|\&\#0*102\;|\&\#x0*46;|\&\#x0*66;)/i
> score J_WEEDS_F 0.5
>
> describe J_WEEDS_G Decimal or Hex character encoding [Gg]
> full J_WEEDS_G /(\&\#0*71\;|\&\#0*103\;|\&\#x0*47;|\&\#x0*67;)/i
> score J_WEEDS_G 0.5
>
> describe J_WEEDS_H Decimal or Hex character encoding [Hh]
> full J_WEEDS_H /(\&\#0*72\;|\&\#0*104\;|\&\#x0*48;|\&\#x0*68;)/i
> score J_WEEDS_H 0.5
>
> describe J_WEEDS_I Decimal or Hex character encoding [Ii]
> full J_WEEDS_I /(\&\#0*73\;|\&\#0*105\;|\&\#x0*49;|\&\#x0*69;)/i
> score J_WEEDS_I 0.5
>
> describe J_WEEDS_J Decimal or Hex character encoding [Jj]
> full J_WEEDS_J /(\&\#0*74\;|\&\#0*106\;|\&\#x0*4A;|\&\#x0*6A;)/i
> score J_WEEDS_J 0.5
>
> describe J_WEEDS_K Decimal or Hex character encoding [Kk]
> full J_WEEDS_K /(\&\#0*75\;|\&\#0*107\;|\&\#x0*4B;|\&\#x0*6B;)/i
> score J_WEEDS_K 0.5
>
> describe J_WEEDS_L Decimal or Hex character encoding [Ll]
> full J_WEEDS_L /(\&\#0*76\;|\&\#0*108\;|\&\#x0*4C;|\&\#x0*6C;)/i
> score J_WEEDS_L 0.5
>
> describe J_WEEDS_M Decimal or Hex character encoding [Mm]
> full J_WEEDS_M /(\&\#0*77\;|\&\#0*109\;|\&\#x0*4D;|\&\#x0*6D;)/i
> score J_WEEDS_M 0.5
>
> describe J_WEEDS_N Decimal or Hex character encoding [Nn]
> full J_WEEDS_N /(\&\#0*78\;|\&\#0*110\;|\&\#x0*4E;|\&\#x0*6E;)/i
> score J_WEEDS_N 0.5
>
> describe J_WEEDS_O Decimal or Hex character encoding [Oo]
> full J_WEEDS_O /(\&\#0*79\;|\&\#0*111\;|\&\#x0*4F;|\&\#x0*6F;)/i
> score J_WEEDS_O 0.5
>
> describe J_WEEDS_P Decimal or Hex character encoding [Pp]
> full J_WEEDS_P /(\&\#0*80\;|\&\#0*112\;|\&\#x0*50;|\&\#x0*70;)/i
> score J_WEEDS_P 0.5
>
> describe J_WEEDS_Q Decimal or Hex character encoding [Qq]
> full J_WEEDS_Q /(\&\#0*81\;|\&\#0*113\;|\&\#x0*51;|\&\#x0*71;)/i
> score J_WEEDS_Q 0.5
>
> describe J_WEEDS_R Decimal or Hex character encoding [Rr]
> full J_WEEDS_R /(\&\#0*82\;|\&\#0*114\;|\&\#x0*52;|\&\#x0*72;)/i
> score J_WEEDS_R 0.5
>
> describe J_WEEDS_S Decimal or Hex character encoding [Ss]
> full J_WEEDS_S /(\&\#0*83\;|\&\#0*115\;|\&\#x0*53;|\&\#x0*73;)/i
> score J_WEEDS_S 0.5
>
> describe J_WEEDS_T Decimal or Hex character encoding [Tt]
> full J_WEEDS_T /(\&\#0*84\;|\&\#0*116\;|\&\#x0*54;|\&\#x0*74;)/i
> score J_WEEDS_T 0.5
>
> describe J_WEEDS_U Decimal or Hex character encoding [Uu]
> full J_WEEDS_U /(\&\#0*85\;|\&\#0*117\;|\&\#x0*55;|\&\#x0*75;)/i
> score J_WEEDS_U 0.5
>
> describe J_WEEDS_V Decimal or Hex character encoding [Vv]
> full J_WEEDS_V /(\&\#0*86\;|\&\#0*118\;|\&\#x0*56;|\&\#x0*76;)/i
> score J_WEEDS_V 0.5
>
> describe J_WEEDS_W Decimal or Hex character encoding [Ww]
> full J_WEEDS_W /(\&\#0*87\;|\&\#0*119\;|\&\#x0*57;|\&\#x0*77;)/i
> score J_WEEDS_W 0.5
>
> describe J_WEEDS_X Decimal or Hex character encoding [Xx]
> full J_WEEDS_X /(\&\#0*88\;|\&\#0*120\;|\&\#x0*58;|\&\#x0*78;)/i
> score J_WEEDS_X 0.5
>
> describe J_WEEDS_Y Decimal or Hex character encoding [Yy]
> full J_WEEDS_Y /(\&\#0*89\;|\&\#0*121\;|\&\#x0*59;|\&\#x0*79;)/i
> score J_WEEDS_Y 0.5
>
> describe J_WEEDS_Z Decimal or Hex character encoding [Zz]
> full J_WEEDS_Z /(\&\#0*90\;|\&\#0*122\;|\&\#x0*5A;|\&\#x0*7A;)/i
> score J_WEEDS_Z 0.5
>
> -------------------------------------------------------
> This SF.net email is sponsored by: SF.net Giveback Program.
> Does SourceForge.net help you be more productive? Does it
> help you create better code? SHARE THE LOVE, and help us help
> YOU! Click Here: http://sourceforge.net/donate/
> _______________________________________________
> Spamassassin-talk mailing list [EMAIL PROTECTED]
> https://lists.sourceforge.net/lists/listinfo/spamassassin-talk
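
On the open question of where the "x" goes: in a hex reference the "x" sits between the "#" and any leading zeros, so one possible way to fold both forms into the compact (?:...) style is to keep a separate 0* inside each branch. The following is only a sketch in Python, not a tested SpamAssassin rule. The helper name weeds_pattern is hypothetical, and a real rule would presumably still need the "#" escaped as \# the way the quoted rules do.

```python
import re

def weeds_pattern(letter: str) -> str:
    """Return regex source matching numeric references for a letter.

    Hypothetical helper for illustration; not part of the posted ruleset.
    Covers upper and lower case, decimal and hex, with leading zeros.
    """
    upper, lower = ord(letter.upper()), ord(letter.lower())
    # Decimal branch: optional zeros, then either code point.
    decimal = f"0*(?:{upper}|{lower})"
    # Hex branch: the "x" comes first, then optional zeros, e.g. &#x0041;
    hexa = f"x0*(?:{upper:X}|{lower:X})"
    return rf"&#(?:{decimal}|{hexa});"

pattern_a = weeds_pattern("a")
print(pattern_a)  # &#(?:0*(?:65|97)|x0*(?:41|61));

# IGNORECASE plays the role of the /i flag in the quoted rules,
# so "X" and lower-case hex digits are accepted as well.
rx = re.compile(pattern_a, re.IGNORECASE)
samples = ["&#65;", "&#097;", "&#x41;", "&#X0061;", "&#61;", "&#x65;"]
print([bool(rx.search(s)) for s in samples])
```

The last two samples are there on purpose: decimal 61 is "=" and hex 65 is "e", so neither should count as an encoded "a". Keeping the decimal and hex codes in separate branches is what prevents that mix-up.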