On Fri, 2003-08-15 at 18:19, Justin Mason wrote:
> Yorkshire Dave writes:
> > My original intention was to write an eval to run through the range of
> > caesar ciphers and import a list of substitution cipher codes, but it's
> > too slow (probably because I write very poor perl), so here's the next
> > best thing.
> >
> > I've thrown together a little CGI which will take an email address as
> > input and return a series of 24 SA rules which detect 30 different
> > listwashing tokens.
> >
> > If anyone's interested, my part-complete document on listwashing tokens
> > is at http://www.wot.no-ip.com/show.me/Projects/Listwashing_Tokens/ and
> > the rule generator itself is http://www.wot.no-ip.com/cgi-bin/detoken.pl
>
> Excellent analysis!  Also we're pretty sure figuring out some way to
> catch these inside SpamAssassin, automatically (ie. without the prior
> rule-building) would be very nifty.

I'm already working along those lines. I'm trying to do a good job of
automating a full set of 26 caesar cipher checks, and I'm also trying to
write something to detect a type of cipher I keep seeing from multiple
spammers. It changes often, but each variant is trivial to decode because
the method is always the same. I'm calling it caesar with case/block
substitution.
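For illustration only, a full caesar sweep over one address token might look like the Python sketch below. This is not the code behind detoken.pl or the field-test version mentioned in this thread; the names caesar and rotations are made up for the example, and it only covers plain caesar shifts, not the case/block variant.

```python
import string

def caesar(text, shift):
    """Rotate ASCII letters in text by shift places, leaving other characters alone."""
    shift %= 26
    lower = string.ascii_lowercase
    upper = string.ascii_uppercase
    table = str.maketrans(
        lower + upper,
        lower[shift:] + lower[:shift] + upper[shift:] + upper[:shift],
    )
    return text.translate(table)

def rotations(token):
    """Map each non-trivial shift (1..25) to the rotated token.

    Assumption for this sketch: a token with no letters returns nothing,
    because there would be nothing meaningful to match on.
    """
    if not any(c.isalpha() for c in token):
        return {}
    return {shift: caesar(token, shift) for shift in range(1, 26)}

# Hypothetical usage: list every rotation of a username.
for shift, rotated in rotations("dave").items():
    print(shift, rotated)
```

Shift 0 (or 26) is the address itself, so the sketch only walks the 25 shifts that actually change the text.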

I already have something working which detects all standard caesars and
a couple of monoalphabetic substitutions. It's on field test on my
busiest client's server right now, and it's driving the load thru the
roof. Part of the reason for making that CGI was that I wanted to make a
ruleset for each user so I could turn it back off before their server
melts. The other part was that I can see it taking me a while to get it
right, and people could be benefiting from those rules right now.

> One thing though -- many SpamAssassin users won't have only 1 address
> behind the scanner, so doing it beforehand based on the addr will limit it
> a bit.

Yes, I know; the ability to do domain only is already half built. When
it's finished, hopefully tonight or tomorrow night, it will divide the
rules into 3 types: username, domainname and full. I think that covers
every requirement. It will also print any rules for very short matches
commented out. Most people with a 2 or 3 letter username won't want to
use them; those that do can uncomment them after they review.

> We (Dan and I) were thinking that picking up the envelope-to and/or To:
> addresses, and permuting those, would probably work pretty well to do
> that.
>
> (However, scanning for the domain part of an address would probably work
> pretty well, and I notice you're picking that up.)

With the common ciphers on the page, domain only works fine, but running
thru a full series of caesar on the domain name alone may not be entirely
safe. I don't see how to produce a solution that does not need \b at both
ends and still allows any - or . to be .{1,3}, without making innocent
midword matches on one of the other 25 permutations of many perfectly
normal words. That alone is something that needs study. I would need to
write something to repeatedly rot+1 the entire dictionary and compare
back for matches before I could even begin to guess a minimum length.
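The dictionary check described above could be sketched roughly like this in Python. It is an illustration under assumptions, not existing code: the function names are hypothetical, and the five-word sample list stands in for a real dictionary file, which would have to be loaded instead.

```python
import string

def caesar(text, shift):
    """Rotate ASCII letters in text by shift places, leaving other characters alone."""
    shift %= 26
    lower = string.ascii_lowercase
    upper = string.ascii_uppercase
    table = str.maketrans(
        lower + upper,
        lower[shift:] + lower[:shift] + upper[shift:] + upper[:shift],
    )
    return text.translate(table)

def collisions(words):
    """Return (word, shift, other) for every word whose rotation is also in the list."""
    wordset = {w.lower() for w in words}
    hits = []
    for word in sorted(wordset):
        for shift in range(1, 26):
            rotated = caesar(word, shift)
            if rotated in wordset:
                hits.append((word, shift, rotated))
    return hits

# Stand-in word list (assumption); a real run would use a full dictionary.
sample = ["cheer", "jolly", "green", "terra", "spam"]
for word, shift, other in collisions(sample):
    print(f"{word} rot{shift} -> {other}")
```

Grouping the hits by word length would then show how long a token has to be before accidental whole-word collisions stop appearing. Note that this only finds whole-word collisions; midword matches would need a substring search on top.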

> BTW quick bug report: entering my mail addr, unticking the "username" box,
> and hitting Build Rules results in a few rules like this:
>
>   rawbody W_ROT_2_L //i
>
> note that the empty pattern will hit every msg ;)

The username, domainname and full checkboxes are supposed to be
commented out until I make rule_print do what it needs to do. I didn't
see that I would have to rewrite rule_print until after I'd started
building the function in, so I just commented the checkboxes out for
now. It seems I missed one, so it's making rules with '' in them:
dutifully turning '' into rot-whatever, substituting letters which don't
exist, and printing the result. Oops, my bad. It did say (not implemented
yet) beside the checkbox though :)

> --j.

--
Yorkshire Dave