On Fri, 2003-08-15 at 18:19, Justin Mason wrote:
> Yorkshire Dave writes:
> > My original intention was to write an eval to run through the range of
> > caesar ciphers and import a list of substitution cipher codes, but it's 
> > too slow (probably because I write very poor perl), so here's the next
> > best thing.
> > 
> > I've thrown together a little CGI which will take an email address as
> > input and return a series of 24 SA rules which detect 30 different
> > listwashing tokens.
> 
> > If anyone's interested, my part-complete document on listwashing tokens
> > is at  http://www.wot.no-ip.com/show.me/Projects/Listwashing_Tokens/ and
> > the rule generator itself is http://www.wot.no-ip.com/cgi-bin/detoken.pl
> 
> Excellent analysis!  Also we're pretty sure figuring out some way to
> catch these inside SpamAssassin, automatically (ie. without the prior
> rule-building) would be very nifty.
> 
I'm already working along those lines. I'm trying to do a good job of
automating a full set of 26 caesar cipher checks, and to write something
to detect a type of cipher I keep seeing from multiple spammers. It
changes often, but each variant is trivial to decode because the method
is the same. I'm calling it caesar with case/block substitution.
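
As a rough illustration of the "full set of caesar checks" idea, here is a
minimal sketch in Python (not the Perl from the CGI, which isn't shown in
this thread). The function names caesar_shift and all_rotations are made up
for the example. It produces the 25 non-trivial rotations of a token; the
26th "rotation" is the token itself.

```python
import string

def caesar_shift(text, shift):
    """Rotate ASCII letters by `shift` places, keeping case; other characters pass through."""
    out = []
    for ch in text:
        if ch in string.ascii_lowercase:
            out.append(chr((ord(ch) - ord("a") + shift) % 26 + ord("a")))
        elif ch in string.ascii_uppercase:
            out.append(chr((ord(ch) - ord("A") + shift) % 26 + ord("A")))
        else:
            out.append(ch)  # digits, '@', '.', '-' are left untouched
    return "".join(out)

def all_rotations(token):
    """Map each non-trivial shift (1..25) to the rotated token."""
    return {shift: caesar_shift(token, shift) for shift in range(1, 26)}

if __name__ == "__main__":
    for shift, rotated in all_rotations("dave").items():
        print(shift, rotated)
```

Each rotated string could then be turned into one rule, which is roughly
why checking every shift on every message gets expensive.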

I already have something working which detects all standard caesars and
a couple of monoalphabetic substitutions. It's on field test on my
busiest client's server right now, and it's driving the load through the
roof. Part of the reason for making that CGI was that I wanted to make a
ruleset for each user so I could turn it back off before their server
melts. The other part was that I can see it taking me a while to get it
right, and people could be benefiting from those rules right now.

> One thing though -- many SpamAssassin users won't have only 1 address
> behind the scanner, so doing it beforehand based on the addr will limit it
> a bit.

Yes, I know. The ability to do domain only is already half built. When
it's finished, hopefully tonight or tomorrow night, it will divide the
rules into 3 types: username, domainname and full. I think that covers
every requirement. It will also print any rules for very short matches
commented out. Most people with a 2 or 3 letter username won't want to
use them; those that do can uncomment them after review.
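
The three-way split with short matches commented out could look something
like the sketch below. It is Python, not the real detoken.pl, and every
name in it (quotemeta, address_parts, rule_line, rules_for, the W_PLAIN
prefix, the cut-off of 4 letters) is an assumption made for illustration.
Only the "rawbody NAME /pattern/i" shape comes from the rules quoted in
this thread; describe and score lines are left out.

```python
import re

MIN_LETTERS = 4  # assumed cut-off: tokens with 3 letters or fewer are printed commented out

def quotemeta(text):
    """Backslash-escape every non-word character, similar in spirit to Perl's quotemeta."""
    return re.sub(r"([^A-Za-z0-9_])", r"\\\1", text)

def address_parts(address):
    """Split an address into the three rule types: username, domain name, full address."""
    username, _, domain = address.partition("@")
    return {"USER": username, "DOMAIN": domain, "FULL": address}

def rule_line(name, token, min_letters=MIN_LETTERS):
    """Build one rawbody rule; prefix it with '# ' when the token is too short to be safe."""
    line = "rawbody %s /%s/i" % (name, quotemeta(token))
    letters = sum(ch.isalpha() for ch in token)
    return line if letters >= min_letters else "# " + line

def rules_for(address, prefix="W_PLAIN"):
    """Return one rule per type, in the order username, domain name, full."""
    return [rule_line("%s_%s" % (prefix, kind), token)
            for kind, token in address_parts(address).items()]

if __name__ == "__main__":
    print("\n".join(rules_for("jm@example.com")))
```

With a two-letter username the first rule comes out commented, so the
user has to opt in to it after reviewing it.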
>    
> 
> We (Dan and I) were thinking that picking up the envelope-to and/or To:
> addresses, and permuting those, would probably work pretty well to do
> that.
> 
> (However, scanning for the domain part of an address would probably work
> pretty well, and I notice you're picking that up.)

With the common ciphers on the page, domain only works fine, but running
through a full series of caesar on the domain name alone may not be
entirely safe. I don't see how to produce a solution that does not need
\b at both ends and allows any - or . to be .{1,3}, whilst not making
innocent midword matches on one of the other 25 permutations of many
perfectly normal words. That alone is something that needs study. I would
need to write something to repeatedly rot+1 the entire dictionary and
compare back for matches before I could even begin to guess a minimum
length.
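
The dictionary study could start from something like this sketch. It is
Python, the names rot and rotation_collisions are invented here, and it
only compares whole words; midword (substring) matches would need a
further pass on top of it.

```python
import string

LOWER = string.ascii_lowercase

def rot(word, shift):
    """Caesar-rotate a lowercase ASCII word by `shift` places."""
    table = str.maketrans(LOWER, LOWER[shift:] + LOWER[:shift])
    return word.translate(table)

def rotation_collisions(words, min_len=1):
    """List (word, shift, rotated) wherever a rotation of one word is another word in the list."""
    vocab = {w.lower() for w in words
             if w.isascii() and w.isalpha() and len(w) >= min_len}
    hits = []
    for word in sorted(vocab):
        for shift in range(1, 26):
            rotated = rot(word, shift)
            if rotated in vocab:
                hits.append((word, shift, rotated))
    return hits

if __name__ == "__main__":
    sample = ["cold", "frog", "green", "terra", "table"]
    for word, shift, rotated in rotation_collisions(sample):
        print("%s rot%d -> %s" % (word, shift, rotated))
```

Run against a full word list (for example a system file such as
/usr/share/dict/words, where one exists) and raising min_len until the
hits stop being a problem would give a first guess at a safe minimum
length.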
> 
> BTW quick bug report: entering my mail addr, unticking the "username" box,
> and hitting Build Rules results in a few rules like this:
> 
>       rawbody W_ROT_2_L               //i
> 
> note that the empty pattern will hit every msg ;)

The username, domainname and full checkboxes are supposed to be commented
out until I make rule_print do what it needs to do. I didn't see that
rule_print would need rewriting until after I'd started building the
function in, so I just commented the checkboxes out for now. It seems I
missed one, so it's making rules with '' in them: dutifully turning ''
into rot-whatever, substituting letters which don't exist and printing
the result. Oops, my bad. It did say (not implemented yet) beside the
checkbox though :)
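
One way to stop an empty token from ever reaching the output is a guard
like the sketch below. It is Python and purely illustrative: build_rule is
a made-up name, not the real rule_print, and it assumes the token has
already been made regex-safe.

```python
def build_rule(name, token):
    """Return a rawbody rule, or None when the token has nothing to match on."""
    if not any(ch.isalnum() for ch in token):
        return None  # an empty pattern (//i) would hit every message
    return "rawbody %s /%s/i" % (name, token)

if __name__ == "__main__":
    for token in ("", "fcxg"):
        rule = build_rule("W_ROT_2_L", token)
        print(rule if rule is not None else "(skipped empty token)")
```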

> 
> --j.
-- 
Yorkshire Dave

