On Fri, 2003-08-22 at 22:36, Chris Santerre wrote:
Ok, I grabbed the script I wrote. Quick and dirty but seems to save a LOT of
time. The idea is simple.

Basically, grep your spamtrap for all lines that have 'http://' in them. You
lose a small percentage to line breaks, but the URLs repeat so often it
doesn't matter.

Next you strip away all the garbage until you are left with either an
IP or an FQDN. This little script does about 80% of the work; you still
need to clean by hand, but that takes only minutes afterward.

Evilrules script: (Yes, I named the file evilrules :P)

usage: evilrules spamtrapfile
--copy here--
#!/bin/sh
# usage: evilrules spamtrapfile
# Grab every line with a URL in it, strip the surrounding garbage,
# and leave one IP or FQDN per line in 'evil' for hand cleaning.
grep 'http://' "$1" |
  sed -e 's/^.*http:\/\///' \
      -e 's/\/.*$//' \
      -e 's/^.*@//' \
      -e 's/=.*$//' \
      -e 's/".*$//' \
      -e 's/^.*#$//' |
  sort -u > evil
echo "Please edit the 'evil' file by hand,"
echo "then run 'reg2rule.pl -b evil > somefile.cf'"
--paste here--

That will clean a lot. However, for some reason I couldn't get it to strip
the '&', '%', and '#' correctly; the regex for those seems a little
different. These usually mark obfuscated URLs, so it could be nice to leave
them in, BUT I take them out because Dave's code in reg2rule.pl will not
properly escape them.
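For what it's worth, here is a sketch of one way those characters could be stripped, assuming GNU or POSIX sed: inside a bracket expression, '&', '%', and '#' are plain literals and need no backslash escaping, which may be why escaping them individually misbehaved. The sample hostnames are just made up for illustration.

```shell
# Strip everything from the first '&', '%', or '#' onward.
# Inside [...] these characters are literal; no backslashes needed.
printf '%s\n' 'walmart.com&id=123' 'walmart.com%41' 'walmart.com#frag' |
  sed -e 's/[&%#].*$//'
# prints 'walmart.com' three times
```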

It has to be gone over by hand anyway. There may be FPs in there, and you
will also see a few incomplete repeats:
walmart.co
walmart.c
etc.
That's just a limitation of uniq: it only removes exact duplicates, so the
truncated FQDNs survive because they differ in length.
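If weeding out those truncated repeats by hand gets tedious, here is a sketch of one way to drop them automatically. The idea is that after sorting, a truncated name sorts immediately before its full form, so any line that is a prefix of the line after it can be discarded. Be warned this is my own guess at a fix, and it will also eat a legitimate domain that happens to be a prefix of the next line, so still eyeball the result.

```shell
# After sorting, drop any line that is a prefix of the line right after it
# (catches truncated repeats like 'walmart.c' before 'walmart.co').
printf '%s\n' 'walmart.co' 'walmart.c' 'walmart.com' 'example.net' |
  sort -u |
  awk 'NR > 1 && index($0, prev) != 1 { print prev }
       { prev = $0 }
       END { if (prev != "") print prev }'
# prints 'example.net' then 'walmart.com'
```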

Then that's it. Like the echo says, just run 'reg2rule.pl -b evil >
somefile.cf' after cleaning the evil file and BLAMO! Five minutes of work
and you've generated tons of evil domain rules.

You can use other options as well. I use 'reg2rule.pl -b -dEvil_DATE
-s1.5 evil > EvilDATE.cf', where DATE is the date I ran it, so I know
when I last did it.
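As a sketch, that dated run could be wrapped up like this, assuming reg2rule.pl behaves as described above; the YYYYMMDD format is just my choice, not anything the script requires.

```shell
#!/bin/sh
# Stamp the ruleset name with today's date so the filename records
# when the rules were last regenerated.
DATE=$(date +%Y%m%d)    # e.g. 20030822
reg2rule.pl -b -d"Evil_$DATE" -s1.5 evil > "Evil$DATE.cf"
```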

Again, the script is at http://www.wot.no-ip.com/Projects/Blocklist/reg2rule.pl
It uses STDIN and STDOUT; run 'reg2rule.pl -h' for usage.

(Note: the version I have says the default score is 1.0, but it actually
defaults to 0.5; I may have a beta version. Simple to change in the code,
though.)

I can't thank Yorkshire Dave enough for writing this script. It saves a TON
of time and hits like a rabid pitbull. No FPs unless you were asleep when
you went over the evil file. This is a real winner in my book!

Chris Santerre 
System Admin and SA Custom Rules Emporium keeper 
http://www.merchantsoverseas.com/wwwroot/gorilla/sa_rules.htm
"A little nonsense now and then, is relished by the wisest men." - Willy
Wonka 

I am using SpamAssassin on the gateway mail server with MailScanner. All mail found to be spam is marked as spam, and the software really works fine. (Using Red Hat 7.2 + sendmail + MailScanner + spamd.)
   My question is: where do I get this spamtrap? Forgive me if you find my question too naive.


  Since I control the server where the mail is initially received, I can just block these spammers at the gate. I will get a list of all the domains in the mail From addresses, create an obviously-spam domain list, and then block those domains if I am getting too many mails from those servers. That way I avoid all the trouble of receiving the mail on my local server and then scanning it.

Thanks
Ram



