Noel Jones wrote:
> At 03:17 AM 5/8/2007, Alexander Grüner wrote:
>> Ralf,
>>
>> I wrote a small script myself - very simple. It has worked for
>> months now.
>>
>> #!/bin/sh
>> cd /tmp
>> # Unofficial Phishing rules for ClamAV
>> wget -nd -m http://ftp.tiscali.nl/sanesecurity/phish.ndb.gz
>> wget -nd -m http://ftp.tiscali.nl/sanesecurity/scam.ndb.gz
>> cp phish.ndb.gz /var/lib/clamav/
>> cp scam.ndb.gz /var/lib/clamav/
>> cd /var/lib/clamav
>> gunzip -f phish.ndb.gz
>> gunzip -f scam.ndb.gz
>> chown vscan:vscan phish.ndb
>> chown vscan:vscan scam.ndb
>> rcclamd restart
>>
>> Run by root via crontab.
> 
> The above is not particularly safe.
> For add-on signatures, it's important to test the signatures with 
> "clamscan -d phish.ndb /some/small/file" *before* copying them into 
> the "live" clam signature directory.
> Freshclam does this for you for the official signatures.  If the 
> add-on file is corrupted or just the wrong format for some reason, 
> copying it into the clam signature directory will cause clam to refuse to run.
> 
> A safe (but not very clever) script would look like:
> #!/bin/sh
> cd /tmp
> wget -nd -m http://ftp.tiscali.nl/sanesecurity/phish.ndb.gz && \

A kinder, gentler usage of wget would include -N so you don't fetch the 
same file over and over, needlessly burning Steve's bandwidth. Just be 
sure to preserve the name and time stamp of the .gz files when you 
ungzip them. Use:
gunzip -c phish.ndb.gz > phish.ndb
or
gunzip < phish.ndb.gz > phish.ndb

This leaves the previously downloaded file in place so wget can 
reference it when polling Steve's server for a newer version.
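A quick local check (in a throwaway directory, with made-up file
contents) shows that the redirection form leaves the .gz behind:

```shell
#!/bin/sh
# Sketch: decompressing via redirection keeps the original .gz (and its
# timestamp) on disk, so a later wget -N can compare against it.
# All file names and contents here are throwaway examples.
WORK=$(mktemp -d)
cd "$WORK"
printf 'fake signature data\n' | gzip > phish.ndb.gz
gunzip < phish.ndb.gz > phish.ndb
ls phish.ndb phish.ndb.gz    # both files remain
```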

Using a list of URLs with --input-file=FILE would grab both files in a 
single connection and also help cut down on resource usage.

As mentioned elsewhere, use a randomizer when running these scripts so 
you don't dogpile on the server at 0, 10, 20, etc. minutes past the 
hour. Using bash for the shell allows:

if [ -z "$1" ]; then
   sleep $(( RANDOM % 600 ))
fi

This causes the process to sleep for a random number of seconds, up to 
600, before continuing, which also helps prevent excessive peak loads 
on Steve's server. The test allows bypassing the delay if needed, as in:

sanesecurity.sh now

This causes an immediate download.
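In crontab form that might look like this (the path and the 20-minute
schedule are assumptions, not something from the thread):

```
# root's crontab: run every 20 minutes; the script's internal random
# sleep (0-599 seconds) then staggers the actual fetch time.
*/20 * * * * /usr/local/sbin/sanesecurity.sh
```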

When dropping the files into the final working directory you want the 
operation to be as atomic as possible, so use mv (if everything is on 
one partition), cp followed by mv (cp phish.ndb /path/phish.tmp; mv 
/path/phish.tmp /path/phish.ndb), or rsync phish.ndb /path. This keeps 
clamd from ever seeing a file while it is being copied into the working 
directory; clamd is very picky about that.
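The cp-then-mv dance can be sketched locally like this (both
directories are temporary stand-ins for the download area and
/var/lib/clamav):

```shell
#!/bin/sh
# Sketch of the atomic drop: copy to a temp name on the target
# filesystem, then rename.  rename() is atomic on a single filesystem,
# so clamd sees either the old complete file or the new complete file,
# never a partial one.
STAGE=$(mktemp -d)      # stand-in for the download directory
LIVE=$(mktemp -d)       # stand-in for /var/lib/clamav
printf 'sigs\n' > "$STAGE/phish.ndb"
cp "$STAGE/phish.ndb" "$LIVE/phish.tmp"
mv "$LIVE/phish.tmp" "$LIVE/phish.ndb"
ls "$LIVE"
```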

Each mirror has its own unique URLs for the files, so you can create 
three list files:

sane.list:
http://www.sanesecurity.co.uk/clamav/phish.ndb.gz
http://www.sanesecurity.co.uk/clamav/scam.ndb.gz

dotsrc.list:
http://mirrors.dotsrc.org/clamav-sanesigs/phish.ndb.gz
http://mirrors.dotsrc.org/clamav-sanesigs/scam.ndb.gz

tiscali.list:
http://ftp.tiscali.nl/sanesecurity/phish.ndb.gz
http://ftp.tiscali.nl/sanesecurity/scam.ndb.gz

#Sane Security Files
SaneFileList="/usr/local/share/clamav/tmp/sane.list"
DotsrcFileList="/usr/local/share/clamav/tmp/dotsrc.list"
TiscaliFileList="/usr/local/share/clamav/tmp/tiscali.list"

And since we don't want to slam only Steve's server we'd like some kind 
of load sharing, so we randomize which list we use with another 
randomizer.

Put it all together and you have:

#!/bin/bash

if [ -z "$1" ]; then
   sleep $(( RANDOM % 600 ))
fi

# Sane Security URL lists (the files shown above)
SaneFileList="/usr/local/share/clamav/tmp/sane.list"
DotsrcFileList="/usr/local/share/clamav/tmp/dotsrc.list"
TiscaliFileList="/usr/local/share/clamav/tmp/tiscali.list"

# randomization for selecting a server
RoundRobin=$(( RANDOM % 3 + 1 ))

case "$RoundRobin" in
1)
         WgetList=$SaneFileList
         ;;
2)
         WgetList=$DotsrcFileList
         ;;
3)
         WgetList=$TiscaliFileList
         ;;
esac

cd /var/tmp  # /tmp contents don't survive a reboot
wget -q -N --input-file="$WgetList" && \
gunzip < phish.ndb.gz > phish.ndb && \
clamscan -d phish.ndb phish.ndb && \
rsync phish.ndb /var/lib/clamav/ && \
chown vscan:vscan /var/lib/clamav/phish.ndb

gunzip < scam.ndb.gz > scam.ndb && \
clamscan -d scam.ndb scam.ndb && \
rsync scam.ndb /var/lib/clamav/ && \
chown vscan:vscan /var/lib/clamav/scam.ndb
rcclamd restart # hopefully this is a reload, not a restart

Now you have a cron-driven script that does not dogpile at 
frequently-used cron intervals, does not download files unless the 
source is newer, fetches from a randomly chosen mirror, uses a list of 
URLs so only a single wget session is required, and tests the integrity 
of the files before doing an atomic copy to the working directory.

Be advised that some, possibly all, versions of wget will return a 
success code even when no new file was actually downloaded, so 
everything that follows the wget will run even when nothing new has 
arrived. I've created an ugly hack to keep this from happening, but in 
truth it just burns a little CPU time to accept the process as is.
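One way around it, sketched here with a simulated download, is to
checksum the .gz before and after the wget and only run the
gunzip/clamscan/rsync chain when the sum changes (cksum is POSIX; the
file names match the script above, but the wget is faked with an echo):

```shell
#!/bin/sh
# Sketch: skip the processing chain when the fetched file is unchanged.
# The download is simulated here by overwriting the file.
WORK=$(mktemp -d)
cd "$WORK"
printf 'old\n' > phish.ndb.gz
before=$(cksum < phish.ndb.gz)
printf 'new\n' > phish.ndb.gz   # stands in for: wget -q -N ...
after=$(cksum < phish.ndb.gz)
if [ "$before" != "$after" ]; then
    result="changed"            # gunzip/clamscan/rsync would go here
else
    result="unchanged"
fi
echo "$result"
```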

There are other ways to do this, of course.

dp

_______________________________________________
Help us build a comprehensive ClamAV guide: visit http://wiki.clamav.net
http://lurker.clamav.net/list/clamav-users.html