As the debian archive grows larger, it gets increasingly laborious and time
consuming to keep my local debian archive up-to-date.  Here are my latest
scripts for automating the process (including some remaining manual steps).

I'm sure there are better ways to do it, which is one of my reasons for
posting them here.  In particular, I am interested in exploring the use of
debmirror to replace rsync in these scripts, although I'm not familiar enough
with it yet to know how well that might work.

(Note: these scripts work for x86 debian archives, and will need to be modified
accordingly for other architectures.  In addition, there are probably more
elegant ways to do these tasks, by consolidating them instead of using several
small scripts.  Any proposals are welcome.)

For planning purposes, here is the disk space used by my debian archives as of
May 12, 2005 (du reports sizes in 1K blocks by default):

indio:/mnt/install# du -sc debian debian-security debian-non-US debian-marillat
37373196        debian
1907560         debian-security
267864          debian-non-US
1277720         debian-marillat
40826340        total

I have a script called debian-all which rsyncs all the debian archives into a
holding archive at /mnt/install/debian[-*] (*=blank (main), security, non-US, or
marillat)

contents of /mnt/install/test/debian-all:
#!/bin/sh
LOOP=1
while [ "$LOOP" = 1 ]
do
   if rsync -vaHD --numeric-ids --delete --delete-excluded --exclude '*ia64*' \
      --exclude '*_arm*' --exclude '*_alpha*' --exclude '*-arm*' --exclude '*-alpha*' \
      --exclude '*powerpc*' --exclude '*mipsel*' --exclude '*hppa*' \
      --exclude '*m68k*' --exclude '*mips*' --exclude '*sparc*' --exclude '*s390*' \
      --exclude '*hurd*' --exclude '*UploadQueue*' \
      rsync://ftp.debian.org/debian/ /mnt/install/debian/
   then LOOP=0
   else
    echo rsync error: trying debian main again
    sleep 10
   fi
done

echo
echo

#do debian main again to fill in any hardlink targets missed
#the first time 'round

LOOP=1
while [ "$LOOP" = 1 ]
do
   if rsync -vaHD --numeric-ids --delete --delete-excluded --exclude '*ia64*' \
      --exclude '*_arm*' --exclude '*_alpha*' --exclude '*-arm*' --exclude '*-alpha*' \
      --exclude '*powerpc*' --exclude '*mipsel*' --exclude '*hppa*' \
      --exclude '*m68k*' --exclude '*mips*' --exclude '*sparc*' --exclude '*s390*' \
      --exclude '*hurd*' --exclude '*UploadQueue*' \
      rsync://ftp.debian.org/debian/ /mnt/install/debian/
   then LOOP=0
   else
    echo rsync error: trying debian main again
    sleep 10
   fi
done

echo
echo

LOOP=1
while [ "$LOOP" = 1 ]
do

   if rsync -vaHD --numeric-ids --delete --delete-excluded --exclude '*ia64*' \
      --exclude '*_arm*' --exclude '*_alpha*' --exclude '*-arm*' --exclude '*-alpha*' \
      --exclude '*powerpc*' --exclude '*mipsel*' --exclude '*hppa*' \
      --exclude '*m68k*' --exclude '*mips*' --exclude '*sparc*' --exclude '*s390*' \
      --exclude '*hurd*' --exclude '*UploadQueue*' --exclude 'oldstable' \
      --exclude 'potato' --exclude 'slink' \
      rsync://non-us.debian.org/debian-non-US/ /mnt/install/debian-non-US/
   then LOOP=0
   else
    echo rsync error: trying debian-non-US again
    sleep 10
   fi
done

echo
echo

LOOP=1
while [ "$LOOP" = 1 ]
do

   if rsync -vaHD --numeric-ids --delete --delete-excluded --exclude '*ia64*' \
      --exclude '*_arm*' --exclude '*_alpha*' --exclude '*-arm*' --exclude '*-alpha*' \
      --exclude '*powerpc*' --exclude '*mipsel*' --exclude '*hppa*' \
      --exclude '*m68k*' --exclude '*mips*' --exclude '*sparc*' --exclude '*s390*' \
      --exclude '*hurd*' --exclude '*UploadQueue*' --exclude 'oldstable' \
      --exclude 'potato' --exclude 'slink' \
      rsync://security.debian.org/debian-security/ /mnt/install/debian-security/
   then LOOP=0
   else
    echo rsync error: trying debian-security again
    sleep 10
   fi
done

echo
echo

#rsync debian-security again, this time with checksums (-c option), because
#there is no indices file there containing md5sums to check file integrity
#with after the fact.
#Note: update with checksums only after first updating without them, because
#this server has a strong tendency to give time out errors during large
#transfers

LOOP=1
while [ "$LOOP" = 1 ]
do

   if rsync -vcaHD --numeric-ids --delete --delete-excluded --exclude '*ia64*' \
      --exclude '*_arm*' --exclude '*_alpha*' --exclude '*-arm*' --exclude '*-alpha*' \
      --exclude '*powerpc*' --exclude '*mipsel*' --exclude '*hppa*' \
      --exclude '*m68k*' --exclude '*mips*' --exclude '*sparc*' --exclude '*s390*' \
      --exclude '*hurd*' --exclude '*UploadQueue*' --exclude 'oldstable' \
      --exclude 'potato' --exclude 'slink' \
      rsync://security.debian.org/debian-security/ /mnt/install/debian-security/
   then LOOP=0
   else
    echo "rsync error: trying debian-security (w/csums) again"
    sleep 10
   fi
done

echo
echo

wget -nv -r ftp://ftp.nerim.net/debian-marillat/ -nH -N -P /mnt/install

#end of debian-all

(Note: At first I thought that the "if rsync ..." statements should be
"if ! rsync ..." instead, but that logic doesn't seem to work, for reasons that
are unclear to me.)
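
One possible consolidation of the retry loops in debian-all, offered as a
sketch rather than a tested replacement: a small shell function that reruns a
command until it exits with status 0.  A shell "if" (and "until") treats exit
status 0 as success, and rsync exits with 0 when it completes without error,
which would explain why the un-negated "if rsync ..." form ends the loop at the
right time.  The names retry_until_ok and RETRY_DELAY are hypothetical and do
not appear in the scripts above.

```shell
#!/bin/sh
# Sketch only: retry_until_ok and RETRY_DELAY are made-up names.
# Seconds to wait between attempts (the loops above use 10).
RETRY_DELAY=${RETRY_DELAY:-10}

# usage: retry_until_ok LABEL COMMAND [ARGS...]
# Reruns COMMAND until it exits with status 0, printing a message and
# sleeping between attempts.
retry_until_ok() {
    label=$1
    shift
    until "$@"
    do
        echo "rsync error: trying $label again"
        sleep "$RETRY_DELAY"
    done
}
```

With something like this, each loop in debian-all might shrink to a single
call of the form: retry_until_ok "debian main" rsync <options> <source> <dest>.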

-----------------------------------------------------------------------------------------

I set up my holding archive server to boot up by RTC alarm at 6:10am.  This
allows enough time to fsck any disks prior to the cron.daily wake-up time at
6:25am.  In the directory /etc/cron.daily I placed a script named update-debian.

contents of /etc/cron.daily/update-debian:
#!/bin/sh
LOGFILE=/var/tmp/update-debian.log
/mnt/install/test/debian-all >$LOGFILE 2>&1
echo >>$LOGFILE
echo update-debian cron script >>$LOGFILE
echo finished with archive update at `date` >>$LOGFILE
mail -s "update-debian.log for `date`" [EMAIL PROTECTED] </var/tmp/update-debian.log
while true
do
        #if updatedb is still running, wait for it to finish before shutting down
        pidof updatedb && echo waiting for updatedb to finish at `date` >>$LOGFILE || shutdown -h now
        echo >>$LOGFILE
        sleep 300
done

--------------------------------------------------------------------------------------

The remaining steps could also be automated, but for now I prefer to do them
manually.

In order to simplify checking with debsums, I keep a directory named
/mnt/install/deblinks, containing hard links to all the .deb files in the local
archive.  To update this directory, I first do
"rm -r /mnt/install/deblinks.old;mv /mnt/install/deblinks /mnt/install/deblinks.old;mkdir /mnt/install/deblinks".

Now I am ready to put new .deb hardlinks in the /mnt/install/deblinks
directory, using the following scripts:

contents of /mnt/install/test/make-deblinks:
#!/bin/sh
find /mnt/install/debian* -regex '.*\.deb$' | /mnt/install/test/deblink-loop

contents of /mnt/install/test/deblink-loop:
#!/bin/sh
cd /mnt/install/deblinks
while read filepath
do
        #strip the directory part, leaving only the .deb file name
        file=`echo "$filepath" | sed 's/.*\///'`
        ln "$filepath" "$file"
done
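
As one way the two link scripts might be consolidated, here is a sketch of a
single function that takes the archive root and the link directory as
arguments.  The name make_deblinks and both parameters are hypothetical.
Unlike the scripts above, this variant silently skips a .deb whose name is
already linked instead of letting ln report an error.

```shell
#!/bin/sh
# Sketch only: make_deblinks and its two arguments are made-up names.
# usage: make_deblinks ARCHIVE_ROOT LINK_DIR
# Hard-links every .deb found under ARCHIVE_ROOT into LINK_DIR.
# Both directories must be on the same filesystem for hard links to work.
make_deblinks() {
    archive_root=$1
    link_dir=$2
    mkdir -p "$link_dir"
    find "$archive_root" -type f -name '*.deb' | while IFS= read -r filepath
    do
        target="$link_dir/$(basename "$filepath")"
        # Keep the first copy seen when the same file name occurs twice.
        [ -e "$target" ] || ln "$filepath" "$target"
    done
}
```

It could be called once per archive, for example with /mnt/install/debian as
the first argument and /mnt/install/deblinks as the second.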
------------------------------------------------------------------------------------

Next I check for duplicate .deb files with differing md5 checksums using the
following scripts:

contents of /mnt/install/test/diff-dupes:
#!/bin/sh
/mnt/install/test/deb-dupes | /mnt/install/test/check-dupes

contents of /mnt/install/test/deb-dupes:
#!/bin/sh
find /mnt/install/debian* -regex '.*\.deb$' -printf '%f\n' |sort|uniq -d

contents of /mnt/install/test/check-dupes:
#!/bin/sh
while read file
do
        find /mnt/install/debian*  -name "$file" |xargs md5sum|sort|uniq -uW1
done
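
The duplicate check could also be done in a single pass.  The following is a
sketch of a variant, not the method used above: it checksums every .deb once
and lets awk group the results by file name, printing all copies of any name
whose checksums differ.  The function name find_differing_dupes is
hypothetical, and the sketch assumes GNU md5sum and paths without spaces.

```shell
#!/bin/sh
# Sketch only: find_differing_dupes is a made-up name.
# usage: find_differing_dupes DIR [DIR...]
# Prints the md5sum line of every copy of a .deb file name that exists
# with more than one distinct checksum under the given directories.
find_differing_dupes() {
    find "$@" -type f -name '*.deb' -exec md5sum {} + |
    awk '{
        sum = $1
        n = split($2, parts, "/")      # assumes no spaces in the path
        name = parts[n]                # file name without directories
        if (name in first) {
            if (first[name] != sum) bad[name] = 1
        } else {
            first[name] = sum
        }
        lines[name] = lines[name] $0 "\n"
    }
    END {
        for (name in bad) printf "%s", lines[name]
    }'
}
```

Called as "find_differing_dupes /mnt/install/debian*", it would cover the same
directories as the scripts above while reading each file only once.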

----------------------------------------------------------------------------------

The output of diff-dupes is a handful of .debs with their respective md5
checksums, which for some unknown reason have multiple versions in the archives.
To be on the safe side, I check to make sure none of these packages is installed
on any of my systems.  If I find one installed, I remove it immediately,
assuming it's either a trojan or corrupted package.

Once my holding archives are updated, I rsync the archives to another system
serving as my working debian archive server, and this is the server most often
used by local systems both to update their packages and run debsums against.
The purpose of the duplicate archive is to ensure that I have a valid archive at
all times even while one copy is being updated, and also in case one of the
archive drives fails.  (Even with DSL it would take many days to restore the
lost debian archives.)

To update the working debian archive, I use the following script:

contents of script /mnt/install/test/copy-debian-archive:
#!/bin/sh
rsync -vaH --rsh=ssh --numeric-ids --delete /mnt/install/debian/ \
      [EMAIL PROTECTED]:/mnt/install/debian/
rsync -vaH --rsh=ssh --numeric-ids --delete /mnt/install/debian-non-US/ \
      [EMAIL PROTECTED]:/mnt/install/debian-non-US/
rsync -vacH --rsh=ssh --numeric-ids --delete /mnt/install/debian-security/ \
      [EMAIL PROTECTED]:/mnt/install/debian-security/
rsync -vacH --rsh=ssh --numeric-ids --delete /mnt/install/debian-marillat/ \
      [EMAIL PROTECTED]:/mnt/install/debian-marillat/

Notes: ibex is the hostname of my working debian archive server.
debian-security and debian-marillat are again transferred with checksumming,
because they don't have indices (md5sum) files to check them with after the
fact.  I run this script twice to fill in any hardlink targets missed on the
first run.
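
The four rsync calls and the second pass could be folded into one loop.  This
is only a sketch under assumptions: copy_archives, its destination argument
(standing in for the login and host used above) and the RSYNC override
variable are all hypothetical names.  Setting RSYNC to another command, such
as echo, gives a dry run that just prints what would be executed.

```shell
#!/bin/sh
# Sketch only: copy_archives, RSYNC and the DEST argument are made-up names.
RSYNC=${RSYNC:-rsync}

# usage: copy_archives DEST        (DEST is the ssh login, e.g. user@host)
# Copies all four archives to DEST, then repeats the whole pass once to
# fill in any hardlink targets missed the first time.
copy_archives() {
    dest=$1
    for pass in 1 2
    do
        for name in debian debian-non-US debian-security debian-marillat
        do
            case $name in
                # no md5sum indices for these two, so compare checksums (-c)
                debian-security|debian-marillat) flags=-vacH ;;
                *) flags=-vaH ;;
            esac
            $RSYNC "$flags" --rsh=ssh --numeric-ids --delete \
                "/mnt/install/$name/" "$dest:/mnt/install/$name/"
        done
    done
}
```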

The working debian archive server ibex also has copies of my scripts for making
the .deb hardlinks, which I run and then diff the hardlinks over nfs as a double
check.  At this point I know that my two debian archive servers are identical
and can be used interchangeably.

------------------------------------------------------------------------------------------

Before using the updated debian archives to update my local hosts, I check each
.deb in the working debian archives using the following scripts:

contents of script /mnt/install/test/check-debian-archives:
#!/bin/sh
./check-debian ../debian indices
./check-debian ../debian-non-US indices-non-US

contents of script /mnt/install/test/check-debian
#!/bin/sh
#md5sums.gz is gzip-compressed, so decompress it before filtering
zcat "$1/$2/md5sums.gz" | egrep -v \
 '_arm|-arm|_alpha|-alpha|hurd|powerpc|m68k|hppa|mips|mipsel|sparc|ia64|s390|potato|slink' |
 /mnt/install/test/md5chk "$1"

contents of script /mnt/install/test/md5chk:
#!/bin/sh
cd "$1"
while read md5 filep
do
        #echo "$md5  $filep"
        filepath=`echo "$filep" | sed 's/"//g'`

        if [ -h "$filepath" ]
        then
                testmd5="00000000000000000000000000000000"
        elif [ -f "$filepath" ]
        then
                testmd5=`md5sum "$filepath" | (read md5 filepath; echo $md5;)`
        elif [ -e "$filepath" ]
        then
                testmd5="00000000000000000000000000000001"
        else
                echo "$filepath" not found | tee >&2
                echo
                continue
        fi

        if [ "$testmd5" != "$md5" ]
        then
                echo "$filepath md5sums don't match" | tee >&2
                echo "orig md5sum= $md5" | tee >&2
                echo "test md5sum= $testmd5" | tee >&2
                echo
        fi
        #echo "$testmd5  $filepath found"
done


Notes: In this script I assign arbitrary md5sums of 0 or 1 respectively to
symlinks and to other non-regular files such as directories, which I use with
other scripts to generate and test my own md5sum files on various archives.
Since only debian main and debian-non-US have md5 indices files, they are the
only archives checked here.  The others have been transferred with checksumming
enabled, as noted above.
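
For the plain case of regular files, md5sum's own check mode might replace the
hand-written comparison loop.  The sketch below assumes GNU md5sum (with its
-c and --quiet options) and assumes that the lines in md5sums.gz are already in
md5sum's "checksum  path" format with no quoting.  It does not reproduce the
special 0 and 1 values described above.  The function name check_archive and
its arguments are hypothetical.

```shell
#!/bin/sh
# Sketch only: check_archive and its two arguments are made-up names.
# usage: check_archive ARCHIVE_ROOT MD5SUMS_GZ
# Verifies the files listed in MD5SUMS_GZ relative to ARCHIVE_ROOT,
# skipping the architectures and releases that were not mirrored.
# Returns non-zero if any listed file is missing or does not match.
check_archive() {
    root=$1
    sums=$2
    gzip -dc "$sums" |
        grep -Ev -e '_arm|-arm|_alpha|-alpha|hurd|powerpc|m68k|hppa|mips|mipsel|sparc|ia64|s390|potato|slink' |
        ( cd "$root" && md5sum -c --quiet - )
}
```

With --quiet, md5sum prints only the files that fail, so an empty output and a
zero exit status would mean the archive matched its indices file.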

------------------------------------------------------------------------------------------

Next I update each of my local hosts over nfs, with "apt-get update" followed
by the update option of dselect using apt (not sure if that's necessary),
followed by the select and install options of dselect to install or update any
new packages.  (I could use aptitude but I don't trust it yet.)  Finally I run a
debsums script on all of the local hosts to validate each file of any installed
packages.

Each of my local systems has a copy of the following script to run debsums
against a local debian archive:

contents of script check-debsums:
#!/bin/sh
debsums -ca --generate=all --deb-path=/mnt/$1/install/deblinks

The single argument is the name of one of my debian archive servers, either
indio or ibex (holding or working archives, respectively), with its debian
archives mounted via nfs.  This script regenerates the md5sums "on the fly"
using the newly checked and updated debian archives, which increases my
assurance of installed package integrity.

By running these scripts once or twice per week, I not only keep my systems
up-to-date, but minimize the chance of a corrupted package remaining undetected
on my systems.  The only weak security link I can see is if someone were to
trojan my debsums perl script.  If I were more security conscious I could
periodically boot a rescue floppy on each of my hosts and manually verify the
md5sum of the debsums script.  Any comments or suggestions are welcome.



