As the debian archive grows larger, it gets increasingly laborious and time-consuming to keep my local debian archive up-to-date. Here are my latest scripts for automating the process (including some remaining manual steps).
I'm sure there are better ways to do it, which is one of my reasons for posting them here. In particular, I am interested in exploring the use of debmirror to replace rsync in these scripts, although I'm not familiar enough with it yet to know how well that might work.
(Note: these scripts work for x86 debian archives, and will need to be modified accordingly for other architectures. In addition, there are probably more elegant ways to do these tasks, by consolidating them instead of using several small scripts. Any proposals are welcome.)
For planning purposes, here is the disk space used by my debian archives as of May 12, 2005:
indio:/mnt/install# du -sc debian debian-security debian-non-US debian-marillat
37373196 debian
1907560 debian-security
267864 debian-non-US
1277720 debian-marillat
40826340 total
I have a script called debian-all which rsyncs all the debian archives into a holding archive at /mnt/install/debian[-*] (*=blank (main), security, non-US, or marillat)
contents of /mnt/install/test/debian-all:
#!/bin/sh
LOOP=1
while [ "$LOOP" = 1 ]
do
  if rsync -vaHD --numeric-ids --delete --delete-excluded \
    --exclude '*ia64*' --exclude '*_arm*' --exclude '*_alpha*' \
    --exclude '*-arm*' --exclude '*-alpha*' --exclude '*powerpc*' \
    --exclude '*mipsel*' --exclude '*hppa*' --exclude '*m68k*' \
    --exclude '*mips*' --exclude '*sparc*' --exclude '*s390*' \
    --exclude '*hurd*' --exclude '*UploadQueue*' \
    rsync://ftp.debian.org/debian/ /mnt/install/debian/
  then
    LOOP=0
  else
    echo rsync error: trying debian main again
    sleep 10
  fi
done
echo
echo
#do debian main again to fill in any hardlink targets missed
#the first time 'round
LOOP=1
while [ "$LOOP" = 1 ]
do
  if rsync -vaHD --numeric-ids --delete --delete-excluded \
    --exclude '*ia64*' --exclude '*_arm*' --exclude '*_alpha*' \
    --exclude '*-arm*' --exclude '*-alpha*' --exclude '*powerpc*' \
    --exclude '*mipsel*' --exclude '*hppa*' --exclude '*m68k*' \
    --exclude '*mips*' --exclude '*sparc*' --exclude '*s390*' \
    --exclude '*hurd*' --exclude '*UploadQueue*' \
    rsync://ftp.debian.org/debian/ /mnt/install/debian/
  then
    LOOP=0
  else
    echo rsync error: trying debian main again
    sleep 10
  fi
done
echo
echo
LOOP=1
while [ "$LOOP" = 1 ]
do
  if rsync -vaHD --numeric-ids --delete --delete-excluded \
    --exclude '*ia64*' --exclude '*_arm*' --exclude '*_alpha*' \
    --exclude '*-arm*' --exclude '*-alpha*' --exclude '*powerpc*' \
    --exclude '*mipsel*' --exclude '*hppa*' --exclude '*m68k*' \
    --exclude '*mips*' --exclude '*sparc*' --exclude '*s390*' \
    --exclude '*hurd*' --exclude '*UploadQueue*' \
    --exclude 'oldstable' --exclude 'potato' --exclude 'slink' \
    rsync://non-us.debian.org/debian-non-US/ /mnt/install/debian-non-US/
  then
    LOOP=0
  else
    echo rsync error: trying debian-non-US again
    sleep 10
  fi
done
echo
echo
LOOP=1
while [ "$LOOP" = 1 ]
do
  if rsync -vaHD --numeric-ids --delete --delete-excluded \
    --exclude '*ia64*' --exclude '*_arm*' --exclude '*_alpha*' \
    --exclude '*-arm*' --exclude '*-alpha*' --exclude '*powerpc*' \
    --exclude '*mipsel*' --exclude '*hppa*' --exclude '*m68k*' \
    --exclude '*mips*' --exclude '*sparc*' --exclude '*s390*' \
    --exclude '*hurd*' --exclude '*UploadQueue*' \
    --exclude 'oldstable' --exclude 'potato' --exclude 'slink' \
    rsync://security.debian.org/debian-security/ /mnt/install/debian-security/
  then
    LOOP=0
  else
    echo rsync error: trying debian-security again
    sleep 10
  fi
done
echo
echo
#rsync debian-security again, this time with checksums (-c option), because there is
#no indices file there containing md5sums to check file integrity with after the fact.
#Note: update with checksums only after first updating without them, because
#this server has a strong tendency to give time out errors during large transfers
LOOP=1
while [ "$LOOP" = 1 ]
do
  if rsync -vcaHD --numeric-ids --delete --delete-excluded \
    --exclude '*ia64*' --exclude '*_arm*' --exclude '*_alpha*' \
    --exclude '*-arm*' --exclude '*-alpha*' --exclude '*powerpc*' \
    --exclude '*mipsel*' --exclude '*hppa*' --exclude '*m68k*' \
    --exclude '*mips*' --exclude '*sparc*' --exclude '*s390*' \
    --exclude '*hurd*' --exclude '*UploadQueue*' \
    --exclude 'oldstable' --exclude 'potato' --exclude 'slink' \
    rsync://security.debian.org/debian-security/ /mnt/install/debian-security/
  then
    LOOP=0
  else
    echo "rsync error: trying debian-security (w/csums) again"
    sleep 10
  fi
done
echo
echo
wget -nv -r ftp://ftp.nerim.net/debian-marillat/ -nH -N -P /mnt/install
#end of debian-all
(Note: At first I thought that the "if rsync .." statements should be "if ! rsync ..." instead, but that logic doesn't seem to work, for reasons that are unclear to me.)
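On the "if rsync" question: the shell's "if" treats an exit status of 0 (success) as true, so "if rsync ..." takes the "then" branch exactly when the transfer succeeded, which is what the loops above rely on. The repeated loops could probably be folded into one helper along these lines. This is only a sketch: the function name retry_until_ok and the RETRY_DELAY variable are my own inventions, not part of the scripts above.

```shell
#!/bin/sh
# retry_until_ok: run the given command repeatedly until it exits with
# status 0 (success). "until CMD" loops while CMD fails, which is the same
# logic as "if CMD; then LOOP=0; else ...retry...; fi".
# RETRY_DELAY is a hypothetical knob (seconds between attempts, default 10).
retry_until_ok() {
    until "$@"
    do
        echo "command failed: trying again" >&2
        sleep "${RETRY_DELAY:-10}"
    done
}

# Example use (commented out so that sourcing this file has no side effects):
# retry_until_ok rsync -vaHD --numeric-ids --delete \
#     rsync://ftp.debian.org/debian/ /mnt/install/debian/
```

Each rsync stanza would then shrink to a single retry_until_ok call with its own exclude list and source/destination pair.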
-----------------------------------------------------------------------------------------
I set up my holding archive server to boot up by RTC alarm at 6:10am. This allows enough time to fsck any disks prior to the cron.daily wake-up time at 6:25am. In the directory /etc/cron.daily I placed a script named update-debian.
contents of /etc/cron.daily/update-debian:
#!/bin/sh
LOGFILE=/var/tmp/update-debian.log
/mnt/install/test/debian-all >$LOGFILE 2>&1
echo >>$LOGFILE
echo update-debian cron script >>$LOGFILE
echo finished with archive update at `date` >>$LOGFILE
mail -s "update-debian.log for `date`" [EMAIL PROTECTED] </var/tmp/update-debian.log
while true
do
  #if updatedb is still running, wait for it to finish before shutting down
  pidof updatedb && echo waiting for updatedb to finish at `date` >>$LOGFILE || shutdown -h now
  echo >>$LOGFILE
  sleep 300
done
--------------------------------------------------------------------------------------
The remaining steps could also be automated, but for now I prefer to do them manually.
In order to simplify checking with debsums, I keep a directory named /mnt/install/deblinks, containing hard links to all the .deb files in the local archive. To update this directory, I first do "rm -r /mnt/install/deblinks.old; mv /mnt/install/deblinks /mnt/install/deblinks.old; mkdir /mnt/install/deblinks" (the -r is needed because deblinks.old is a directory).
Now I am ready to put new .deb hardlinks in the /mnt/install/deblinks directory, using the following scripts:
contents of /mnt/install/test/make-deblinks:
#!/bin/sh
find /mnt/install/debian* -regex '.*\.deb$' | /mnt/install/test/deblink-loop

contents of /mnt/install/test/deblink-loop:
#!/bin/sh
cd /mnt/install/deblinks
while read filepath
do
  #strip the directory part, leaving just the file name
  file=`echo "$filepath" | sed 's/.*\///'`
  ln "$filepath" "$file"
done
------------------------------------------------------------------------------------
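The rotate-and-relink steps for deblinks could also be folded into one function. What follows is a rough sketch rather than what I actually run: the name rebuild_deblinks and its argument order are made up for illustration, and it assumes the link directory lives on the same filesystem as the archives (hard links cannot cross filesystems).

```shell
#!/bin/sh
# rebuild_deblinks LINKDIR ARCHIVE...
# Keep the previous link directory as LINKDIR.old, then repopulate LINKDIR
# with hard links to every .deb found under the given archive directories.
# (Hypothetical helper; assumes LINKDIR and the archives share a filesystem.)
rebuild_deblinks() {
    linkdir=$1
    shift
    rm -rf "$linkdir.old"
    if [ -d "$linkdir" ]
    then
        mv "$linkdir" "$linkdir.old"
    fi
    mkdir "$linkdir"
    # find hands batches of paths to a small inline script; ln refuses to
    # overwrite, so a second .deb with the same file name is only reported.
    find "$@" -name '*.deb' -type f -exec sh -c '
        dest=$1
        shift
        for f
        do
            ln "$f" "$dest/" || echo "could not link $f" >&2
        done' sh "$linkdir" {} +
}

# Example use (commented out):
# rebuild_deblinks /mnt/install/deblinks /mnt/install/debian*
```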
Next I check for duplicate .deb files with differing md5 checksums using the following scripts:
contents of /mnt/install/test/diff-dupes:
#!/bin/sh
/mnt/install/test/deb-dupes | /mnt/install/test/check-dupes

contents of /mnt/install/test/deb-dupes:
#!/bin/sh
#print each .deb file name that occurs more than once in the archives
find /mnt/install/debian* -regex '.*\.deb$' -printf '%f\n' |sort|uniq -d

contents of /mnt/install/test/check-dupes:
#!/bin/sh
while read file
do
  find /mnt/install/debian* -name "$file" |xargs md5sum|sort|uniq -uW1
done
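In case the uniq at hand lacks the -W option used in check-dupes, the same check can be approximated in a single pass with find, md5sum and awk. This is a sketch under a few assumptions: the function name list_conflicting_debs is hypothetical, md5sum is the GNU one (32 hex digits, two separator characters, then the path), and file names contain no newlines or backslashes.

```shell
#!/bin/sh
# list_conflicting_debs DIR...
# Print the "md5sum  path" line of every .deb whose file name occurs under
# the given directories with at least two different md5sums.
# (Hypothetical helper, not part of the original scripts.)
list_conflicting_debs() {
    find "$@" -name '*.deb' -type f -exec md5sum {} + |
    awk '{
        sum = $1
        path = substr($0, 35)          # skip 32 hex digits plus 2 separators
        name = path
        sub(/.*\//, "", name)          # strip the directory part
        key = name SUBSEP sum
        if (!(key in seen)) {          # first time this name/md5sum pair shows up
            seen[key] = 1
            variants[name]++
        }
        lines[name] = lines[name] $0 "\n"
    }
    END {
        for (name in variants)
            if (variants[name] > 1)
                printf "%s", lines[name]
    }'
}

# Example use (commented out):
# list_conflicting_debs /mnt/install/debian*
```

Unlike check-dupes, this prints every copy of a conflicting file name, including copies that happen to share an md5sum with one another.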
----------------------------------------------------------------------------------
The output of diff-dupes is a handful of .debs that, for some unknown reason, exist in multiple versions in the archives, listed with their respective md5 checksums. To be on the safe side, I check to make sure none of these packages is installed on any of my systems. If I find one installed, I remove it immediately, assuming it's either a trojan or a corrupted package.
Once my holding archives are updated, I rsync the archives to another system serving as my working debian archive server, and this is the server most often used by local systems both to update their packages and run debsums against. The purpose of the duplicate archive is to ensure that I have a valid archive at all times even while one copy is being updated, and also in case one of the archive drives fails. (Even with DSL it would take many days to restore the lost debian archives.)
To update the working debian archive, I use the following script:
contents of script /mnt/install/test/copy-debian-archive:
#!/bin/sh
rsync -vaH --rsh=ssh --numeric-ids --delete /mnt/install/debian/ [EMAIL PROTECTED]:/mnt/install/debian/
rsync -vaH --rsh=ssh --numeric-ids --delete /mnt/install/debian-non-US/ [EMAIL PROTECTED]:/mnt/install/debian-non-US/
rsync -vacH --rsh=ssh --numeric-ids --delete /mnt/install/debian-security/ [EMAIL PROTECTED]:/mnt/install/debian-security/
rsync -vacH --rsh=ssh --numeric-ids --delete /mnt/install/debian-marillat/ [EMAIL PROTECTED]:/mnt/install/debian-marillat/

Notes: ibex is the hostname of my working debian archive server. debian-security and debian-marillat are again transferred with checksumming, because they don't have indices (md5sum) files to check them with after the fact. I run this script twice to fill in any hardlink targets missed on the first run.
The working debian archive server ibex also has copies of my scripts for making the .deb hardlinks, which I run and then diff the hardlinks over nfs as a double check. At this point I know that my two debian archive servers are identical and can be used interchangeably.
------------------------------------------------------------------------------------------
Before using the updated debian archives to update my local hosts, I check each .deb in the working debian archives using the following scripts:
contents of script /mnt/install/test/check-debian-archives:
#!/bin/sh
./check-debian ../debian indices
./check-debian ../debian-non-US indices-non-US

contents of script /mnt/install/test/check-debian:
#!/bin/sh
#md5sums.gz is compressed, so it has to go through zcat rather than cat
zcat $1/$2/md5sums.gz |egrep -v '_arm|\-arm|\_alpha|\-alpha|hurd|powerpc|m68k|hppa|mips|mipsel|sparc|ia64|s390|potato|slink'| /mnt/install/test/md5chk $1
contents of script /mnt/install/test/md5chk:
#!/bin/sh
cd "$1"
while read md5 filep
do
  #echo "$md5 $filep"
  #strip any double quotes from the path
  filepath=`echo "$filep" | sed 's/"//g'`

  if [ -h "$filepath" ]
  then
    #symlink: placeholder md5sum
    testmd5="00000000000000000000000000000000"
  else
    if [ -f "$filepath" ]
    then
      testmd5=`md5sum "$filepath" | (read md5 filepath; echo $md5;)`
    else
      if [ -e "$filepath" ]
      then
        #exists but is not a regular file (e.g. a directory): placeholder md5sum
        testmd5="00000000000000000000000000000001"
      else
        echo "$filepath" not found | tee >&2
        echo
        continue
      fi
    fi
  fi

  if [ "$testmd5" != "$md5" ]
  then
    echo "$filepath md5sums don't match" | tee >&2
    echo "orig md5sum= $md5" | tee >&2
    echo "test md5sum= $testmd5" | tee >&2
    echo
  fi
  #echo "$testmd5 $filepath found"
done
Notes: In this script I assign arbitrary md5sums of 0 or 1 respectively to non-standard files or directories, which I use with other scripts to generate and test my own md5sum files on various archives. Since only debian main and debian-non-US have md5 indices files, they are the only archives checked here. The others have been transferred with checksumming enabled, as noted above.
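If the special handling of symlinks and directories isn't needed, md5sum's own check mode might do most of what md5chk does. The sketch below assumes the index is a gzipped list in plain md5sum format ("md5sum, two separator characters, path", with paths relative to the archive root), which I have not verified for the real indices files, and that md5sum is the GNU version (for --quiet). The function name verify_index is made up.

```shell
#!/bin/sh
# verify_index ROOT INDEX_GZ
# Check the files under ROOT against a gzipped md5sum list. Only failures
# are printed; the exit status is non-zero if any file is missing or differs.
# (Hypothetical helper; assumes plain md5sum-format lines in the index.)
verify_index() {
    root=$1
    index=$2
    # gzip runs in the current directory; only md5sum runs inside ROOT.
    gzip -dc "$index" | ( cd "$root" && md5sum -c --quiet - )
}

# Example use (commented out); an architecture filter such as the egrep -v
# in check-debian could be inserted into the pipeline in the same way:
# verify_index /mnt/install/debian /mnt/install/debian/indices/md5sums.gz
```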
------------------------------------------------------------------------------------------
Next I update each of my local hosts over nfs, with "apt-get update" followed by the update option of dselect using apt (not sure if that's necessary), followed by the select and install options of dselect to install or update any new packages. (I could use aptitude but I don't trust it yet.) Finally I run a debsums script on all of the local hosts to validate each file of any installed packages.
Each of my local systems has a copy of the following script to run debsums against a local debian archive:
contents of script check-debsums:
#!/bin/sh
debsums -ca --generate=all --deb-path=/mnt/$1/install/deblinks

The single argument is the name of one of my debian archive servers, either indio or ibex (holding or working archives, respectively), with its debian archives mounted via nfs. This script regenerates the md5sums "on the fly" using the newly checked and updated debian archives, which increases my assurance of installed package integrity.
By running these scripts once or twice per week, I not only keep my systems up-to-date, but minimize the chance of a corrupted package remaining undetected on my systems. The only weak security link I can see is if someone were to trojan my debsums perl script. If I were more security conscious I could periodically boot a rescue floppy on each of my hosts and manually verify the md5sum of the debsums script. Any comments or suggestions are welcome.