On Thursday 12 May 2005 02:30 pm, Marty wrote:
> As the debian archive grows larger, it gets increasingly laborious
> and time consuming to keep my local debian archive up-to-date.  Here
> are my latest scripts for automating the process (including some
> remaining manual steps).
>
> I'm sure there are better ways to do it, which is one of my reasons
> for posting them here.  In particular, I am interested in exploring
> the use of debmirror to replace rsync in these scripts, although I'm
> not familiar enough with it yet to know how well that might work.

Quite a lot of script :-) Using 'debmirror' will put a full Debian
mirror into your home directory; that is not what you seem to want,
but it serves as an example. I am not sure whether 'debmirror' works
with non-official sites, but it does a good job of mirroring the
Debian archive, and you can easily limit the architectures and
versions so that you do not end up with 100+ GB :-)

> (Note: these scripts work for x86 debian archives, and will need to
> be modified accordingly for other architectures.  In addition, there
> are probably more elegant ways to do these tasks, by consolidating
> them instead of using several small scripts.  Any proposals are
> welcome.)
>
> For planning purposes, here is the disk space used by my debian
> archives as of May 12, 2005:
>
> indio:/mnt/install# du -sc debian debian-security debian-non-US debian-marillat
> 37373196 debian
> 1907560 debian-security
> 267864 debian-non-US
> 1277720 debian-marillat
> 40826340 total
>
> I have a script called debian-all which rsyncs all the debian
> archives into a holding archive at /mnt/install/debian[-*] (*=blank
> (main), security, non-US, or marillat)
>
> contents of /mnt/install/test/debian-all:
> #!/bin/sh
> LOOP=1
> while [ "$LOOP" = 1 ]
> do
>   if rsync -vaHD --numeric-ids --delete --delete-excluded \
>     --exclude '*ia64*' --exclude '*_arm*' --exclude '*_alpha*' \
>     --exclude '*-arm*' --exclude '*-alpha*' --exclude '*powerpc*' \
>     --exclude '*mipsel*' --exclude '*hppa*' --exclude '*m68k*' \
>     --exclude '*mips*' --exclude '*sparc*' --exclude '*s390*' \
>     --exclude '*hurd*' --exclude '*UploadQueue*' \
>     rsync://ftp.debian.org/debian/ /mnt/install/debian/
>
>   then LOOP=0
>   else
>     echo rsync error: trying debian main again
>     sleep 10
>   fi
> done
>
> echo
> echo
>
> #do debian main again to fill in any hardlink targets missed
> #the first time 'round
>
> LOOP=1
> while [ "$LOOP" = 1 ]
> do
>   if rsync -vaHD --numeric-ids --delete --delete-excluded \
>     --exclude '*ia64*' --exclude '*_arm*' --exclude '*_alpha*' \
>     --exclude '*-arm*' --exclude '*-alpha*' --exclude '*powerpc*' \
>     --exclude '*mipsel*' --exclude '*hppa*' --exclude '*m68k*' \
>     --exclude '*mips*' --exclude '*sparc*' --exclude '*s390*' \
>     --exclude '*hurd*' --exclude '*UploadQueue*' \
>     rsync://ftp.debian.org/debian/ /mnt/install/debian/
>
>   then LOOP=0
>   else
>     echo rsync error: trying debian main again
>     sleep 10
>   fi
> done
>
> echo
> echo
>
> LOOP=1
> while [ "$LOOP" = 1 ]
> do
>
>   if rsync -vaHD --numeric-ids --delete --delete-excluded \
>     --exclude '*ia64*' --exclude '*_arm*' --exclude '*_alpha*' \
>     --exclude '*-arm*' --exclude '*-alpha*' --exclude '*powerpc*' \
>     --exclude '*mipsel*' --exclude '*hppa*' --exclude '*m68k*' \
>     --exclude '*mips*' --exclude '*sparc*' --exclude '*s390*' \
>     --exclude '*hurd*' --exclude '*UploadQueue*' \
>     --exclude 'oldstable' --exclude 'potato' --exclude 'slink' \
>     rsync://non-us.debian.org/debian-non-US/ \
>     /mnt/install/debian-non-US/
>   then LOOP=0
>   else
>     echo rsync error: trying debian-non-US again
>     sleep 10
>   fi
> done
>
> echo
> echo
>
> LOOP=1
> while [ "$LOOP" = 1 ]
> do
>
>   if rsync -vaHD --numeric-ids --delete --delete-excluded \
>     --exclude '*ia64*' --exclude '*_arm*' --exclude '*_alpha*' \
>     --exclude '*-arm*' --exclude '*-alpha*' --exclude '*powerpc*' \
>     --exclude '*mipsel*' --exclude '*hppa*' --exclude '*m68k*' \
>     --exclude '*mips*' --exclude '*sparc*' --exclude '*s390*' \
>     --exclude '*hurd*' --exclude '*UploadQueue*' \
>     --exclude 'oldstable' --exclude 'potato' --exclude 'slink' \
>     rsync://security.debian.org/debian-security/ \
>     /mnt/install/debian-security/
>   then LOOP=0
>   else
>     echo rsync error: trying debian-security again
>     sleep 10
>   fi
> done
>
> echo
> echo
>
> #rsync debian-security again, this time with checksums (-c option),
> #because there is no indices file there containing md5sums to check
> #file integrity with after the fact.
> #Note: update with checksums only after first updating without them,
> #because this server has a strong tendency to give time out errors
> #during large transfers
>
> LOOP=1
> while [ "$LOOP" = 1 ]
> do
>
>   if rsync -vcaHD --numeric-ids --delete --delete-excluded \
>     --exclude '*ia64*' --exclude '*_arm*' --exclude '*_alpha*' \
>     --exclude '*-arm*' --exclude '*-alpha*' --exclude '*powerpc*' \
>     --exclude '*mipsel*' --exclude '*hppa*' --exclude '*m68k*' \
>     --exclude '*mips*' --exclude '*sparc*' --exclude '*s390*' \
>     --exclude '*hurd*' --exclude '*UploadQueue*' \
>     --exclude 'oldstable' --exclude 'potato' --exclude 'slink' \
>     rsync://security.debian.org/debian-security/ \
>     /mnt/install/debian-security/
>   then LOOP=0
>   else
>     echo "rsync error: trying debian-security (w/csums) again"
>     sleep 10
>   fi
> done
>
> echo
> echo
>
> wget -nv -r ftp://ftp.nerim.net/debian-marillat/ -nH -N -P /mnt/install
>
> #end of debian-all
>
> (Note: At first I thought that the "if rsync .." statements should be
> "if ! rsync ..." instead, but that logic doesn't seem to work, for
> reasons that are unclear to me.)
>
> ----------------------------------------------------------------------
>
> I set up my holding archive server to boot up by RTC alarm at 6:10am.
> This allows enough time to fsck any disks prior to the cron.daily
> wake-up time at 6:25am.  In the directory /etc/cron.daily I placed a
> script named update-debian.
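
On the "if ! rsync" question in debian-all: the shell's "if" takes its
"then" branch when the command exits with status 0, that is, on
success. So "if rsync ...; then LOOP=0" already means "stop looping
once rsync has succeeded", and negating it would stop on the first
failure instead. The five near-identical loops could perhaps be folded
into one helper. A rough sketch, not tested against a real mirror: the
names retry, RETRY_DELAY and RETRY_MAX are made up for illustration,
and RETRY_MAX (give up after N attempts) is an addition that the
original loops do not have.

```shell
#!/bin/sh
# Hypothetical helper: run a command until it succeeds.
# RETRY_DELAY = seconds to sleep between attempts (debian-all uses 10).
# RETRY_MAX   = attempts before giving up; 0 means retry forever, which
#               is what the original loops do.
RETRY_DELAY=${RETRY_DELAY:-10}
RETRY_MAX=${RETRY_MAX:-0}

retry() {
    attempt=1
    # "until CMD" keeps looping while CMD exits with a non-zero status
    until "$@"
    do
        if [ "$RETRY_MAX" -gt 0 ] && [ "$attempt" -ge "$RETRY_MAX" ]
        then
            echo "giving up after $attempt attempts: $1" >&2
            return 1
        fi
        echo "error from $1: trying again (attempt $attempt failed)"
        attempt=$((attempt + 1))
        sleep "$RETRY_DELAY"
    done
}
```

Each loop would then shrink to a single line of the form
"retry rsync -vaHD ... rsync://ftp.debian.org/debian/ /mnt/install/debian/",
with the exclude list unchanged.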
>
> contents of /etc/cron.daily/update-debian:
> #!/bin/sh
> LOGFILE=/var/tmp/update-debian.log
> /mnt/install/test/debian-all >$LOGFILE 2>&1
> echo >>$LOGFILE
> echo update-debian cron script >>$LOGFILE
> echo finished with archive update at `date` >>$LOGFILE
> mail -s "update-debian.log for `date`" [EMAIL PROTECTED] \
>   </var/tmp/update-debian.log
> while true
> do
>   #if updatedb is still running, wait for it to finish before
>   #shutting down
>   pidof updatedb && echo waiting for updatedb to finish at `date` \
>     >>$LOGFILE || shutdown -h now
>   echo >>$LOGFILE
>   sleep 300
> done
>
> ----------------------------------------------------------------------
>
> The remaining steps could also be automated, but for now I prefer to
> do them manually.
>
> In order to simplify checking with debsums, I keep a directory named
> /mnt/install/deblinks, containing hard links to all the .deb files in
> the local archive.  To update this directory, I first do "rm
> /mnt/install/deblinks.old; mv /mnt/install/deblinks
> /mnt/install/deblinks.old; mkdir /mnt/install/deblinks".
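
A note on the rotation step just quoted: a plain "rm" refuses to
remove a directory, so presumably "rm -r" is what actually gets run.
The three commands could be wrapped up roughly as below. This is only
a sketch: rotate_dir is a made-up name, and it takes the directory as
an argument instead of hard-coding /mnt/install/deblinks.

```shell
#!/bin/sh
# Hypothetical helper: keep the previous copy of a directory as DIR.old
# and start over with an empty DIR.
rotate_dir() {
    dir=$1
    # refuse an empty argument, so we never run "rm -rf .old" by accident
    [ -n "$dir" ] || { echo "usage: rotate_dir DIR" >&2; return 1; }
    rm -rf "$dir.old" || return 1           # drop the copy from two runs ago
    if [ -d "$dir" ]
    then
        mv "$dir" "$dir.old" || return 1    # current links become the backup
    fi
    mkdir "$dir"                            # fresh, empty directory
}
```

Usage would be "rotate_dir /mnt/install/deblinks". Each step stops the
function on failure, so a failed mv does not leave a half-rotated
directory behind unnoticed.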
>
> Now I am ready to put new .deb hardlinks in the /mnt/install/deblinks
> directory, using the following scripts:
>
> contents of /mnt/install/test/make-deblinks:
> #!/bin/sh
> find /mnt/install/debian* -regex '.*\.deb$' |
>   /mnt/install/test/deblink-loop
>
> contents of /mnt/install/test/deblink-loop:
> #!/bin/sh
> cd /mnt/install/deblinks
> while read filepath
> do
>   file=`echo "$filepath" | sed 's/.*\///'`
>   ln "$filepath" "$file"
> done
>
> ----------------------------------------------------------------------
>
> Next I check for duplicate .deb files with differing md5 checksums
> using the following scripts:
>
> contents of /mnt/install/test/diff-dupes:
> #!/bin/sh
> /mnt/install/test/deb-dupes | /mnt/install/test/check-dupes
>
> contents of /mnt/install/test/deb-dupes:
> #!/bin/sh
> find /mnt/install/debian* -regex '.*\.deb$' -printf '%f\n' |sort|uniq -d
>
> contents of /mnt/install/test/check-dupes:
> #!/bin/sh
> while read file
> do
>   find /mnt/install/debian* -name "$file" |xargs md5sum|sort|uniq -uW1
> done
>
> ----------------------------------------------------------------------
>
> The output of diff-dupes is a handful of .debs with their respective
> md5 checksums, which for some unknown reason have multiple versions
> in the archives.  To be on the safe side, I check to make sure none
> of these packages is installed on any of my systems.  If I find one
> installed, I remove it immediately, assuming it's either a trojan or
> corrupted package.
>
> Once my holding archives are updated, I rsync the archives to another
> system serving as my working debian archive server, and this is the
> server most often used by local systems both to update their packages
> and run debsums against.  The purpose of the duplicate archive is to
> ensure that I have a valid archive at all times even while one copy
> is being updated, and also in case one of the archive drives fails.
> (Even with DSL it would take many days to restore the lost debian
> archives.)
>
> To update the working debian archive, I use the following script:
>
> contents of script /mnt/install/test/copy-debian-archive:
> #!/bin/sh
> rsync -vaH --rsh=ssh --numeric-ids --delete \
>   /mnt/install/debian/ \
>   [EMAIL PROTECTED]:/mnt/install/debian/
> rsync -vaH --rsh=ssh --numeric-ids --delete \
>   /mnt/install/debian-non-US/ \
>   [EMAIL PROTECTED]:/mnt/install/debian-non-US/
> rsync -vacH --rsh=ssh --numeric-ids --delete \
>   /mnt/install/debian-security/ \
>   [EMAIL PROTECTED]:/mnt/install/debian-security/
> rsync -vacH --rsh=ssh --numeric-ids --delete \
>   /mnt/install/debian-marillat/ \
>   [EMAIL PROTECTED]:/mnt/install/debian-marillat/
>
> Notes: ibex is the hostname of my working debian archive server.
> debian-security and debian-marillat are again transferred with
> checksumming, because they don't have indices (md5sum) files to check
> them with after the fact.  I run this script twice to fill in any
> hardlink targets missed on the first run.
>
> The working debian archive server ibex also has copies of my scripts
> for making the .deb hardlinks, which I run there; I then diff the
> hardlinks over nfs as a double check.  At this point I know that my
> two debian archive servers are identical and can be used
> interchangeably.
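
Since the hard-link scripts come up again here: make-deblinks and
deblink-loop could probably be merged into one script that is easier
to keep identical on both servers. The sketch below is a variant built
on several assumptions, not a drop-in replacement. link_debs is a
made-up name; it takes the target directory and the archive roots as
arguments; it uses the shell's ${filepath##*/} expansion in place of
the sed call; it matches with -type f -name '*.deb' where the original
uses -regex; and it skips a file name that already exists in the
target (as happens with the duplicate .debs) instead of letting ln
print an error.

```shell
#!/bin/sh
# Hypothetical merge of make-deblinks and deblink-loop.
# usage: link_debs TARGET_DIR ARCHIVE_ROOT...
link_debs() {
    target=$1
    shift
    # -type f is an addition: only regular files are hard-linked
    find "$@" -type f -name '*.deb' |
    while IFS= read -r filepath
    do
        file=${filepath##*/}        # strip everything up to the last "/"
        if [ -e "$target/$file" ]
        then
            echo "skipping duplicate name: $filepath" >&2
        else
            ln "$filepath" "$target/$file"
        fi
    done
}
```

For example: "link_debs /mnt/install/deblinks /mnt/install/debian*".
As with the original scripts, the target has to be on the same
filesystem as the archives, because hard links cannot cross
filesystems.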
>
> ----------------------------------------------------------------------
>
> Before using the updated debian archives to update my local hosts, I
> check each .deb in the working debian archives using the following
> scripts:
>
> contents of script /mnt/install/test/check-debian-archives:
> #!/bin/sh
> ./check-debian ../debian indices
> ./check-debian ../debian-non-US indices-non-US
>
> contents of script /mnt/install/test/check-debian:
> #!/bin/sh
> zcat $1/$2/md5sums.gz | egrep -v \
>   '_arm|-arm|_alpha|-alpha|hurd|powerpc|m68k|hppa|mips|mipsel|sparc|ia64|s390|potato|slink' |
>   /mnt/install/test/md5chk $1
>
> contents of script /mnt/install/test/md5chk:
> cd $1
> while read md5 filep
> do
>   #echo "$md5 $filep"
>   filepath=`echo "$filep" | sed 's/"//g'`
>
>   if [ -h "$filepath" ]
>   then
>     testmd5="00000000000000000000000000000000"
>   else
>     if [ -f "$filepath" ]
>     then
>       testmd5=`md5sum "$filepath" | (read md5 filepath; echo $md5;)`
>     else
>       if [ -e "$filepath" ]
>       then
>         testmd5="00000000000000000000000000000001"
>       else
>         echo "$filepath" not found | tee >&2
>         echo
>         continue
>       fi
>     fi
>   fi
>
>   if [ "$testmd5" != "$md5" ]
>   then
>     echo "$filepath md5sums don't match" | tee >&2
>     echo "orig md5sum= $md5" | tee >&2
>     echo "test md5sum= $testmd5" | tee >&2
>     echo
>   fi
>   #echo "$testmd5 $filepath found"
> done
>
>
> Notes: In this script I assign arbitrary md5sums of 0 or 1
> respectively to non-standard files or directories, which I use with
> other scripts to generate and test my own md5sum files on various
> archives.  Since only debian main and debian-non-US have md5 indices
> files, they are the only archives checked here.  The others have been
> transferred with checksumming enabled, as noted above.
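
For comparison, the core of md5chk (read "md5 path" pairs, recompute
the sum, report mismatches) can be written a little more compactly.
Again a sketch only: check_md5_list is a made-up name, it leaves out
the 0/1 placeholder sums and simply reports anything that is not (or
does not point to) a regular file as not found, and it returns
non-zero if anything failed so that a caller can test the result. If
the index is in plain md5sum output format, "md5sum -c" may also be
worth a look, although it gives less control over the messages.

```shell
#!/bin/sh
# Hypothetical, simplified variant of md5chk.
# usage: check_md5_list BASE_DIR < list   (list lines: "<md5> <path>")
# The body runs in a subshell, so the cd does not affect the caller.
check_md5_list() (
    cd "$1" || exit 2
    status=0
    while read -r md5 filepath
    do
        if [ ! -f "$filepath" ]
        then
            echo "$filepath not found" >&2
            status=1
            continue
        fi
        # md5sum prints "<sum>  -" when reading stdin; keep only the sum
        testmd5=$(md5sum < "$filepath" | cut -d' ' -f1)
        if [ "$testmd5" != "$md5" ]
        then
            echo "$filepath md5sums don't match" >&2
            echo "orig md5sum= $md5" >&2
            echo "test md5sum= $testmd5" >&2
            status=1
        fi
    done
    exit "$status"
)
```

It reads the list on stdin, just as md5chk does, so it could sit at
the end of the same pipeline in check-debian.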
>
> ----------------------------------------------------------------------
>
> Next I update each of my local hosts over nfs, with "apt-get update"
> followed by the update option of dselect using apt (not sure if
> that's necessary), followed by the select and install options of
> dselect to install or update any new packages.  (I could use aptitude
> but I don't trust it yet.)  Finally I run a debsums script on all of
> the local hosts to validate each file of any installed packages.
>
> Each of my local systems has a copy of the following script to run
> debsums against a local debian archive:
>
> contents of script check-debsums:
> #!/bin/sh
> debsums -ca --generate=all --deb-path=/mnt/$1/install/deblinks
>
> The single argument is the name of one of my debian archive servers,
> either indio or ibex (holding or working archives, respectively),
> with its debian archives mounted via nfs.  This script regenerates
> the md5sums "on the fly" using the newly checked and updated debian
> archives, which increases my assurance of installed package
> integrity.
>
> By running these scripts once or twice per week, I not only keep my
> systems up-to-date, but minimize the chance of a corrupted package
> remaining undetected on my systems.  The only weak security link I
> can see is if someone were to trojan my debsums perl script.  If I
> were more security conscious I could periodically boot a rescue
> floppy on each of my hosts and manually verify the md5sum of the
> debsums script.  Any comments or suggestions are welcome.

-- 
Greg C. Madden