No bug for this; mostly just getting this info out there in the hopes that it might be of some help to anyone else trying to do the same thing.
Configuration:

2.4.27 standard Debian kernel with devfs=mount
    /dev/md/0 == swap
    /dev/md/1 == /boot
    /dev/md/6 + /dev/md/7 == shaktivg
    /dev/shaktivg/rootlv == /root
    The MDs are raid1.

mkinitrd creates a flawless initrd for the 2.4.x kernel (a big step up from some time ago, when I had to drop in scripts for setting up lvm).

2.6.9 standard Debian kernel with devfs=mount: I didn't spend a lot of time with this setup (devfs is deprecated and udev seems like the right way to go). Interestingly, in order for lvm to work we need devfs compiled in and we have to mount devfs temporarily. The same seems true for md (to get at the md device for mdadm). So, even though *I'm* not using devfs, the system is. I probably should have just left things alone with devfs=mount.

2.6.9 standard kernel with no devfs=mount: this is where things started going downhill.

1) I've always used devfs on this system, so /dev (the underlying dev directory) was completely empty. This resulted in all kinds of mischief. For instance, /dev/console was not in the initrd, so in init, where /dev/console is used, I'd get an error that dev/console is read-only (the shell can't create the file dev/console for redirection) and it fails. I also lacked /dev/hd* and /dev/md*. (See the mknod sketch below.)

2) LVM isn't smart enough to ignore RAID1 components. This means that unless you filter out the underlying devices (/dev/hd*) it will grab the first partition that looks like an LVM component and then complain about duplicate UUIDs for the others. I didn't see a rational order to which components it scans, but it always saw the /dev/hd* ones before the /dev/md* ones. (See the filter sketch below.)

3) Creating the initrd with mkinitrd under 2.4.x resulted in trying to assemble the raid using devfs device names. You'd think this would work fine, since the device nodes are copied into the initrd too; for some reason it didn't for me. After that, mkinitrd under 2.6 would create an initrd without the raid at all, because lvm had bypassed the raid device and used the underlying components, so the initrd was built with no mdadm stuff. "Oops."

4) LVM seemed to be consistent in which "side" of the mirror it chose, so I booted into a CD that supported RAID+LVM and failed the drives that LVM had chosen. (See the mdadm sketch below.)

5) Continued trying to make LVM work. After much playing (I had the syntax wrong for the filter) I finally got it to ignore all but /dev/md?. Then I made a new initrd with mkinitrd, expanded it, manually added the raid stuff (see the assembly sketch below), and rebooted/edited a few times to fix stupid mistakes. Lvm suddenly switched which drives it was stealing for itself <sigh>, so my saved mkinitrd work area kept switching from one version to another, but I finally got a working initrd.

6) Added the failed drives back into the mirror and resynched.

So, what prompted all this? I needed to hook up an additional fax modem and purchased a USB<-->serial adapter (Prolific 2303 based). I could not get it to work on 2.4.x, so I thought I'd try it on 2.6. I ran out of space in root due to bloat in /lib/modules/*, so I resized the root LV, which was at 100%. /etc/lvm/* is on root (/), and so a backup of the VG could not be created. No biggie, I thought; I'd never had problems before. My UPS then decided to take a dump (it turned out to be a battery problem). I hooked up the new UPS (I already had the new one, just no scheduled downtime to hook it up) and tried to boot: unable to mount root. Booting from a recovery CD to inspect, XFS knows root is supposed to be 250M (the new size) but LVM shows it as only 150M (the old size).
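For point 1 above, a minimal sketch of recreating the handful of static nodes a non-devfs boot (and the initrd) needs. The major/minor numbers are the standard Linux ones; exactly which hd*/md* nodes you need depends on your layout, so treat the list as illustrative:

    # Recreate essential static device nodes in the real (underlying) /dev.
    mknod -m 600 /dev/console c 5 1
    mknod -m 666 /dev/null    c 1 3
    # IDE disks: /dev/hda is block 3,0; its partitions follow as 3,1 3,2 ...
    mknod /dev/hda  b 3 0
    mknod /dev/hda1 b 3 1
    # md arrays: block major 9, minor == array number.
    mknod /dev/md0 b 9 0
    mknod /dev/md1 b 9 1
    mknod /dev/md6 b 9 6
    mknod /dev/md7 b 9 7

Debian's MAKEDEV script (cd /dev && ./MAKEDEV generic) should populate the directory wholesale if you have it installed.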
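For points 2 and 5 above, the fix amounts to an lvm filter that accepts only the md devices and rejects everything else. Something along these lines in the devices section of /etc/lvm/lvm.conf (the regexes are a sketch; adjust for your device names):

    devices {
        # Scan only the md devices; reject everything else so lvm never
        # grabs the raw /dev/hd* halves of the mirrors.
        filter = [ "a|^/dev/md|", "r|.*|" ]
    }

With the filter in place, pvscan should report only the /dev/md* physical volumes, and the duplicate UUID warnings go away.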
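For points 3 and 5 above, what has to happen in the initrd before the volume group is activated boils down to something like the following, using plain (non-devfs) names. The mirror halves shown are guesses rather than my actual partition layout:

    # Assemble the arrays from their component partitions...
    mdadm --assemble /dev/md6 /dev/hda6 /dev/hdc6
    mdadm --assemble /dev/md7 /dev/hda7 /dev/hdc7
    # ...then activate the volume group on top of them.
    vgscan
    vgchange -a y shaktivg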
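For points 4 and 6 above, failing the halves lvm had latched onto and re-adding them later is ordinary mdadm housekeeping, roughly (component names are again guesses):

    # Kick out the half of each mirror that lvm grabbed directly...
    mdadm /dev/md6 --fail /dev/hda6 --remove /dev/hda6
    mdadm /dev/md7 --fail /dev/hda7 --remove /dev/hda7
    # ...and once the filter is fixed, add them back and let them resync.
    mdadm /dev/md6 --add /dev/hda6
    mdadm /dev/md7 --add /dev/hda7
    cat /proc/mdstat    # watch the resync progress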
Since the VG backup is in root, I couldn't restore the metadata. (I now keep another copy in /var and will probably also keep one in /boot, which has no LVM dependency; see the sketch below.) I had a level 0 backup of root by amanda from a day before, so I did a manual restore of that from a boot CD that supports xfs+lvm, then had to reinstall some of the packages I had put in for 2.6 support (udev, hotplug, etc.). The rest of the saga is above.

I'm not sure there is much the Debian packages can do to do a better job in my case. The two biggest problems I ran into were:

1) No real /dev when not running devfs, which caused all kinds of problems, especially for the initrd.

2) lvm stealing the underlying components of the RAID1 device. This is really scary because it 'works' but will result in corruption. I had to xfs_repair my filesystems -- luckily I made no real changes to the FSes while in this state, so it didn't cause me too many problems. It is fairly silent, though, and the system appears to work; only because I was having other problems did I notice it and spend the time/effort addressing it. The only real indication of a problem (other than symptoms down the road) was the duplicate UUID warnings from the lvm tools.
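Keeping those extra copies of the VG metadata is just vgcfgbackup pointed at explicit files, something along these lines (the paths are only a suggestion):

    # Keep VG metadata backups somewhere that doesn't live on the root LV.
    mkdir -p /var/backups/lvm
    vgcfgbackup -f /var/backups/lvm/shaktivg.vg shaktivg
    cp /var/backups/lvm/shaktivg.vg /boot/shaktivg.vg
    # Restoring from a rescue CD would then be:
    #   vgcfgrestore -f /boot/shaktivg.vg shaktivg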
-- 
-Rupa