Hi All,

I've been having a nightmare with a box that refuses to boot after a kernel update. Below is a transcript of what I've tried.

- Server is a Dell PowerEdge 2800 with 6 GB RAM and an LSI Logic PERC 4e/DC RAID controller. After a kernel update the server fails to boot. Messages on the screen state that the kernel and initrd are loading, then "Booting Kernel" and the following:

    Kernel direct mapping tables 100000000 @ 8000-1100

  It then hangs.

- Selecting the old vmlinuz-2.6.18-4-amd64 kernel exhibits the same problem.

- Booted the server with a Gentoo LiveCD (kernel 2.6.15) and ran e2fsck on all partitions. I then mounted the filesystems and chrooted into the Debian install to access the system, like so:

    modprobe dm-mod
    vgscan
    vgchange -a y
    mount /dev/sda3 /mnt/gentoo
    cd !$
    mount /dev/sda1 boot/
    for x in usr home var ;do mount /dev/mapper/vg1-$x $x;done
    mount -t proc proc proc/
    mount -o bind /dev dev/
    chroot . /bin/bash

- With this in place I:
    Purged the existing kernels 2.6.18-4-amd64 and 2.6.18-5-amd64
    Installed 2.6.18-5-amd64 again (which forced an initrd image regeneration)
    Reinstalled grub into the boot block:

    # grub
    > root (hd0,0)
    > setup (hd0)
    > quit

- Rebooted; now the grub menu is not displayed. Not with a monitor attached either.

- Rebooted into the Gentoo disk and removed the serial console settings from /boot/grub/menu.lst and /etc/inittab. Rebooted, and the grub menu still does not display, but a message is shown. A stream of characters flies by on the screen as if they are trying to be drawn. The final message is:

    (3-4 non-printable chars)Redirect console code

- Reinstalled grub (the package) and reinstalled the boot block. Rebooted, same message.

- Rebooted into the Gentoo disk. Chrooted into Debian and created a grub floppy with:

    grub-floppy /dev/fd0

- Rebooted with the floppy and got a grub menu. Manually typed the kernel boot params into grub:

    root (hd0,0)
    kernel (hd0,0)/vmlinuz ro root=/dev/sda3 console=/dev/tty0
    initrd (hd0,0)/initrd.gz
    boot

  Got the same message as for the original problem. Also tried with noapic, nolapic, acpi=off and memmap=exactmap [EMAIL PROTECTED] Same error.
- Thinking it might be a problem with initrd generation, I built a custom kernel which has no need for an initrd (i.e. essential modules built in statically):

    aptitude install linux-source-2.6.18 kernel-package ncurses-dev fakeroot
    cd /usr/src
    tar xvjf linux-source-2.6.18.tar.bz2
    ln -sfn linux-source-2.6.18 linux
    cp /boot/config linux/.config
    cd linux
    make menuconfig    # set megaraid_mm|mbox & dm-mod as static
    make-kpkg clean
    date=`date '+%Y%m%d%H%M'`
    fakeroot make-kpkg --revision=buildtime${date}.Custom kernel-image
    cd ..
    dpkg -i linux-image-2.6.18-1_buildtime${date}.Custom_amd64.deb
    reboot

  Same error! This took bloody ages too!

- Going on a hunch with Alex Butcher, we flashed the BIOS to version A06. Same error.

- As a last-ditch attempt to get services back up, I ran the following command before the chroot above:

    mount -t devpts devpts dev/pts

  then chrooted in and started up all services:

    cd /etc/rc2.d
    for x in S* ;do ./$x start ;done

- Will attempt to carry on with this....

My feeling is that there is a problem with either the RAID or the filesystem on /boot. I'd like to put the kernel images on a USB memory stick and try to boot from that. That should confirm whether there is a disk-level problem. If there is, we will need to migrate mail onto another box, destroy and rebuild the array, and attempt to reinstall a fresh copy.

It might also be worth copying the kernels (system image) to another box to see if they boot elsewhere.

Any ideas are greatly appreciated.

Matt
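
For the USB-stick test mentioned above, here is a minimal sketch of copying the kernel and initrd to another medium and checking each copy byte for byte. It is only an illustration: the function name copy_and_verify is made up, and the mount point and file names in the usage line are assumptions, not paths taken from this box.

```shell
#!/bin/sh
# Sketch: copy boot files to another medium and verify each copy with cmp.
# copy_and_verify is a hypothetical helper name, not an existing tool.

copy_and_verify() {
    dest=$1
    shift
    rc=0
    for f in "$@"; do
        target="$dest/$(basename "$f")"
        # cp fails if the source cannot be read (e.g. missing file or I/O error);
        # cmp -s then confirms the copy matches the source byte for byte.
        if cp "$f" "$target" && cmp -s "$f" "$target"; then
            echo "OK   $f"
        else
            echo "FAIL $f"
            rc=1
        fi
    done
    return $rc
}
```

Possible usage, assuming the stick is mounted at /mnt/usb:

    copy_and_verify /mnt/usb /boot/vmlinuz-2.6.18-5-amd64 /boot/initrd.img-2.6.18-5-amd64

A clean result only shows that the files can be read back consistently. It would not rule out files that were written wrongly in the first place, so comparing a checksum against a known-good copy of the same kernel on another box is probably still worth doing.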