Hi,
Another zinger that I am sending to LKML because I don't know where else to send it ...
I've discovered a deadly combination of kernel & lilo (and raid). This may be a pure
lilo bug, but I assume that the kernel+raid aids & abets the problem...:
I am running kernel-2.4.x. There are two IDE hard drives, with partitions 1, 5, 6, 7
and 8 in use. The partitions on the two drives are mirrored using RAID-1 to create
/dev/md1, /dev/md5, /dev/md6, etc. The root fs is on /dev/md1. Thus, lilo.conf looks
like:
boot=/dev/md1
map=/boot/map
install=/boot/boot.b
prompt
timeout=50
linear
default=linux
image=/boot/vmlinuz-2.4.2
        label=linux
        read-only
        root=/dev/md1
For nearly a year, this combo has worked just fine (running 2.3.99 back then).
Just fine, that is, using the redhat-6.2 rpm for LILO, i.e. version
lilo-0.21-15.i386.rpm which reports itself to be:
% /sbin/lilo -V
LILO version 21
Recently, this machine went over to debian-unstable from redhat:
% dpkg -s lilo
Package: lilo
Status: install ok installed
Priority: important
Section: base
Installed-Size: 271
Maintainer: Russell Coker <[EMAIL PROTECTED]>
Version: 1:21.7-3
Depends: libc6 (>= 2.2.1-2), debconf (>= 0.2.26), logrotate
The debian version of lilo writes a boot sector that hangs hard for the above
kernel+raid+lilo.conf configuration. Specifically, after a reboot it prints
LIL- and stops. Needless to say, recovery was painful. But I was able to verify
that the redhat lilo rpm always functioned correctly, and the debian-unstable
package always hung in this way. At one point, during my twisting & turning, I
got the debian lilo to get only as far as LI before hanging. I have no idea what
I did differently to get that as opposed to LIL-.
BTW, I noticed that, oddly, every time I ran lilo and then ran lilo -q -v -v, it
reported different sector numbers for the kernel images. This freaked me out at
first, but I came to accept it as normal: it doesn't affect bootability. But is
this really working as designed? (I was assuming, apparently erroneously, that
lilo -q -v -v was reporting the physical location of the kernel image on the
disk; but since the numbers bounce around, that can't be right. Or is this just
weird BIOS head/cyl/sect math flakiness?)
--linas