Your message dated Fri, 14 Dec 2007 20:26:43 +0100
with message-id <[EMAIL PROTECTED]>
and subject line Bug#454661: linux-image-2.6.22-3-686: locks up or crashes,
errors in tg3, psmouse and usb (2.6.22-2 worked fine)
has caused the attached Bug report to be marked as done.
This means that you claim that the problem has been dealt with.
If this is not the case it is now your responsibility to reopen the
Bug report if necessary, and/or fix the problem forthwith.
(NB: If you are a system administrator and have no idea what I am
talking about this indicates a serious mail system misconfiguration
somewhere. Please contact me immediately.)
Debian bug tracking system administrator
(administrator, Debian Bugs database)
--- Begin Message ---
Subject: linux-image-2.6.22-3-686: locks up or crashes, errors in tg3, psmouse
and usb (2.6.22-2 worked fine)
Package: linux-image-2.6.22-3-686
Version: 2.6.22-6
Severity: critical
Justification: breaks the whole system
Upgrading from kernel 2.6.22-2 to 2.6.22-3 caused my machine to become
very unstable, with random failures of the network interface (causing it
to disconnect from the network and then reconnect later), temporary or
permanent freezes of the display, and eventually some kernel panics.
These failures occurred at random intervals ranging from a few minutes to
about one hour in the best case.
This machine has been running fine for the last few years with most of
the 2.6 kernels that entered Debian testing: 2.6.12-1, 2.6.15-1,
2.6.16-2, 2.6.17-2, 2.6.18-3, 2.6.18-4, 2.6.21-2, 2.6.22-2. The
version 2.6.22-3 that entered testing yesterday is the first one that
causes such problems leading to sudden crashes and data loss.
At first I blamed the proprietary fglrx kernel module, which was
upgraded at the same time (proprietary modules are evil and always
suspicious), but I rebooted several times without the module and the
system kept on crashing. It was only after rebooting with the older
kernel 2.6.22-2 that the problems stopped. I could even rebuild and
reload the newer fglrx module for the old kernel, and everything is
still working fine. But as soon as I reboot with the 2.6.22-3 kernel,
the random freezes and crashes start again.
Here are some excerpts from /var/log/messages showing the problems that
occur with various devices, until there is eventually a kernel panic
(for which I have no message):
Dec 4 17:13:31 bora-bora -- MARK --
Dec 4 17:33:31 bora-bora -- MARK --
Dec 4 17:53:31 bora-bora -- MARK --
Dec 4 18:11:32 bora-bora kernel: tg3: eth0: Link is down.
Dec 4 18:11:35 bora-bora kernel: tg3: eth0: The system may be re-ordering
memory-mapped I/O cycles to the network device, attempting to recover. Please
report the problem to the driver maintainer and include system chipset
information.
Dec 4 18:11:36 bora-bora dhcdbd: dhco_parse_option_settings: bad option
setting: old_server_name =
Dec 4 18:11:38 bora-bora kernel: ADDRCONF(NETDEV_UP): eth0: link is not ready
Dec 4 18:11:40 bora-bora kernel: tg3: eth0: Link is up at 100 Mbps, full
duplex.
Dec 4 18:11:40 bora-bora kernel: tg3: eth0: Flow control is off for TX and off
for RX.
Dec 4 18:11:40 bora-bora kernel: ADDRCONF(NETDEV_CHANGE): eth0: link becomes
ready
[...]
Dec 4 18:53:10 bora-bora kernel: Clocksource tsc unstable (delta = 23435522509
ns)
Dec 4 18:53:10 bora-bora kernel: tg3: eth0: Link is down.
Dec 4 18:53:10 bora-bora kernel: psmouse.c: Wheel Mouse at
isa0060/serio2/input0 lost synchronization, throwing 3 bytes away.
Dec 4 18:53:10 bora-bora kernel: tg3: eth0: The system may be re-ordering
memory-mapped I/O cycles to the network device, attempting to recover. Please
report the problem to the driver maintainer and include system chipset
information.
[...]
Dec 5 18:05:30 bora-bora -- MARK --
Dec 5 18:25:30 bora-bora -- MARK --
Dec 5 18:37:18 bora-bora kernel: usb 2-2: USB disconnect, address 2
Dec 5 18:37:19 bora-bora kernel: usb 2-2: new low speed USB device using
uhci_hcd and address 3
Dec 5 18:37:19 bora-bora kernel: usb 2-2: configuration #1 chosen from 1 choice
Dec 5 18:37:19 bora-bora kernel: input: CHICONY Compaq USB Keyboard as
/class/input/input10
Dec 5 18:37:19 bora-bora kernel: input: USB HID v1.10 Keyboard [CHICONY Compaq
USB Keyboard] on usb-0000:00:1d.1-2
Dec 5 18:37:19 bora-bora kernel: input: CHICONY Compaq USB Keyboard as
/class/input/input11
Dec 5 18:37:19 bora-bora kernel: input,hiddev96: USB HID v1.10 Device [CHICONY
Compaq USB Keyboard] on usb-0000:00:1d.1-2
Dec 5 18:53:18 bora-bora kernel: tg3: eth0: Link is down.
Dec 5 18:53:18 bora-bora kernel: tg3: eth0: The system may be re-ordering
memory-mapped I/O cycles to the network device, attempting to recover. Please
report the problem to the driver maintainer and include system chipset
information.
Dec 5 18:53:19 bora-bora dhcdbd: dhco_parse_option_settings: bad option
setting: old_server_name =
Dec 5 18:53:22 bora-bora kernel: ADDRCONF(NETDEV_UP): eth0: link is not ready
Dec 5 18:53:23 bora-bora kernel: tg3: eth0: Link is up at 100 Mbps, full
duplex.
Dec 5 18:53:23 bora-bora kernel: tg3: eth0: Flow control is off for TX and off
for RX.
Dec 5 18:53:23 bora-bora kernel: ADDRCONF(NETDEV_CHANGE): eth0: link becomes
ready
[...]
Dec 5 19:31:22 bora-bora kernel: tg3: eth0: Link is down.
Dec 5 19:31:22 bora-bora kernel: tg3: eth0: The system may be re-ordering
memory-mapped I/O cycles to the network device, attempting to recover. Please
report the problem to the driver maintainer and include system chipset
information.
Dec 5 19:31:22 bora-bora dhcdbd: dhco_parse_option_settings: bad option
setting: old_server_name =
Dec 5 19:31:24 bora-bora kernel: ADDRCONF(NETDEV_UP): eth0: link is not ready
Dec 5 19:31:26 bora-bora kernel: tg3: eth0: Link is up at 100 Mbps, full
duplex.
Dec 5 19:31:26 bora-bora kernel: tg3: eth0: Flow control is off for TX and off
for RX.
Dec 5 19:31:26 bora-bora kernel: ADDRCONF(NETDEV_CHANGE): eth0: link becomes
ready
[...]
Note that in the examples above, the USB keyboard and the PS/2 mouse
were not disconnected, although the messages report that the keyboard
has been disconnected and almost immediately reconnected. The
messages shown here were usually followed by a kernel panic a few
minutes later.
I am now using the older kernel version 2.6.22-2 and everything works
fine again.
In case this additional information could be useful, this machine is a
HP/Compaq nc8000 laptop and lspci reports the following devices:
00:00.0 Host bridge: Intel Corporation 82855PM Processor to I/O Controller (rev
03)
00:01.0 PCI bridge: Intel Corporation 82855PM Processor to AGP Controller (rev
03)
00:1d.0 USB Controller: Intel Corporation 82801DB/DBL/DBM (ICH4/ICH4-L/ICH4-M)
USB UHCI Controller #1 (rev 03)
00:1d.1 USB Controller: Intel Corporation 82801DB/DBL/DBM (ICH4/ICH4-L/ICH4-M)
USB UHCI Controller #2 (rev 03)
00:1d.2 USB Controller: Intel Corporation 82801DB/DBL/DBM (ICH4/ICH4-L/ICH4-M)
USB UHCI Controller #3 (rev 03)
00:1d.7 USB Controller: Intel Corporation 82801DB/DBM (ICH4/ICH4-M) USB2 EHCI
Controller (rev 03)
00:1e.0 PCI bridge: Intel Corporation 82801 Mobile PCI Bridge (rev 83)
00:1f.0 ISA bridge: Intel Corporation 82801DBM (ICH4-M) LPC Interface Bridge
(rev 03)
00:1f.1 IDE interface: Intel Corporation 82801DBM (ICH4-M) IDE Controller (rev
03)
00:1f.3 SMBus: Intel Corporation 82801DB/DBL/DBM (ICH4/ICH4-L/ICH4-M) SMBus
Controller (rev 03)
00:1f.5 Multimedia audio controller: Intel Corporation 82801DB/DBL/DBM
(ICH4/ICH4-L/ICH4-M) AC'97 Audio Controller (rev 03)
00:1f.6 Modem: Intel Corporation 82801DB/DBL/DBM (ICH4/ICH4-L/ICH4-M) AC'97
Modem Controller (rev 03)
01:00.0 VGA compatible controller: ATI Technologies Inc RV350 [Mobility Radeon
9600 M10]
02:04.0 Ethernet controller: Atheros Communications, Inc. AR5212 802.11abg NIC
(rev 01)
02:06.0 CardBus bridge: O2 Micro, Inc. OZ711M3/MC3 4-in-1 MemoryCardBus
Controller
02:06.1 CardBus bridge: O2 Micro, Inc. OZ711M3/MC3 4-in-1 MemoryCardBus
Controller
02:06.2 System peripheral: O2 Micro, Inc. OZ711Mx 4-in-1 MemoryCardBus
Accelerator
02:06.3 CardBus bridge: O2 Micro, Inc. OZ711M3/MC3 4-in-1 MemoryCardBus
Controller
02:0d.0 FireWire (IEEE 1394): Texas Instruments TSB43AB22/A IEEE-1394a-2000
Controller (PHY/Link)
02:0e.0 Ethernet controller: Broadcom Corporation NetXtreme BCM5705M_2 Gigabit
Ethernet (rev 03)
-- Package-specific info:
-- System Information:
Debian Release: lenny/sid
APT prefers testing
APT policy: (650, 'testing'), (50, 'unstable')
Architecture: i386 (i686)
Kernel: Linux 2.6.22-2-686 (SMP w/1 CPU core)
Locale: LANG=en_US.UTF-8, LC_CTYPE=en_US.UTF-8 (charmap=UTF-8)
Shell: /bin/sh linked to /bin/bash
Versions of packages linux-image-2.6.22-3-686 depends on:
ii initramfs-tools [linux-initr 0.90a tools for generating an initramfs
ii module-init-tools 3.3-pre11-4 tools for managing Linux kernel mo
Versions of packages linux-image-2.6.22-3-686 recommends:
pn libc6-i686 <none> (no description available)
-- debconf information:
linux-image-2.6.22-3-686/postinst/bootloader-test-error-2.6.22-3-686:
shared/kernel-image/really-run-bootloader: true
linux-image-2.6.22-3-686/preinst/lilo-initrd-2.6.22-3-686: true
linux-image-2.6.22-3-686/prerm/would-invalidate-boot-loader-2.6.22-3-686: true
linux-image-2.6.22-3-686/preinst/overwriting-modules-2.6.22-3-686: true
linux-image-2.6.22-3-686/preinst/elilo-initrd-2.6.22-3-686: true
linux-image-2.6.22-3-686/postinst/kimage-is-a-directory:
linux-image-2.6.22-3-686/preinst/lilo-has-ramdisk:
linux-image-2.6.22-3-686/postinst/old-initrd-link-2.6.22-3-686: true
linux-image-2.6.22-3-686/preinst/initrd-2.6.22-3-686:
linux-image-2.6.22-3-686/postinst/old-system-map-link-2.6.22-3-686: true
linux-image-2.6.22-3-686/preinst/already-running-this-2.6.22-3-686:
linux-image-2.6.22-3-686/postinst/depmod-error-2.6.22-3-686: false
linux-image-2.6.22-3-686/postinst/bootloader-error-2.6.22-3-686:
linux-image-2.6.22-3-686/preinst/failed-to-move-modules-2.6.22-3-686:
linux-image-2.6.22-3-686/postinst/old-dir-initrd-link-2.6.22-3-686: true
linux-image-2.6.22-3-686/preinst/abort-install-2.6.22-3-686:
linux-image-2.6.22-3-686/preinst/abort-overwrite-2.6.22-3-686:
linux-image-2.6.22-3-686/postinst/depmod-error-initrd-2.6.22-3-686: false
linux-image-2.6.22-3-686/preinst/bootloader-initrd-2.6.22-3-686: true
linux-image-2.6.22-3-686/prerm/removing-running-kernel-2.6.22-3-686: true
linux-image-2.6.22-3-686/postinst/create-kimage-link-2.6.22-3-686: true
--- End Message ---
--- Begin Message ---
On Fri, 14 Dec 2007, Raphaël Quinet wrote:
> Conclusion: this is not a kernel problem. This is a hardware problem and it
> was apparently a pure coincidence that it occurred first just after upgrading
> my kernel. It was also just pure luck that it did not occur again while I
> reverted to the older version, but it came back when I tried another time with
> the newer kernels (2.6.22-3 or 2.6.23). I suspect that the hardware became
> sentient and was actively trying to confuse me. But now that I found a way
> to reproduce the problem at will with a light twist of the laptop's frame, I
> know that the hardware is the culprit.
>
thanks raphael for letting us know.
closing the bug report.
happy hacking :)
--
maks
--- End Message ---