On Tue, Oct 6, 2009 at 1:26 AM, Ben Hutchings <b...@decadent.org.uk> wrote:
> On Mon, 2009-10-05 at 14:05 +0200, Jens-Michael Hoffmann wrote:
> > On Monday, 5. October 2009 01:29:39 Ben Hutchings wrote:
> > > On Mon, 2009-10-05 at 00:15 +0100, Antonio Marcos López Alonso wrote:
> > > > > Is this a new problem or did it occur with earlier kernel versions?
> > > >
> > > > No, it happened also in previous versions. But at least irqpoll/irqfixup
> > > > worked pretty well. Now this behavior seems to get worsened even using
> > > > these kernel options.
> > > >
> > > > > Can you try to reproduce this without the nvidia or virtualbox modules
> > > > > loaded?
> > > >
> > > > I can but just to make things faster:
> > > >
> > > > Jens-Michael,
> > > >
> > > > Have you got any nvidia/virtualbox modules running in your host? Just to
> > > > discard...
> > >
> > > The warning message shows all loaded modules, and those aren't included,
> > > so this question is answered.
> > >
> > > I had a look at the code and the values in the 'transmit timed out'
> > > message, and it seems that the NIC has reported a transmit completion
> > > but this hasn't been handled. Perhaps another device sharing its IRQ is
> > > misbehaving and causing the IRQ to be disabled. Please can you send
> > > more of the kernel log from before the TX watchdog warning? Also, if
> > > this happens again, please send the contents of /proc/interrupts.
> >
> > /proc/interrupts:
> >            CPU0       CPU1
> >   0:         42          0   IO-APIC-edge      timer
> >   1:          0         82   IO-APIC-edge      i8042
> >   8:          0          0   IO-APIC-edge      rtc0
> >   9:          0          0   IO-APIC-fasteoi   acpi
> >  14:          0        109   IO-APIC-edge      ide0
> >  17:          5        581   IO-APIC-fasteoi   firewire_ohci
> >  18:     350432   19112694   IO-APIC-fasteoi   eth1
> >  20:       6365     143067   IO-APIC-fasteoi   sata_via
> >  21:          0          0   IO-APIC-fasteoi   uhci_hcd:usb1, ehci_hcd:usb2, uhci_hcd:usb3, uhci_hcd:usb4, uhci_hcd:usb5
> >  23:     150373    6348653   IO-APIC-fasteoi   eth2
> [...]
>
> OK, that seems to rule out my first hypothesis.
>
> Could you try adding 'noapic' to the kernel command line?

Sure.

Here is /proc/interrupts using irqpoll + noapic (though not enough time has
passed yet to reproduce the failure):

           CPU0
  0:         32   XT-PIC-XT   timer
  1:       1084   XT-PIC-XT   i8042
  2:          0   XT-PIC-XT   cascade
  3:          1   XT-PIC-XT
  4:      23569   XT-PIC-XT   ehci_hcd:usb1, uhci_hcd:usb8
  5:      63936   XT-PIC-XT   ehci_hcd:usb2, uhci_hcd:usb4, uhci_hcd:usb5, uhci_hcd:usb7, eth0
  6:          5   XT-PIC-XT   floppy
  7:        412   XT-PIC-XT   parport0
  8:          0   XT-PIC-XT   rtc0
  9:          0   XT-PIC-XT   acpi
 10:        409   XT-PIC-XT   nvidia
 11:      30797   XT-PIC-XT   uhci_hcd:usb3, uhci_hcd:usb6, sata_via, HDA Intel
 12:      15090   XT-PIC-XT   i8042
 14:       3043   XT-PIC-XT   ide0
 15:          0   XT-PIC-XT   ide1
NMI:          0   Non-maskable interrupts
LOC:     117311   Local timer interrupts
SPU:          0   Spurious interrupts
RES:          0   Rescheduling interrupts
CAL:          0   Function call interrupts
TLB:          0   TLB shootdowns
TRM:          0   Thermal event interrupts
THR:          0   Threshold APIC interrupts
ERR:          1
MIS:          0

Antonio
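
A shared line such as IRQ 5 in the dump above (eth0 together with four USB
controllers) is easy to miss by eye. Below is a minimal sketch that lists every
numbered IRQ with more than one handler. It assumes the layout seen in this
thread: one count column per CPU, then a single-token controller name (for
example XT-PIC-XT or IO-APIC-fasteoi), then a comma-separated device list.
Other kernels may print the controller column differently. The function name
shared_irqs and the SAMPLE excerpt are illustrative only.

```python
# Excerpt of the noapic dump from this thread, used as example input.
SAMPLE = """\
           CPU0
  0:         32   XT-PIC-XT   timer
  3:          1   XT-PIC-XT
  5:      63936   XT-PIC-XT   ehci_hcd:usb2, uhci_hcd:usb4, uhci_hcd:usb5, uhci_hcd:usb7, eth0
 10:        409   XT-PIC-XT   nvidia
 11:      30797   XT-PIC-XT   uhci_hcd:usb3, uhci_hcd:usb6, sata_via, HDA Intel
NMI:          0   Non-maskable interrupts
"""


def shared_irqs(text):
    """Return {irq_number: [devices]} for IRQs that have more than one device.

    Assumption: each numbered row is "<irq>: <count per CPU> <controller> <devices>",
    with the controller name being exactly one whitespace-free token.
    """
    lines = text.strip("\n").splitlines()
    # The header row names the CPUs, which tells us how many count columns follow.
    ncpu = sum(1 for tok in lines[0].split() if tok.startswith("CPU"))
    shared = {}
    for line in lines[1:]:
        label, sep, rest = line.partition(":")
        label = label.strip()
        if not sep or not label.isdigit():
            continue  # skip NMI, LOC, ERR and other non-numbered rows
        fields = rest.split()
        # Drop the per-CPU counts and the controller token; the rest is devices.
        device_text = " ".join(fields[ncpu + 1:])
        devices = [d.strip() for d in device_text.split(",") if d.strip()]
        if len(devices) > 1:
            shared[int(label)] = devices
    return shared


if __name__ == "__main__":
    for irq, devices in sorted(shared_irqs(SAMPLE).items()):
        print(irq, devices)
```

On a live system the same function could be given the contents of
/proc/interrupts read as text instead of SAMPLE.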