Reports like this should always be sent to the mailing list as well as 
to me.  And please instruct your mail client to wrap lines after 72 
columns or so -- long lines are hard to deal with.

On Fri, 1 Feb 2013, Mark Ferrell wrote:

> Hello Alan,
> 
> I have been attempting to debug a hang with the OHCI layer when I ran across 
> your August conversation with Tomas Sokorai regarding similar behaviour, 
> though the behaviour did not exist for us in 2.6.27 on the same hardware.

What version of the kernel are you using?

> The hardware in question is an AMD Geode LX900 with 3 FTDI Dual-Port USB 
> Serial interfaces attached to the OHCI bus.  We have an application which 
> manages the serial ports and at some point the ports stop responding.  If we 
> attempt to close() the ports the application will hang and attempting to kill 
> -9 the application triggers a whole slew of issues within the kernel.  For 
> example,  since the USB device in question is a tty device the 'ps' goes into 
> D state while attempting to display the Applications process information.  I 
> presume their is secondary contention in tty_lock() at this point.

The problem probably is connected with the OHCI controller and not with 
the FTDI devices.  Once the problem occurs, it is too late to do any 
debugging.

> Like Tomas we noticed that the OHCI interface "feels" very slow.  We also 
> noticed a physical performance drop in the system as a whole when disabling 
> the EHCI interface on an Intel E660/620 and enabling the OHCI to run 
> comparison tests against the Geode to see if the Intel would also lock-up.
> 
> Unfortunately I do not have a reliable way to reproduce the bug outside of 
> running test-cases on multiple systems waiting for the ports to hang.  Any 
> advice/pointers/insight would be appreciated.

You should build a test kernel with CONFIG_USB_DEBUG enabled.  When the 
problem occurs, post the output from the "dmesg" command -- not the 
contents of the system log file.

>     Feb  1 18:15:40 A303152068 kernel: [ 8404.370974] INFO: task java:3677 
> blocked for more than 120 seconds.
>     Feb  1 18:15:40 A303152068 kernel: [ 8404.378724] "echo 0 > 
> /proc/sys/kernel/hung_task_timeout_secs" disables this message.
>     Feb  1 18:15:40 A303152068 kernel: [ 8404.381100] java            D 
> c103a90b     0  3677      1 0x00000004
>     Feb  1 18:15:40 A303152068 kernel: [ 8404.398733]  dd72fe54 00000096 
> dd72fe20 c103a90b c1444530 dd512d80 de2690c0 dd6e0100
>     Feb  1 18:15:40 A303152068 kernel: [ 8404.435612]  00000046 dd512d80 
> c103a90b dd72fe2c 00000046 c1444520 00000001 c1444520
>     Feb  1 18:15:40 A303152068 kernel: [ 8404.476415]  00000246 dd72fe2c 
> c1052801 dd72fe40 00000246 dd72fe70 dd72fe70 c1444520
>     Feb  1 18:15:40 A303152068 kernel: [ 8404.513915] Call Trace:
>     Feb  1 18:15:40 A303152068 kernel: [ 8404.515689]  [<c103a90b>] ? 
> prepare_to_wait+0x4f/0x56
>     Feb  1 18:15:40 A303152068 kernel: [ 8404.521019]  [<c103a90b>] ? 
> prepare_to_wait+0x4f/0x56
>     Feb  1 18:15:40 A303152068 kernel: [ 8404.526174]  [<c1052801>] ? 
> trace_hardirqs_on+0xb/0xd
>     Feb  1 18:15:40 A303152068 kernel: [ 8404.531238]  [<c103a90b>] ? 
> prepare_to_wait+0x4f/0x56
>     Feb  1 18:15:40 A303152068 kernel: [ 8404.536154]  [<c129677d>] 
> schedule+0x3b/0x3d
>     Feb  1 18:15:40 A303152068 kernel: [ 8404.540189]  [<c11c5421>] 
> usb_kill_urb+0x86/0x9f
>     Feb  1 18:15:40 A303152068 kernel: [ 8404.544126]  [<c103a741>] ? 
> wake_up_bit+0x1b/0x1b
>     Feb  1 18:15:40 A303152068 kernel: [ 8404.548990]  [<c11d4250>] 
> generic_cleanup+0x98/0xa4
>     Feb  1 18:15:40 A303152068 kernel: [ 8404.552549]  [<c11d42d5>] 
> usb_serial_generic_close+0x35/0x3a
>     Feb  1 18:15:40 A303152068 kernel: [ 8404.556484]  [<df11dc00>] 
> ftdi_close+0x3b/0x54 [ftdi_sio]
>     Feb  1 18:15:40 A303152068 kernel: [ 8404.561561]  [<c11d2133>] 
> serial_down+0x23/0x25
>     Feb  1 18:15:40 A303152068 kernel: [ 8404.564128]  [<c1177e76>] 
> tty_port_shutdown+0x3c/0x4a
>     Feb  1 18:15:40 A303152068 kernel: [ 8404.568926]  [<c11785d2>] 
> tty_port_close+0x19/0x37
>     Feb  1 18:15:40 A303152068 kernel: [ 8404.574519]  [<c11d297e>] 
> serial_close+0x44/0x4c
>     Feb  1 18:15:40 A303152068 kernel: [ 8404.579157]  [<c117265e>] 
> tty_release+0x98/0x2e9
>     Feb  1 18:15:40 A303152068 kernel: [ 8404.586850]  [<c109b7ab>] 
> __fput+0xdc/0x17a
>     Feb  1 18:15:40 A303152068 kernel: [ 8404.589187]  [<c109b954>] 
> fput+0x17/0x19
>     Feb  1 18:15:40 A303152068 kernel: [ 8404.592384]  [<c1098d7f>] 
> filp_close+0x57/0x61
>     Feb  1 18:15:40 A303152068 kernel: [ 8404.598120]  [<c1099e69>] 
> sys_close+0x6d/0xa8
>     Feb  1 18:15:41 A303152068 kernel: [ 8404.603033]  [<c12976e5>] 
> syscall_call+0x7/0xb
>     Feb  1 18:15:41 A303152068 kernel: [ 8404.608536]  [<c1290000>] ?  
> quirk_triton+0x7/0x25

This is useless.  We need to see what happens when the problem occurs, 
not 2 minutes later.

Alan Stern

--
To unsubscribe from this list: send the line "unsubscribe linux-usb" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Reply via email to