Greg --
There is a problem in usbserial.c in serial_write (and
possibly other usbserial functions) that causes an
oops--Scheduling in interrupt.
I think this is the same problem we talked about a few
months ago, but now I see the cause of the oops. It is
not in the usb serial drivers themselves, but in usbserial.c.
Serial_write gets the port->sem semaphore, but serial_write
can be called in interrupts and can't sleep. If that semaphore
is held somewhere else and an serial_write gets called in
an interrupt--oops!
I am seeing this problem with the io_ti.c driver that I am
working on right now. I can reliably reproduce it by using
a loopback, setting "stty echo" on the port, and catting
a big file to the port.
io_ti.c and several other usb serial drivers set
tty->low_latency, which means that the serial subsystem
will be called from the usb read callbacks in interrupt
time. Combine this with the stty echo option that writes
each character back out as it comes in, and serial_write
is constantly called in interrupt time. If there is any
other usbserial activity on that port (like writing to
it directly as in my test), it is easy to get the oops.
The attached oops is just an example. It happens to be
in the 2.4.19 kernel, but the usbserial and io_ti.c have
been backported from 2.4.23. The stack backtrace seems
to show the problem--in serial write (presumably holding
the semaphore), get an interrupt, process an urb, call
tty_flip_buffer_push/flush_to_ldisc/n_tty_receive_buf/opost
which does a write through serial_write which tries to
get the semaphore again.
We simply can't be using semaphores in interrupt code
like serial_write. We need to look at how usbserial.c
can better handle synchronization.
-- Al
ksymoops 2.4.9 on i686 2.4.19. Options used
-V (default)
-k /proc/ksyms (default)
-l /proc/modules (default)
-o /lib/modules/2.4.19/ (default)
-m /usr/src/linux/System.map (default)
Warning: You did not tell me where to find symbol information. I will
assume that the log matches the kernel and modules that are running
right now and I'll use the default options above for symbol resolution.
If the current kernel and/or modules do not match the log, you can get
more accurate output by telling me the kernel version and where to find
map, modules, ksyms etc. ksymoops -h explains the options.
kernel BUG at sched.c:566!
invalid operand: 0000
CPU: 0
EIP: 0010:[<c0115113>] Not tainted
Using defaults from ksymoops -t elf32-i386 -a i386
EFLAGS: 00010286
eax: 00000018 ebx: c4e42878 ecx: 00000006 edx: 00000000
esi: d0aec000 edi: d0aec000 ebp: d0aedc18 esp: d0aedbf4
ds: 0018 es: 0018 ss: 0018
Process cat (pid: 5501, stackpage=d0aed000)
Stack: c02067ea 00000082 d0aec000 00000000 c764f720 c77fe000 c4e42878 d0aec000
ffffffea d0aedc38 c01078e4 00000001 d0aec000 c4e42884 c4e42884 c4e4281c
c4e42800 d0aedc4c c0107a47 c4e42878 c4e42800 ffffffed d0aedc60 d887a1f6
Call Trace: [<c01078e4>] [<c0107a47>] [<d887a1f6>] [<c016aec7>] [<c016b9c7>]
[<c0120139>] [<d8861574>] [<d88616a3>] [<c016a843>] [<d88c68ab>] [<d88637cc>]
[<d88638b9>] [<d88639cc>] [<c0109f7d>] [<c010a10b>] [<c010c5b8>] [<d8870018>]
[<d8878575>] [<c016aadb>] [<c016b05e>] [<c016ce21>] [<c0168d17>] [<c016cce0>]
[<c0135338>] [<c0108b9b>]
Code: 0f 0b 36 02 e2 67 20 c0 58 8b 45 e8 c1 e0 05 05 40 8b 27 c0
>>EIP; c0115113 <schedule+53/2f0> <=====
>>ebx; c4e42878 <_end+4b8c80c/18598f94>
>>esi; d0aec000 <_end+10835f94/18598f94>
>>edi; d0aec000 <_end+10835f94/18598f94>
>>ebp; d0aedc18 <_end+10837bac/18598f94>
>>esp; d0aedbf4 <_end+10837b88/18598f94>
Trace; c01078e4 <__down+54/a0>
Trace; c0107a47 <__down_failed+b/14>
Trace; d887a1f6 <[usbserial].text.lock.usbserial+41/ea>
Trace; c016aec7 <opost+17/1c0>
Trace; c016b9c7 <n_tty_receive_buf+357/c80>
Trace; c0120139 <send_signal+29/100>
Trace; d8861574 <[usb-uhci]uhci_submit_bulk_urb+3f4/410>
Trace; d88616a3 <[usb-uhci]uhci_clean_transfer+83/190>
Trace; c016a843 <flush_to_ldisc+e3/f0>
Trace; d88c68ab <[ext3].data.end+2582c/33f81>
Trace; d88637cc <[usb-uhci]process_urb+dc/200>
Trace; d88638b9 <[usb-uhci]process_urb+1c9/200>
Trace; d88639cc <[usb-uhci]uhci_interrupt+dc/140>
Trace; c0109f7d <handle_IRQ_event+3d/70>
Trace; c010a10b <do_IRQ+7b/c0>
Trace; c010c5b8 <call_do_IRQ+5/d>
Trace; d8870018 <[jbd].rodata.end+26b9/2911>
Trace; d8878575 <[usbserial]serial_write+75/110>
Trace; c016aadb <tty_default_put_char+1b/20>
Trace; c016b05e <opost+1ae/1c0>
Trace; c016ce21 <write_chan+141/1f0>
Trace; c0168d17 <tty_write+187/200>
Trace; c016cce0 <write_chan+0/1f0>
Trace; c0135338 <sys_write+98/f0>
Trace; c0108b9b <system_call+33/38>
Code; c0115113 <schedule+53/2f0>
00000000 <_EIP>:
Code; c0115113 <schedule+53/2f0> <=====
0: 0f 0b ud2a <=====
Code; c0115115 <schedule+55/2f0>
2: 36 ss
Code; c0115116 <schedule+56/2f0>
3: 02 e2 add %dl,%ah
Code; c0115118 <schedule+58/2f0>
5: 67 20 c0 addr16 and %al,%al
Code; c011511b <schedule+5b/2f0>
8: 58 pop %eax
Code; c011511c <schedule+5c/2f0>
9: 8b 45 e8 mov 0xffffffe8(%ebp),%eax
Code; c011511f <schedule+5f/2f0>
c: c1 e0 05 shl $0x5,%eax
Code; c0115122 <schedule+62/2f0>
f: 05 40 8b 27 c0 add $0xc0278b40,%eax
<0>Kernel panic: Aiee, killing interrupt handler!
1 warning issued. Results may not be reliable.