Hi,

Alexander Kurtz wrote:
> On Wed, 2011-06-22 at 03:40 +0100, Ben Hutchings wrote:
>> The panic message shows there was an earlier kernel warning; please can
>> you provide that.
>
> Thanks to netconsole (a really great tool!) I was able to do so. The
> attached kernel log starts right before I plug the drive in.
> Surprisingly the kernel didn't crash the first time, but after trying
> again, everything went as expected (see lines 17 and 35).

Sorry for the long silence.  Let's see:

> [ 1421.182657] sd 7:0:0:0: [sdc] Attached SCSI disk
> [ 1454.865926] WARNING! power/level is deprecated; use power/control instead

Seems harmless enough.

> [ 1478.728383] sd 8:0:0:0: [sdc] Attached SCSI disk
> [ 1491.693027] BUG: unable to handle kernel NULL pointer dereference at
> 0000000000000048
> [ 1491.693229] IP: [<ffffffff8118b2e3>] elv_completed_request+0x38/0x47

The panic.

[...]
> [ 1491.696825] Code: 40 74 35 83 7e 44 01 74 04 a8 40 74 2b 83 e0 11 ff c8 0f
> 95 c0 83 e0 01 48 05 fc 00 00 00 ff 4c 87 04 f6 46 41 04 74 10 48 8b 02
> [ 1491.696825] 8b 40 48 48 85 c0 74 04 41 58 ff e0 59 c3 48 8d be 80 00 00
> [ 1491.696825] RIP [<ffffffff8118b2e3>] elv_completed_request+0x38/0x47

Disassembly, for convenience (following the hints from
Documentation/oops-tracing.txt):

 | <+0>:  rex je 0x6008b8 <str+56>
 | <+3>:  cmpl   $0x1,0x44(%rsi)
 | <+7>:  je     0x60088d <str+13>
 | <+9>:  test   $0x40,%al
 | <+11>: je     0x6008b8 <str+56>
 | <+13>: and    $0x11,%eax
 | <+16>: dec    %eax
 | <+18>: setne  %al
 | <+21>: and    $0x1,%eax
 | <+24>: add    $0xfc,%rax
 | <+30>: decl   0x4(%rdi,%rax,4)
 | <+34>: testb  $0x4,0x41(%rsi)
 | <+38>: je     0x6008b8 <str+56>
 | <+40>: mov    (%rdx),%rax
 | <+43>: cmp    %ah,0x40(%rdx)
 | <+46>: rex.W
 | <+47>: test   %rax,%rax
 | <+50>: je     0x6008b8 <str+56>
 | <+52>: pop    %r8
 | <+54>: jmpq   *%rax
 | <+56>: pop    %rcx
 | <+57>: retq
 | <+58>: lea    0x80(%rsi),%rdi

So offset 0x38 is the jump in

	if ((rq->cmd_flags & REQ_SORTED) &&

As for why that involves an access to the address 0x48: well, that is
beyond my depth.
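For anyone who wants to redo the disassembly above, here is a minimal sketch of one way to get the "Code:" bytes into a form a disassembler can read. The bytes are copied in order from the quoted log (63 of them as they appear here); no byte is bracketed in the quoted lines, so the sketch makes no claim about which instruction faulted. The array name str only mirrors the <str+N> labels in the listing, and dump_code_bytes is a made-up helper name, not something from the kernel tree.

```c
#include <assert.h>
#include <stdio.h>

/* Bytes from the "Code:" lines of the quoted oops, in the order shown. */
static const unsigned char str[] = {
	/* first Code line */
	0x40, 0x74, 0x35, 0x83, 0x7e, 0x44, 0x01, 0x74, 0x04, 0xa8,
	0x40, 0x74, 0x2b, 0x83, 0xe0, 0x11, 0xff, 0xc8, 0x0f,
	/* second Code line */
	0x95, 0xc0, 0x83, 0xe0, 0x01, 0x48, 0x05, 0xfc, 0x00, 0x00,
	0x00, 0xff, 0x4c, 0x87, 0x04, 0xf6, 0x46, 0x41, 0x04, 0x74,
	0x10, 0x48, 0x8b, 0x02,
	/* third Code line */
	0x8b, 0x40, 0x48, 0x48, 0x85, 0xc0, 0x74, 0x04, 0x41, 0x58,
	0xff, 0xe0, 0x59, 0xc3, 0x48, 0x8d, 0xbe, 0x80, 0x00, 0x00,
};

/*
 * Hypothetical helper: write the raw bytes to an already opened stream.
 * Returns the number of bytes written.
 */
static size_t dump_code_bytes(FILE *out)
{
	assert(out != NULL);
	return fwrite(str, 1, sizeof(str), out);
}
```

Calling dump_code_bytes() on a file opened with fopen("code.bin", "wb") gives a raw blob that can be fed to something like "objdump -D -b binary -m i386:x86-64 code.bin". The offsets printed then count from the start of the dump, not from the start of elv_completed_request.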
rq->cmd_flags was already accessed in the check

	if (blk_account_rq(rq))

Maybe the actual cause of the fault is some different instruction and
the instruction pointer is not to be trusted (?).

I suppose if I were in this situation, I'd sprinkle
block/elevator.c::elv_completed_request with printk calls to be able to
witness exactly what happens.

Sorry for the trouble, and hope that helps.
Jonathan
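P.S. A rough sketch of the printk idea, written as a userspace mock-up rather than a kernel patch so that it compiles on its own. Only the final condition is taken from the snippet quoted above; the trimmed struct layouts, the REQ_SORTED placeholder value, the early returns and the name elv_completed_request_traced are all assumptions made for illustration. In a real kernel one would just add the printk lines to the existing function and read the output over netconsole.

```c
#include <assert.h>
#include <stdio.h>

/* Userspace stand-in for printk so the sketch builds outside the kernel. */
#define printk(...) fprintf(stderr, __VA_ARGS__)

/* Hypothetical, heavily trimmed stand-ins for the kernel structures. */
struct request_queue;
struct request;

struct elevator_ops {
	void (*elevator_completed_req_fn)(struct request_queue *,
					  struct request *);
};

struct elevator_queue {
	struct elevator_ops *ops;
};

struct request_queue {
	struct elevator_queue *elevator;
};

struct request {
	unsigned int cmd_flags;
};

#define REQ_SORTED 0x1u	/* placeholder bit, not the kernel's value */

/*
 * Print every pointer before it is dereferenced, so the last line in the
 * log shows how far execution got.  Returns the step reached:
 * 1 = no elevator, 2 = no ops table, 3 = hook not called, 4 = hook called.
 */
static int elv_completed_request_traced(struct request_queue *q,
					struct request *rq)
{
	struct elevator_queue *e;

	assert(q != NULL && rq != NULL);
	e = q->elevator;

	printk("elv: q=%p rq=%p e=%p\n", (void *)q, (void *)rq, (void *)e);
	if (!e)
		return 1;

	printk("elv: e->ops=%p cmd_flags=%#x\n", (void *)e->ops, rq->cmd_flags);
	if (!e->ops)
		return 2;

	if ((rq->cmd_flags & REQ_SORTED) &&
	    e->ops->elevator_completed_req_fn) {
		printk("elv: calling elevator_completed_req_fn\n");
		e->ops->elevator_completed_req_fn(q, rq);
		return 4;
	}
	return 3;
}
```

The early returns exist only so the mock-up does not crash when a pointer is NULL. In the kernel the point would be the opposite: let it crash and see which message was printed last.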