On Thu, Apr 28, 2011 at 11:00 AM, Jason Herring <jaherr...@usa.net> wrote:
> More fun. I NFS mounted the VM from a Linux box and went to start the VM -
> this also panicked the *opensolaris* kernel! Maybe this is something to do
> with the network stack? Though I see a lot of references to sata/scsi and
> si3124 (sata controller card in machine) in the panic:
>
> Apr 28 08:52:54 atlantis unix: [ID 836849 kern.notice]
> Apr 28 08:52:54 atlantis ^Mpanic[cpu0]/thread=ffffff00064cbc60:
> Apr 28 08:52:54 atlantis genunix: [ID 103648 kern.notice] recursive mutex_enter, lp=ffffff019e1fc1e8 owner=ffffff00064cbc60 thread=ffffff00064cbc60
> Apr 28 08:52:54 atlantis unix: [ID 100000 kern.notice]
> Apr 28 08:52:54 atlantis genunix: [ID 655072 kern.notice] ffffff00064cb560 unix:mutex_panic+73 ()
> Apr 28 08:52:54 atlantis genunix: [ID 655072 kern.notice] ffffff00064cb5c0 unix:mutex_vector_enter+190 ()
> Apr 28 08:52:54 atlantis genunix: [ID 655072 kern.notice] ffffff00064cb690 si3124:si_mop_commands+6e ()
> Apr 28 08:52:54 atlantis genunix: [ID 655072 kern.notice] ffffff00064cb700 si3124:si_reject_all_reset_pkts+7c ()
> Apr 28 08:52:54 atlantis genunix: [ID 655072 kern.notice] ffffff00064cb760 si3124:si_tran_reset_dport+9b ()
> Apr 28 08:52:54 atlantis genunix: [ID 655072 kern.notice] ffffff00064cb7d0 sata:sata_scsi_reset+ab ()
> Apr 28 08:52:54 atlantis genunix: [ID 655072 kern.notice] ffffff00064cb800 scsi:scsi_reset+52 ()
> Apr 28 08:52:54 atlantis genunix: [ID 655072 kern.notice] ffffff00064cb870 sd:sd_sense_key_medium_or_hardware_error+fb ()
> Apr 28 08:52:54 atlantis genunix: [ID 655072 kern.notice] ffffff00064cb8d0 sd:sd_decode_sense+e5 ()
> Apr 28 08:52:54 atlantis genunix: [ID 655072 kern.notice] ffffff00064cb930 sd:sd_handle_auto_request_sense+100 ()
> Apr 28 08:52:54 atlantis genunix: [ID 655072 kern.notice] ffffff00064cb980 sd:sdintr+145 ()
> Apr 28 08:52:54 atlantis genunix: [ID 655072 kern.notice] ffffff00064cb9b0 scsi:scsi_hba_pkt_comp+15c ()
> Apr 28 08:52:54 atlantis genunix: [ID 655072 kern.notice] ffffff00064cba00 sata:sata_txlt_rw_completion+1d3 ()
> Apr 28 08:52:54 atlantis genunix: [ID 655072 kern.notice] ffffff00064cbad0 si3124:si_mop_commands+401 ()
> Apr 28 08:52:54 atlantis genunix: [ID 655072 kern.notice] ffffff00064cbb40 si3124:si_intr_command_error+f7 ()
> Apr 28 08:52:54 atlantis genunix: [ID 655072 kern.notice] ffffff00064cbbb0 si3124:si_intr+227 ()
> Apr 28 08:52:54 atlantis genunix: [ID 655072 kern.notice] ffffff00064cbc00 unix:av_dispatch_autovect+7c ()
> Apr 28 08:52:54 atlantis genunix: [ID 655072 kern.notice] ffffff00064cbc40 unix:dispatch_hardint+33 ()
> Apr 28 08:52:54 atlantis genunix: [ID 655072 kern.notice] ffffff0006405aa0 unix:switch_sp_and_call+13 ()
> Apr 28 08:52:54 atlantis unix: [ID 100000 kern.notice]
> Apr 28 08:52:54 atlantis genunix: [ID 672855 kern.notice] syncing file systems...
> Apr 28 08:52:54 atlantis genunix: [ID 904073 kern.notice] done
> Apr 28 08:52:55 atlantis genunix: [ID 111219 kern.notice] dumping to /dev/zvol/dsk/rpool/dump, offset 65536, content: kernel
> Apr 28 08:53:03 atlantis genunix: [ID 100000 kern.notice]
> Apr 28 08:53:03 atlantis genunix: [ID 665016 kern.notice] ^M100% done: 129259 pages dumped,
> Apr 28 08:53:03 atlantis genunix: [ID 851671 kern.notice] dump succeeded
>
> I need to get this VM up and running - any thoughts? I might have to go to
> Linux for this server if I can't get this figured out and I'd rather not do
> that.

Based on the stack trace, it appears the si3124 driver is encountering an error from the HBA and handling it incorrectly: the interrupt code locks something, then calls code that tries to lock the same thing again, causing the panic. This is probably one of 6358757, 6957964, or 6959541 (since bugs.opensolaris.org is down, I cannot tell from the synopsis which one it is). b145 or later appears to fix the issue (based on inspecting the source for the driver). I don't know of any immediate workarounds.
You could try booting the latest OpenIndiana ISO, or whatever Oracle is calling the latest Solaris 11 preview, to get at your data, as they should have fixed drivers. If really desperate, you _might_ try booting one of them and copying its si3124 driver over, but there is no guarantee it would work (sometimes you get lucky, sometimes you don't). I would suggest keeping a copy of the original driver handy.
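
For anyone unfamiliar with the failure mode, here is a toy Python sketch of what a "recursive mutex_enter" amounts to. It is not the si3124 code: ToyPort, OwnedMutex and the method names are made up for illustration, and I am assuming (from the two si_mop_commands frames in the trace) that the lock is still held when the completion callback runs. The point is only that a thread which already owns a non-recursive lock walks back into code that takes it again.

```python
import threading


class RecursiveEnterError(RuntimeError):
    """Raised at the point where a kernel mutex would panic instead."""


class OwnedMutex:
    """Non-recursive lock that remembers its owner (hypothetical helper)."""

    def __init__(self):
        self._lock = threading.Lock()
        self._owner = None

    def enter(self):
        me = threading.get_ident()
        if self._owner == me:
            # The calling thread already holds the lock; blocking would hang forever.
            raise RecursiveEnterError("recursive mutex_enter")
        self._lock.acquire()
        self._owner = me

    def exit(self):
        self._owner = None
        self._lock.release()


class ToyPort:
    """Stand-in for a controller port; all names here are illustrative."""

    def __init__(self):
        self.mutex = OwnedMutex()
        self.calls = []

    def mop_commands(self, on_complete):
        self.calls.append("mop_commands")
        self.mutex.enter()
        try:
            # Assumption: the completion callback runs with the lock still held.
            on_complete()
        finally:
            self.mutex.exit()

    def reset(self):
        # The upper layer reacts to the error by asking for a reset,
        # which lands back in mop_commands on the same thread.
        self.calls.append("reset")
        self.mop_commands(lambda: None)


def simulate_error_interrupt():
    port = ToyPort()
    try:
        port.mop_commands(on_complete=port.reset)
    except RecursiveEnterError as exc:
        return port.calls, str(exc)
    return port.calls, None
```

Calling simulate_error_interrupt() walks the same shape as the trace: mop_commands, then reset, then mop_commands again, at which point the second enter is refused. A common general remedy for this pattern is to drop the lock before invoking completion callbacks; I am not claiming that is what the b145 driver does.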
_______________________________________________ opensolaris-code mailing list opensolaris-code@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/opensolaris-code