>On Tue, Aug 02, 2016 at 04:18:30AM +0000, Xulei (Stone) wrote:
>> >On Fri, Jul 29, 2016 at 04:04:59AM +0000, Xulei (Stone) wrote:
>> >> After one day, the vm is stuck. Looking from the following seabios
>> >> log, it seems seabios stops at "PCI: Using 00:02.0 for primary
>> >> VGA", and can not execute handle_smp() any more.
>> >> What may be the reason?
>> >
>> >More debugging info would be necessary to find this problem. You
>> >could try reproducing and attaching gdb (
>> >http://www.seabios.org/Debugging#Debugging_with_gdb_on_QEMU ).
>> >Alternatively, a kvm trace log may help.
>> >
>> kvm trace (seems useful) indicates that cpu 0 keeps always to access 0x00b3 ioport.
>> 0x00b3 is PORT_SMI_STATUS, so i guess my bios is stuck in the
>> function smm_relocate_and_restore {
>> ...
>> /* wait until SMM code executed */
>> while (inb(PORT_SMI_STATUS) != 0x00)
>> ...
>> }
>
>I'd try adding dprintf() statements around all the code at the top of
>smm_relocate_and_restore() and enable the dprintf() at the top of
>handle_smi().
>
>It would also be useful if you can extract the log from the last two
>working reboots to compare it to the failed case.

Following your suggestion, I am now sure the hang is caused by a missing SMI.
I added dprintf() calls like this:

--- a/roms/seabios/src/fw/smm.c
+++ b/roms/seabios/src/fw/smm.c
@@ -65,7 +65,8 @@ handle_smi(u16 cs)
     u8 cmd = inb(PORT_SMI_CMD);
     struct smm_layout *smm = MAKE_FLATPTR(cs, 0);
     u32 rev = smm->cpu.i32.smm_rev & SMM_REV_MASK;
-    dprintf(DEBUG_HDL_smi, "handle_smi cmd=%x smbase=%p\n", cmd, smm);
+    if(cmd == 0x00) {
+        dprintf(1, "handle_smi cmd=%x smbase=%p\n", cmd, smm);
+    }

     if (smm == (void*)BUILD_SMM_INIT_ADDR) {
         // relocate SMBASE to 0xa0000
@@ -147,14 +148,14 @@ smm_relocate_and_restore(void)
 {
     /* init APM status port */
     outb(0x01, PORT_SMI_STATUS);

+    dprintf(1,"before SMI====\n");
     /* raise an SMI interrupt */
     outb(0x00, PORT_SMI_CMD);
+    dprintf(1,"after SMI=====\n");

     /* wait until SMM code executed */
     while (inb(PORT_SMI_STATUS) != 0x00)
         ;
+    dprintf(1,"smm code executes complete====\n");

In the failed case the log output looks like this:

2016-08-03 16:23:15PCI: Using 00:02.0 for primary VGA
2016-08-03 16:23:15smm_device_setup start
2016-08-03 16:23:15init smm
2016-08-03 16:23:15before SMI====
2016-08-03 16:23:15after SMI=====

So it is clear that after outb(0x01, PORT_SMI_STATUS) and the SMI request
via outb(0x00, PORT_SMI_CMD), the BIOS never runs handle_smi(): neither the
handle_smi message nor the completion message appears in the log, so
PORT_SMI_STATUS stays at 0x01 and the wait loop never exits.

What's more, once this problem happens, rebooting the VM does not recover
it. The VM gets stuck at the same place every time until I destroy it.

I have also tried kernel commit c43203cab1e, which does not solve the
problem.

Any ideas, Kevin and Paolo?

>
>-Kevin
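
For further debugging, a bounded version of the wait loop might make the hang show up in the log instead of spinning silently. Below is a minimal standalone sketch of that idea, not SeaBIOS code: fake_status, smi_delivered, fake_inb_status() and wait_smi_done() are hypothetical names that simulate the PORT_SMI_STATUS handshake in user space, and the poll limit is an arbitrary assumption.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for the APM status port (PORT_SMI_STATUS).
 * It starts at 0x01, as after outb(0x01, PORT_SMI_STATUS). */
uint8_t fake_status = 0x01;

/* Set to 1 to simulate an SMI handler that runs and clears the status;
 * leave at 0 to simulate the missing SMI described above. */
int smi_delivered = 0;

/* Hypothetical replacement for inb(PORT_SMI_STATUS). */
uint8_t fake_inb_status(void)
{
    if (smi_delivered)
        fake_status = 0x00;   /* handler ran and cleared the status port */
    return fake_status;
}

/* Poll the status at most max_polls times.
 * Returns 0 once the status reads 0x00, or -1 if the limit is reached. */
int wait_smi_done(unsigned long max_polls)
{
    assert(max_polls > 0);
    for (unsigned long i = 0; i < max_polls; i++) {
        if (fake_inb_status() == 0x00)
            return 0;
    }
    return -1;                /* SMI was never handled within the limit */
}
```

In the real firmware the -1 path would be the place for a dprintf() reporting that the SMI was never handled, so a failed boot can be told apart from a slow one.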