Hi Peter,

On 28/03/18 09:21, Peter Xu wrote:
> On Wed, Mar 28, 2018 at 08:49:59AM +0200, Auger Eric wrote:
>> Hi Peter,
>>
>> On 28/03/18 04:03, Peter Xu wrote:
>>> On Fri, Mar 23, 2018 at 01:36:36PM +0100, Auger Eric wrote:
>>>> Hi,
>>>>
>>>> On 23/03/18 13:11, Peter Maydell wrote:
>>>>> On 23 March 2018 at 12:01, Auger Eric <eric.au...@redhat.com> wrote:
>>>>>> Hi,
>>>>>>
>>>>>> On 23/03/18 11:26, Peter Maydell wrote:
>>>>>>> On 23 March 2018 at 10:24, Auger Eric <eric.au...@redhat.com> wrote:
>>>>>>>> Hi,
>>>>>>>>
>>>>>>>> I observe a regression on KVM accelerated qemu-system-aarch64:
>>>>>>>>
>>>>>>>> Unexpected error in kvm_device_access() at
>>>>>>>> /home/augere/UPSTREAM/qemu/accel/kvm/kvm-all.c:2164:
>>>>>>>> 2018-03-23T09:59:59.629439Z qemu-system-aarch64: KVM_GET_DEVICE_ATTR
>>>>>>>> failed: Group 6 attr 0x000000000000c664: Device or resource busy
>>>>>>>> 2018-03-23 10:00:00.085+0000: shutting down, reason=crashed
>>>>>>>
>>>>>>> Can you get a backtrace for this? (I guess you'd need to fiddle
>>>>>>> with the kvm_device_access() code to make it assert rather
>>>>>>> than passing back the error).
>>>>>>
>>>>>> OK. I will try to do so. As I could have expected, I cannot reproduce on
>>>>>> a standalone qemu command line. The problem observed above is seen with
>>>>>> libvirt launch which may be doing some other QMP stuff concurrently?
>>>>>
>>>>> Hmm, that could be a bit painful to debug. I dunno if libvirt
>>>>> has a "launch QEMU under gdb" option. If not, you could try
>>>>> something like:
>>>>>     if (condition we want to get a backtrace on) {
>>>>>         printf("hit condition, attach gdb to process %d\n", (int)getpid());
>>>>>         for (;;) { }
>>>>>     }
>>>>
>>>> Thanks for the hint. Here is the stack I get.
>>>>
>>>> #0 kvm_device_access (fd=31, group=6, attr=50788, val=0x5937c88,
>>>> write=false, errp=0x16984a8 <error_abort>) at
>>>> /home/augere/UPSTREAM/qemu/accel/kvm/kvm-all.c:2164
>>>> #1 0x00000000004f8ce4 in arm_gicv3_icc_reset (env=0xffffa1fc8330,
>>>> ri=0x597f910) at /home/augere/UPSTREAM/qemu/hw/intc/arm_gicv3_kvm.c:632
>>>> #2 0x00000000006351ac in cp_reg_reset (key=0x597f730, value=0x597f910,
>>>> opaque=0xffffa1fc0010) at /home/augere/UPSTREAM/qemu/target/arm/cpu.c:78
>>>> #3 0x0000ffffa47edce4 in g_hash_table_foreach () from
>>>> /lib64/libglib-2.0.so.0
>>>> #4 0x0000000000635394 in arm_cpu_reset (s=0xffffa1fc0010) at
>>>> /home/augere/UPSTREAM/qemu/target/arm/cpu.c:130
>>>> #5 0x000000000090c888 in cpu_reset (cpu=0xffffa1fc0010) at qom/cpu.c:249
>>>> #6 0x00000000005793d8 in do_cpu_reset (opaque=0xffffa1fc0010) at
>>>> /home/augere/UPSTREAM/qemu/hw/arm/boot.c:665
>>>> #7 0x000000000073095c in qemu_devices_reset () at hw/core/reset.c:69
>>>> #8 0x00000000006976e0 in qemu_system_reset (reason=SHUTDOWN_CAUSE_NONE)
>>>> at vl.c:1731
>>>> #9 0x000000000069fd60 in main (argc=69, argv=0xffffe877d1a8,
>>>> envp=0xffffe877d3d8) at vl.c:4697
>>>
>>> I think current master should work fine with ARM KVM now since OOB is
>>> now off by default.
>>
>> Yes it works for me with the reverts.
>>
>> But does ARM use postcopy, and will ARM need the
>>> coming network failure recovery feature?
>>
>> I assume it does
>>>
>>> If so, maybe we'll still need to have a look on this single problem
>>> (this is the only non-testcase issue I know now with Out-Of-Band).
>>
>> OK. I need to have a look at your series to better understand what it does.
>
> It introduced a dedicated iothread to run IO part of the monitor code
> (e.g., parsing of QMP input, and reply of the responses).
> So now the parsing could start earlier, before the main loop (that's
> what I suspect the problem before), and meanwhile QMP IOs can happen in
> parallel now with main thread, which it never can before (since all
> QMP logic will be in main thread).
>
> I would be curious about what commands have been sent by libvirt to
> QEMU-arm when reach this point. Please feel free to let me know if I
> can help in any form. For example, if there is an ARM server that I
> can login and run both libvirt and QEMU, I'd be glad to play with it
> too.

Yes, I will send you the data and will do my utmost to help in the debugging.

Thanks

Eric

>
> Thanks,
>