On Sat, 25 May 2019 03:35:20 +0000
"liujunjie (A)" <liujunji...@huawei.com> wrote:

> Hi, I have met a problem:
> 
> The QEMU version is 2.8.1, the virtual machine is configured with 1G huge 
> pages, two NUMA nodes and four pass-through NVME SSDs.
> 
> After we started the VM, in addition to some QMP queries nothing more has 
> been done, the QEMU aborted after some months later.
> After that, the VM is restarted, and the problem does not reproduce yet.
> And The backtrace of the RCU thread is as follows:
> (gdb) bt
> #0  0x00007fd2695f0197 in raise () from /usr/lib64/libc.so.6
> #1  0x00007fd2695f1888 in abort () from /usr/lib64/libc.so.6
> #2  0x00007fd2695e9206 in __assert_fail_base () from /usr/lib64/libc.so.6
> #3  0x00007fd2695e92b2 in __assert_fail () from /usr/lib64/libc.so.6
> #4  0x0000000000476a84 in memory_region_finalize (obj=<optimized out>)
>     at /home/abuild/rpmbuild/BUILD/qemu-kvm-2.8.1/memory.c:1512
> #5  0x0000000000763105 in object_deinit (obj=obj@entry=0x1dc1fd0,
>     type=type@entry=0x1d065b0) at qom/object.c:448
> #6  0x0000000000763153 in object_finalize (data=0x1dc1fd0) at qom/object.c:462
> #7  0x00000000007627cc in object_property_del_all (obj=obj@entry=0x1dc1f70)
>     at qom/object.c:399
> #8  0x0000000000763148 in object_finalize (data=0x1dc1f70) at qom/object.c:461
> #9  0x0000000000764426 in object_unref (obj=<optimized out>) at 
> qom/object.c:897
> #10 0x0000000000473b6b in memory_region_unref (mr=<optimized out>)
>     at /home/abuild/rpmbuild/BUILD/qemu-kvm-2.8.1/memory.c:1560
> #11 0x0000000000473bc7 in flatview_destroy (view=0x7fc188b9cb90)
>     at /home/abuild/rpmbuild/BUILD/qemu-kvm-2.8.1/memory.c:289
> #12 0x0000000000843be0 in call_rcu_thread (opaque=<optimized out>)
>     at util/rcu.c:279
> #13 0x00000000008325c2 in qemu_thread_start (args=args@entry=0x1d00810)
>     at util/qemu_thread_posix.c:496
> #14 0x00007fd269983dc5 in start_thread () from /usr/lib64/libpthread.so.0
> #15 0x00007fd2696b27bd in clone () from /usr/lib64/libc.so.6
> 
> In this core, I found that the reference of "/objects/ram-node0"( the type of 
> ram-node0 is struct "HostMemoryBackendFile") equals to 0 , while the 
> reference of "/objects/ram-node1" equals to 129, more details can be seen at 
> the end of this email.
> 
> I searched through the community, and found a case that had the same error 
> report: https://mail.coreboot.org/pipermail/seabios/2017-September/011799.html
> However, I did not configure pcie_pci_bridge. Besides, qemu aborted in device 
> initialization phase in this case.
That case doesn't seem relevant.

> 
> Also, I try to find out which can reference "/objects/ram-node0" so as to 
> look for the one that may un reference improperly, most of them lie in the 
> function of "render_memory_region" or "phys_section_add" when memory topology 
> changes.
> Later, the temporary flatviews are destroyed by RCU thread, so un reference 
> happened and the backtrace is similar to the one shown above.
> But I am not familiar with the detail of these process, it is hard to keep 
> trace of these memory topology changes.
> 
> My question is:
> How can ram-node0's reference comes down to 0 when the virtual machine is 
> still running?
> 
> Maybe someone who is familiar with memory_region_ref or memory-backend-file 
> can help me figure out.
> Any idea is appreciated.

Could you provide steps to reproduce (incl. command line)? 

[...]
> Thanks,
> Junjie Liu
> 


Reply via email to