Hi, I have run into a problem. The QEMU version is 2.8.1; the virtual machine is configured with 1G huge pages, two NUMA nodes, and four pass-through NVMe SSDs.
After we started the VM, nothing more was done apart from some QMP queries, yet QEMU aborted a few months later. The VM was then restarted, and the problem has not reproduced since. The backtrace of the RCU thread is as follows:

(gdb) bt
#0  0x00007fd2695f0197 in raise () from /usr/lib64/libc.so.6
#1  0x00007fd2695f1888 in abort () from /usr/lib64/libc.so.6
#2  0x00007fd2695e9206 in __assert_fail_base () from /usr/lib64/libc.so.6
#3  0x00007fd2695e92b2 in __assert_fail () from /usr/lib64/libc.so.6
#4  0x0000000000476a84 in memory_region_finalize (obj=<optimized out>) at /home/abuild/rpmbuild/BUILD/qemu-kvm-2.8.1/memory.c:1512
#5  0x0000000000763105 in object_deinit (obj=obj@entry=0x1dc1fd0, type=type@entry=0x1d065b0) at qom/object.c:448
#6  0x0000000000763153 in object_finalize (data=0x1dc1fd0) at qom/object.c:462
#7  0x00000000007627cc in object_property_del_all (obj=obj@entry=0x1dc1f70) at qom/object.c:399
#8  0x0000000000763148 in object_finalize (data=0x1dc1f70) at qom/object.c:461
#9  0x0000000000764426 in object_unref (obj=<optimized out>) at qom/object.c:897
#10 0x0000000000473b6b in memory_region_unref (mr=<optimized out>) at /home/abuild/rpmbuild/BUILD/qemu-kvm-2.8.1/memory.c:1560
#11 0x0000000000473bc7 in flatview_destroy (view=0x7fc188b9cb90) at /home/abuild/rpmbuild/BUILD/qemu-kvm-2.8.1/memory.c:289
#12 0x0000000000843be0 in call_rcu_thread (opaque=<optimized out>) at util/rcu.c:279
#13 0x00000000008325c2 in qemu_thread_start (args=args@entry=0x1d00810) at util/qemu_thread_posix.c:496
#14 0x00007fd269983dc5 in start_thread () from /usr/lib64/libpthread.so.0
#15 0x00007fd2696b27bd in clone () from /usr/lib64/libc.so.6

In this core, I found that the reference count of "/objects/ram-node0" (its type is struct HostMemoryBackendFile) is 0, while the reference count of "/objects/ram-node1" is 129; more details can be seen at the end of this email.
I searched through the community archives and found a case with the same error report: https://mail.coreboot.org/pipermail/seabios/2017-September/011799.html However, I did not configure pcie_pci_bridge, and in that case QEMU aborted during the device initialization phase. I also tried to find out what can reference "/objects/ram-node0", so as to look for the place that may unreference it improperly; most of the references are taken in render_memory_region() or phys_section_add() when the memory topology changes. Later, the temporary flatviews are destroyed by the RCU thread, so the unreference happens there and the backtrace is similar to the one shown above. But I am not familiar with the details of this process, and it is hard to keep track of these memory topology changes. My question is: how can ram-node0's reference count drop to 0 while the virtual machine is still running? Perhaps someone who is familiar with memory_region_ref or memory-backend-file can help me figure this out. Any idea is appreciated.

---

(gdb) p *((HostMemoryBackendFile *) 0x1dc1f70)
$24 = {parent_obj = {parent = {class = 0x1d70880, free = 0x7fd26a812580 <g_free>, properties = 0x1db7920, ref = 0, parent = 0x1da9710},
    size = 68719476736, merge = true, dump = false, prealloc = true, force_prealloc = false, is_mapped = true,
    host_nodes = {1, 0, 0}, policy = HOST_MEM_POLICY_BIND,
    mr = {parent_obj = {class = 0x1d6d790, free = 0x0, properties = 0x1db79e0, ref = 0, parent = 0x0},
      romd_mode = true, ram = true, subpage = false, readonly = false, rom_device = false, flush_coalesced_mmio = false,
      global_locking = true, dirty_log_mask = 0 '\000', ram_block = 0x1dc2960, owner = 0x1dc1f70, iommu_ops = 0x0,
      ops = 0xcb0fe0 <unassigned_mem_ops>, opaque = 0x0, container = 0x200d4c0, size = 0x00000000000000000000001000000000,
      addr = 0, destructor = 0x470800 <memory_region_destructor_ram>, align = 1073741824, terminates = true,
      ram_device = false, enabled = true, warning_printed = false, vga_logging_count = 0 '\000', alias = 0x0, alias_offset = 0,
      priority = 0, subregions = {tqh_first = 0x0, tqh_last = 0x1dc2078}, subregions_link = {tqe_next = 0x0, tqe_prev = 0x1dc2c68},
      coalesced = {tqh_first = 0x0, tqh_last = 0x1dc2098}, name = 0x1dc27a0 "/objects/ram-node0", ioeventfd_nb = 0,
      ioeventfds = 0x0, iommu_notify = {lh_first = 0x0}, iommu_notify_flags = IOMMU_NOTIFIER_NONE}},
  share = true, mem_path = 0x1dc2350 "/dev/hugepages/libvirt/qemu/118-instance-00025bf8"}

(gdb) p *((HostMemoryBackendFile *) 0x1dc2b50)
$205 = {parent_obj = {parent = {class = 0x1d70880, free = 0x7fd26a812580 <g_free>, properties = 0x1db7a40, ref = 129, parent = 0x1da9710},
    size = 68719476736, merge = true, dump = false, prealloc = true, force_prealloc = false, is_mapped = true,
    host_nodes = {2, 0, 0}, policy = HOST_MEM_POLICY_BIND,
    mr = {parent_obj = {class = 0x1d6d790, free = 0x0, properties = 0x1db7aa0, ref = 1, parent = 0x1dc2b50},
      romd_mode = true, ram = true, subpage = false, readonly = false, rom_device = false, flush_coalesced_mmio = false,
      global_locking = true, dirty_log_mask = 0 '\000', ram_block = 0x1dc3470, owner = 0x1dc2b50, iommu_ops = 0x0,
      ops = 0xcb0fe0 <unassigned_mem_ops>, opaque = 0x0, container = 0x200d4c0, size = 0x00000000000000000000001000000000,
      addr = 68719476736, destructor = 0x470800 <memory_region_destructor_ram>, align = 1073741824, terminates = true,
      ram_device = false, enabled = true, warning_printed = false, vga_logging_count = 0 '\000', alias = 0x0, alias_offset = 0,
      priority = 0, subregions = {tqh_first = 0x0, tqh_last = 0x1dc2c58}, subregions_link = {tqe_next = 0x1dc1fd0, tqe_prev = 0x200d568},
      coalesced = {tqh_first = 0x0, tqh_last = 0x1dc2c78}, name = 0x1dc32b0 "/objects/ram-node1", ioeventfd_nb = 0,
      ioeventfds = 0x0, iommu_notify = {lh_first = 0x0}, iommu_notify_flags = IOMMU_NOTIFIER_NONE}},
  share = true, mem_path = 0x1da8c40 "/dev/hugepages/libvirt/qemu/118-instance-00025bf8"}

Thanks,
Junjie Liu