On Tue, May 13, 2025 at 07:21:36PM +0200, David Hildenbrand wrote:
> On 12.05.25 17:16, Chaney, Ben wrote:
> > Hello,
> >
> > When live migrating to a destination host with pmem there is a very
> > long downtime where the guest is paused. In some cases, this can be
> > as high as 5 minutes, compared to less than one second in the good
> > case.
> >
> > Profiling suggests very high activity in this code path:
> >
> > ffffffffa2956de6 clean_cache_range+0x26 ([kernel.kallsyms])
> > ffffffffa2359b0f dax_writeback_mapping_range+0x1ef ([kernel.kallsyms])
> > ffffffffc0c6336d ext4_dax_writepages+0x7d ([kernel.kallsyms])
> > ffffffffa2242dac do_writepages+0xbc ([kernel.kallsyms])
> > ffffffffa2235ea6 filemap_fdatawrite_wbc+0x66 ([kernel.kallsyms])
> > ffffffffa223a896 __filemap_fdatawrite_range+0x46 ([kernel.kallsyms])
> > ffffffffa223af73 file_write_and_wait_range+0x43 ([kernel.kallsyms])
> > ffffffffc0c57ecb ext4_sync_file+0xfb ([kernel.kallsyms])
> > ffffffffa228a331 __do_sys_msync+0x1c1 ([kernel.kallsyms])
> > ffffffffa2997fe6 do_syscall_64+0x56 ([kernel.kallsyms])
> > ffffffffa2a00126 entry_SYSCALL_64_after_hwframe+0x6e ([kernel.kallsyms])
> > 11ec5f msync+0x4f (/usr/lib/x86_64-linux-gnu/libc.so.6)
> > 675ada qemu_ram_msync+0x8a (/usr/local/akamai/qemu/bin/qemu-system-x86_64)
> > 6873c7 xbzrle_load_cleanup+0x37 (inlined)
> > 6873c7 ram_load_cleanup+0x37 (/usr/local/akamai/qemu/bin/qemu-system-x86_64)
> > 4ff375 qemu_loadvm_state_cleanup+0x55 (/usr/local/akamai/qemu/bin/qemu-system-x86_64)
> > 500f0b qemu_loadvm_state+0x15b (/usr/local/akamai/qemu/bin/qemu-system-x86_64)
> > 4ecf85 process_incoming_migration_co+0x95 (/usr/local/akamai/qemu/bin/qemu-system-x86_64)
> > 8b6412 qemu_coroutine_self+0x2 (/usr/local/akamai/qemu/bin/qemu-system-x86_64)
> > ffffffffffffffff [unknown] ([unknown])
> >
> > I was able to resolve the performance issue by removing the call to
> > qemu_ram_block_writeback in ram_load_cleanup; with that call gone,
> > downtime returns to normal. It looks like this code path was
> > initially added to ensure the memory was synchronized if the
> > persistent memory region is backed by an NVDIMM device. Does it
> > serve any purpose if pmem is instead backed by standard DRAM?
>
> Are you using a read-only NVDIMM? In that case, I assume we would
> never need msync.
>
> diff --git a/include/exec/ram_addr.h b/include/exec/ram_addr.h
> index 94bb3ccbe4..819b8ef829 100644
> --- a/include/exec/ram_addr.h
> +++ b/include/exec/ram_addr.h
> @@ -153,7 +153,8 @@ void qemu_ram_msync(RAMBlock *block, ram_addr_t start, ram_addr_t length);
>  /* Clear whole block of mem */
>  static inline void qemu_ram_block_writeback(RAMBlock *block)
>  {
> -    qemu_ram_msync(block, 0, block->used_length);
> +    if (!(block->flags & RAM_READONLY))
> +        qemu_ram_msync(block, 0, block->used_length);
>  }
>
> --
> Cheers,
>
> David / dhildenb
I acked the original change, but now I don't understand why it is critical to persist memory at a random point in time that has nothing to do with guest state. David, maybe you understand?
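
For anyone skimming the thread, the call chain under discussion looks
roughly like this. This is paraphrased from memory of migration/ram.c,
not copied from any specific tree, so treat it as a sketch; names and
details may differ across QEMU versions:

    /* migration/ram.c -- destination side, after the incoming
     * migration stream has been fully loaded */
    static int ram_load_cleanup(void *opaque)
    {
        RAMBlock *rb;

        /*
         * Sync every non-ignored RAMBlock back to its backing store.
         * For a DAX-mapped pmem file this becomes an msync() over the
         * whole mapping (the dax_writeback_mapping_range() frames in
         * the profile above), which is where the multi-minute
         * downtime is spent.
         */
        RAMBLOCK_FOREACH_NOT_IGNORED(rb) {
            qemu_ram_block_writeback(rb);
        }

        xbzrle_load_cleanup();
        return 0;
    }

Note this runs unconditionally for every block, whether or not the
backing store is a real NVDIMM.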