On Tue, May 13, 2025 at 07:21:36PM +0200, David Hildenbrand wrote:
> On 12.05.25 17:16, Chaney, Ben wrote:
> > Hello,
> > 
> >          When live migrating to a destination host with pmem, there is a
> > very long downtime during which the guest is paused. In some cases this
> > can be as long as 5 minutes, compared to under one second in the good
> > case.
> > 
> > 
> >          Profiling suggests very high activity in this code path:
> > 
> > 
> > ffffffffa2956de6 clean_cache_range+0x26 ([kernel.kallsyms])
> > ffffffffa2359b0f dax_writeback_mapping_range+0x1ef ([kernel.kallsyms])
> > ffffffffc0c6336d ext4_dax_writepages+0x7d ([kernel.kallsyms])
> > ffffffffa2242dac do_writepages+0xbc ([kernel.kallsyms])
> > ffffffffa2235ea6 filemap_fdatawrite_wbc+0x66 ([kernel.kallsyms])
> > ffffffffa223a896 __filemap_fdatawrite_range+0x46 ([kernel.kallsyms])
> > ffffffffa223af73 file_write_and_wait_range+0x43 ([kernel.kallsyms])
> > ffffffffc0c57ecb ext4_sync_file+0xfb ([kernel.kallsyms])
> > ffffffffa228a331 __do_sys_msync+0x1c1 ([kernel.kallsyms])
> > ffffffffa2997fe6 do_syscall_64+0x56 ([kernel.kallsyms])
> > ffffffffa2a00126 entry_SYSCALL_64_after_hwframe+0x6e ([kernel.kallsyms])
> > 11ec5f msync+0x4f (/usr/lib/x86_64-linux-gnu/libc.so.6)
> > 675ada qemu_ram_msync+0x8a (/usr/local/akamai/qemu/bin/qemu-system-x86_64)
> > 6873c7 xbzrle_load_cleanup+0x37 (inlined)
> > 6873c7 ram_load_cleanup+0x37 (/usr/local/akamai/qemu/bin/qemu-system-x86_64)
> > 4ff375 qemu_loadvm_state_cleanup+0x55 (/usr/local/akamai/qemu/bin/qemu-system-x86_64)
> > 500f0b qemu_loadvm_state+0x15b (/usr/local/akamai/qemu/bin/qemu-system-x86_64)
> > 4ecf85 process_incoming_migration_co+0x95 (/usr/local/akamai/qemu/bin/qemu-system-x86_64)
> > 8b6412 qemu_coroutine_self+0x2 (/usr/local/akamai/qemu/bin/qemu-system-x86_64)
> > ffffffffffffffff [unknown] ([unknown])
> > 
> > 
> >          I was able to resolve the performance issue by removing the call
> > to qemu_ram_block_writeback in ram_load_cleanup, which brings the downtime
> > back to normal. It looks like this code path was originally added to
> > ensure the memory is synchronized when the persistent memory region is
> > backed by an NVDIMM device. Does it serve any purpose if pmem is instead
> > backed by standard DRAM?
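> > 
> >          For reference, the cleanup path in question looks roughly like
> > this (a sketch based on migration/ram.c; exact details vary by QEMU
> > version):
> > 
> > static int ram_load_cleanup(void *opaque)
> > {
> >     RAMBlock *rb;
> > 
> >     /* Write back every not-ignored RAM block to its backing store; for
> >      * a file-backed (pmem) block this is the msync() seen in the trace
> >      * above. */
> >     RAMBLOCK_FOREACH_NOT_IGNORED(rb) {
> >         qemu_ram_block_writeback(rb);
> >     }
> > 
> >     xbzrle_load_cleanup();
> >     return 0;
> > }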
> 
> Are you using a read-only NVDIMM?
> 
> In that case, I assume we would never need msync.
> 
> 
> diff --git a/include/exec/ram_addr.h b/include/exec/ram_addr.h
> index 94bb3ccbe4..819b8ef829 100644
> --- a/include/exec/ram_addr.h
> +++ b/include/exec/ram_addr.h
> @@ -153,7 +153,9 @@ void qemu_ram_msync(RAMBlock *block, ram_addr_t start, ram_addr_t length);
>  /* Clear whole block of mem */
>  static inline void qemu_ram_block_writeback(RAMBlock *block)
>  {
> -    qemu_ram_msync(block, 0, block->used_length);
> +    if (!(block->flags & RAM_READONLY)) {
> +        qemu_ram_msync(block, 0, block->used_length);
> +    }
>  }
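> 
> Only a sketch, of course. The idea is that a read-only mapping (e.g. a
> backend created along the lines of
>   -object memory-backend-file,id=mem0,mem-path=/mnt/pmem0,size=4G,readonly=on)
> can never be dirtied by the guest, so there should be nothing to write
> back.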
> 
> 
> -- 
> Cheers,
> 
> David / dhildenb

I acked the original change, but now I don't understand why it is
critical to preserve memory at a random point in time that has nothing
to do with the guest state.
David, maybe you understand?

