"Marco Cavenati" <marco.caven...@eurecom.fr> writes:

> On Tuesday, April 15, 2025 15:50 CEST, Fabiano Rosas <faro...@suse.de> wrote:
>
>> > So, I see your point on performance, but I'm not fully comfortable
>> > with the difference in zero page handling between mapped-ram on
>> > and mapped-ram off. In the former case zero pages are skipped, while
>> > in the latter they are explicitly zeroed.
>> > Enabling mapped-ram has the implicit effect of also skipping zero pages.
>> > I think it is an optimization not really bound to mapped-ram and it
>> > could be better to have this feature external to mapped-ram, enabled
>> > when the destination ram is known to be already zeroed (also for
>> > mapped-ram off ideally).
>> 
>> From a design perspective that makes sense. The only use case currently
>> would be mapped-ram _migration_ (as opposed to snapshot) because
>> non-mapped-ram migration is usually done live. The stream of pages will
>> potentially have several versions of the same page, so we need to
>> clear it when it's a zero page.
>
> It might be a niche use case (and maybe I'm the only one still using
> plain old QEMU snapshots), but the cli option `-loadvm file` might also
> benefit from skipping zero pages: snapshots are taken with the VM
> stopped, and `-loadvm file` acts on clean ram.
>

Indeed, I had forgotten about that usage. So that's a case where
mapped-ram does the right thing. Now I can forget about it again =)

For enabling mapped-ram generically, at this point I think we don't have
much choice but to introduce a new version of read_ramblock_mapped_ram()
that writes zero pages. It would need to discriminate the zero pages
(instead of doing one big copy of a bunch of pages), so we could avoid
copying a page that's already zero. In fact, if we do that, it would
already work without further changes. The performance of -loadvm would
leave something to be desired, but we could tackle that later.
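
Just to make the idea concrete, a rough sketch (untested, error handling
omitted; the function name is made up, but qemu_get_buffer_at(),
buffer_is_zero(), test_bit() and block->pages_offset are all in the
current mapped-ram code):

static void read_ramblock_mapped_ram_zero(QEMUFile *f, RAMBlock *block,
                                          unsigned long *file_bmap,
                                          unsigned long num_pages)
{
    for (unsigned long i = 0; i < num_pages; i++) {
        void *host = block->host + (i << TARGET_PAGE_BITS);

        if (test_bit(i, file_bmap)) {
            /* page is present in the file, read it at its fixed offset */
            qemu_get_buffer_at(f, host, TARGET_PAGE_SIZE,
                               block->pages_offset +
                               ((uint64_t)i << TARGET_PAGE_BITS));
        } else if (!buffer_is_zero(host, TARGET_PAGE_SIZE)) {
            /* zero page in the file, but stale data in RAM: clear it */
            memset(host, 0, TARGET_PAGE_SIZE);
        }
    }
}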

What do you think?

If we decide we need to explicitly select that new code, I don't think
we need any new capability, because the user has no choice in it. We
know that loadvm needs it but -loadvm doesn't, and any other sort of
mapped-ram migration doesn't need it either. There is some discussion to
be had on whether we want to bind it to the commands themselves, or do
some form of detection of clean ram. I think it's best we postpone this
part of the discussion until Peter is back (next week), so we can have
more eyes on it.

>> We'd benefit from a separate "don't load zero pages" option only when
>> the VM is guaranteed to be stopped and more importantly, not allowed to
>> be unstopped. This is the tricky part. We have experimented with and
>> dropped the idea of detecting a stopped-vm-migration for mapped-ram (the
>> idea back then was not to do better zero page handling, but skip dirty
>> tracking altogether).
>> 
>> Since we're dealing with snapshots here, which are asynchronous, I'm
>> wondering whether it would make sense to extend the zero-page-detection
>> option to the destination. Then we could bind the loadvm process to
>> zero-page-detection=none because I don't think we can safely ignore them
>> anyway.
>
>> > My overall goal is a hot-loadvm feature that currently lives in a fork
>> > downstream and has a long way before getting in a mergeable state :)
>> 
>> Cool. Don't hesitate to send stuff our way, the sooner and more often we
>> discuss this, the better the chances of getting it merged upstream.
>> 
>> Do you use libvirt at all? Mapped-ram support has been added to it in
>> the latest version. The original use-case for mapped-ram was quick
>> snapshot saving and loading after all. Libvirt does it in a way that
>> does not use savevm, which might be helpful.
>
> No, I don't use libvirt. Thanks for the hint, but in that case I guess
> libvirt would spawn a new "QEMU -incoming" for each restore, and
> that would be too expensive.
>
>> > In a nutshell, I'm using dirty page tracking to load from the snapshot
>> > only the pages that have been dirtied between two loadvm;
>> > mapped-ram is required to seek and read only the dirtied pages.
>> 
>> But then you'd have a bitmap of pages you could correlate with the
>> file_bmap and force-load whatever pages you need. It doesn't seem like
>> zero page handling would affect you too much in that case.
>
> Perhaps I'm missing your point; if a page was zero in the snapshot
> and then gets dirtied, I need to zero it out.

I used "zero page handling" as "a policy on how to deal with zero
pages", similarly to the zero-page-detection option.

The page is being zeroed because it was dirty (on your dirty bitmap) not
because some policy determined that it needs to be zeroed. IOW, it seems
your solution doesn't need to have any knowledge of whether the whole
memory is supposed to be zeroed (as discussed above), the presence of a
dirty bitmap already selects which pages are restored and if there's a 1
for that page in the bitmap, then we don't really care what's in the
file_bmap.
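
i.e. something like this (illustrative only; load_page_from_file() is a
made-up placeholder, not a function from the tree):

for (unsigned long i = 0; i < num_pages; i++) {
    if (!test_bit(i, dirty_bmap)) {
        continue;   /* untouched since the last loadvm, leave it alone */
    }
    if (test_bit(i, file_bmap)) {
        /* page is in the snapshot file: seek and re-read it */
        load_page_from_file(block, i);
    } else {
        /* zero page in the snapshot, dirty in RAM: zero it out */
        memset(block->host + (i << TARGET_PAGE_BITS), 0,
               TARGET_PAGE_SIZE);
    }
}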

If we implement generic snapshot + mapped-ram + zero pages, then you
could probably simply OR your dirty bitmap on top of the file_bmap.
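
With QEMU's bitmap helpers that would be a one-liner (assuming the
mapped-ram file stays sparse, so a dirty page that was zero in the
snapshot sits on a hole and reads back as zeros):

/* make the loader visit every dirtied page as well */
bitmap_or(file_bmap, file_bmap, dirty_bmap, num_pages);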

Does that make sense? I might be missing something; I'm looking at a
bunch of threads before going on vacation.

> But yeah, the code will be
> different, and for my specific use case I don't absolutely need
> mapped-ram snapshot restore to fully work.
>
> Thank you.
> Best,
> Marco
