Heh, but that one said:

+\item[ VIRTIO_BALLOON_F_WS_REPORTING(6) ] The device has support for Working Set

Which does not seem to reflect reality ...

Please feel free to disregard these features and reuse their bits and
queue indexes; as far as I know, they are not actually enabled
anywhere currently and the corresponding guest patches were only
applied to some (no-longer-used) ChromeOS kernel trees, so the
compatibility impact should be minimal. I will also try to clean up
the leftover bits on the crosvm side just to clear things up.

Thanks for your reply, and thanks for clarifying+cleaning it up.


I dug a bit more into crosvm, because it seems to be the only
implementation out there that does not behave like all the others I found
(maybe good, maybe bad :) ).


1) There was temporarily even another feature (VIRTIO_BALLOON_F_EVENTS_VQ)
and another queue.

It got removed from crosvm in:

commit 9ba634b82b55ba762dc8724676b2cf9419460145
Author: Daniel Verkamp <dverk...@chromium.org>
Date:   Thu Jul 11 11:29:52 2024 -0700

      devices: virtio-balloon: remove event queue support

      VIRTIO_BALLOON_F_EVENTS_VQ was part of a proposed virtio spec change.

      It is not currently supported by upstream Linux, so removing this should
      have no effect except for guest kernels that had CHROMIUM patches
      applied.

      The virtqueue indexes for the ws-related queues are decremented to fill
      the hole left by the removal of the event VQ; these are non-standard as
      well, so they do not have virtqueue indexes assigned in the virtio spec,
      but the proposed spec extension did actually use vq indexes 5 and 6.

      BUG=b:214864326


2) crosvm is aware of the upstream Linux driver

They thought your fix would go upstream; it didn't.

commit a2fa119e759d0238a42ff15a9aff0dfd122afebd
Author: Daniel Verkamp <dverk...@chromium.org>
Date:   Wed Jul 10 16:16:28 2024 -0700

      devices: virtio-balloon: warn about queue index mismatches

      The Linux kernel virtio-balloon driver spec non-compliance related to
      queue numbering is being fixed; add some diagnostics to our device that
      help to check if everything is working as expected.

      
<https://lore.kernel.org/virtualization/cacgkmesg0+vpav1fo8jf1isq4ef8t4_cfn1scyztdo8bxzr...@mail.gmail.com/T/>

      Additionally, replace the num_expected_queues() function with per-queue
      checking to avoid the need for the duplicate feature checks and queue
      count calculation; each pop_queue() call will be checked using the `?`
      operator and return a more useful error message if a particular queue is
      missing.

      BUG=None
      TEST=crosvm run --balloon-page-reporting ...


IIRC, in that commit they switched to the "spec" behavior.

That's when they started hard-coding the queue indexes.

CCing Daniel. All Linux versions should be incompatible with crosvm
regarding free page reporting.
How is that handled?

In practice, it only works because nobody calls crosvm with
--balloon-page-reporting (it's off by default), so the balloon device
does not advertise the VIRTIO_BALLOON_F_PAGE_REPORTING feature.

(I just went searching now, and it does seem like there is actually
one user in Android that does try to enable page reporting[1], which
I'll have to look into...)
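
To make the mismatch concrete, here is a minimal sketch (not taken from crosvm, QEMU or Linux) that compares the two numbering schemes discussed in this thread. It assumes the fixed indexes from the virtio spec (inflateq 0, deflateq 1, statsq 2, free_page_vq 3, reporting_vq 4) and models the QEMU/Linux behavior as "packed" numbering, where a queue whose feature was not negotiated leaves no hole. The function and table names are made up for illustration.

```python
# Fixed virtqueue indexes for the balloon device, as listed in the virtio spec.
SPEC_INDEX = {
    "inflateq": 0,
    "deflateq": 1,
    "statsq": 2,
    "free_page_vq": 3,
    "reporting_vq": 4,
}

# Feature gating each queue, in spec order (None means the queue always exists).
# Feature names are shortened from VIRTIO_BALLOON_F_*.
QUEUE_FEATURE = [
    ("inflateq", None),
    ("deflateq", None),
    ("statsq", "STATS_VQ"),
    ("free_page_vq", "FREE_PAGE_HINT"),
    ("reporting_vq", "PAGE_REPORTING"),
]


def present_queues(features):
    """Names of the queues that exist for the negotiated feature set."""
    return [name for name, feat in QUEUE_FEATURE
            if feat is None or feat in features]


def spec_layout(features):
    """Spec behavior: each queue keeps its fixed index, holes are allowed."""
    return {name: SPEC_INDEX[name] for name in present_queues(features)}


def packed_layout(features):
    """Assumed QEMU/Linux behavior: present queues are numbered contiguously."""
    return {name: idx for idx, name in enumerate(present_queues(features))}


def mismatches(features):
    """Queues whose index differs, as {name: (spec_index, packed_index)}."""
    spec = spec_layout(features)
    packed = packed_layout(features)
    return {name: (spec[name], packed[name])
            for name in spec if spec[name] != packed[name]}
```

Under these assumptions, `mismatches({"PAGE_REPORTING"})` reports `reporting_vq` at index 4 per the spec but index 2 in the packed layout, while negotiating all three optional features gives identical layouts. That would be why the incompatibility only shows up with certain feature combinations.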

In my opinion, it makes the most sense to keep the spec as it is and
change QEMU and the kernel to match, but obviously that's not trivial
to do in a way that doesn't break existing devices and drivers.

If only it would be limited to QEMU and Linux ... :)

Out of curiosity: assuming we made the spec match the current QEMU/Linux implementation, at least for the 3 involved features, would there be a way to adjust crosvm without any disruption?

I still have the feeling that it will be rather hard to get all implementations to match the spec ... For new features+queues it will be easy to force the usage of fixed virtqueue numbers, but for free-page-hinting and reporting, it's a mess :(

--
Cheers,

David / dhildenb

