Re: [PATCH RFC 00/10] qapi: remove all TARGET_* conditionals from the schema

Daniel P . Berrangé Fri, 09 May 2025 02:03:27 -0700

On Thu, May 08, 2025 at 02:09:37PM -0700, Pierrick Bouvier wrote:
> On 5/8/25 6:58 AM, Daniel P. Berrangé wrote:
> > Pierrick has proposed a series that introduces a concept of runtime
> > conditionals to the QAPI schema, in order to adapt the TARGET_*
> > conditionals currently used at build time:
> > 
> >    https://lists.nongnu.org/archive/html/qemu-devel/2025-05/msg01699.html
> > 
> > For the sake of comparison & evaluation, this series illustrates the
> > alternative approach that we've discussed of entirely removing any
> > concept of TARGET_* conditionals.
> > 
> > With this the QAPI schema varies solely based on CONFIG_* conditionals,
> > and is thus invariant across different target emulators.
> > 
> > In this PoC I've taken the minimal effort approach to the problem.
> > 
> > The QAPI schema has removed the TARGET_* conditionals and in order to
> > make all the emulators then compile, the stubs/ directory is populated
> > with a bunch of files to provide dummy impls of the target specific QMP
> > commands.
> > 
> > This is sufficient to make the current QEMU binaries build successfully.
> > 
> > To make the "single binary" concept work, however, would require
> > additional followup work to eliminate the stubs.
> > 
> > Instead of having stubs we would need to de-couple the QMP command
> > impl from the machine internals. This would likely require greater
> > use of interfaces and/or virtual method dispatchers on the machine
> > class. This would enable the 'qmp_XXXXX' command impls to exist
> > once. Then they call out to virtual methods on the machine to provide
> > the real impl, and/or generate an error if the virtual method is not
> > implemented for the machine.
> > 
> 
> Thanks for posting it Daniel.
> 
> I think your approach is pretty neat, and yes, it's much simpler than having
> any compile time or runtime conditional to deal with that.
> 
> When we talked about that on previous thread, I thought the idea was to
> expose *all* the commands to *all* the targets, which I didn't really
> understand, considering we have target specific commands by design.
> I understand better where you wanted to go, by extracting concerned commands
> in dedicated files.
> 
> The only downside I can see is that some commands have to be there, but
> return an "error, not implemented" at runtime. Fine for me, but some people
> may argue against it.
> 
> A concern I might have as well is about how we'll deal if we want to hide
> some commands in the future, based on various criterias
> (is_heterogenenous()?). The mantra "define all, and let the build system
> hide things" mantra means you can only have a single definition existing in
> the binary, by design. But maybe it's not even a real concern, and I
> definitely prefer to see problems before fixing them, so it's definitely not
> blocking this approach.


I think we have to distinguish between what made sense in the context
of our original design, from what makes sense for our future design(s)
and plans.

I can understand why we took the direction we did historically, but I
think we have unwittingly created problems for ourselves that are
looking increasingly worse than the problems we considered solved.


In the other thread I pointed out my dislike for QAPI schema not being
fully self-describing when we have conditionals present, even today,
but there are other aspects that trouble me more wrt conditionals.

Since the schema is currently target specific, a mgmt app has to probe
for QEMU capabilities once for every target binary. QEMU has ~30 system
binaries, and if querying capabilities takes 250 milliseconds, then
querying all binaries is ~ 7 seconds of work. Libvirt mitigates this
overhead by caching the results of capabilities probes. We periodically
suffer delays when we must invalidate our cache though, typically on
software upgrades, and this is unpleasant for users who think we've
got stuck or just have bad slow code.


Even if we had a QAPI schema that didn't vary per target, this is
repeated probing is tricky to avoid when we have completely independant
binaries. We would need QEMU to have some internal "build id", so that
we could detect that all binaries came from the same build, to let us
avoid re-probing each binary.


With your work to have a single binary that can impersonate any of our
target specific binaries, *if* we had a QAPI schema that didn't vary
per target, we would stand a real chance of being able to eliminate
the need to probe capabilities once for every target.

eg if we assume qemu-system-x86_64 is a symlink to "/usr/bin/qemu-system"
(or whatever you call the common binary), then we only need to probe one
binary - the target of the symlink - we can easily identify that all
the legacy binary symlinks resolve to the same real binary with no special
effort.


Conversely if we want the QAPI schema to be varying at runtime per target,
it means we have to select a target before we can probe the schema in the
binary. If we assume an install with only the new binary and none of the
legacy target specific symlinks, we have a chicken & egg problem though.

How do we know what targets are supported ? We should be able to use QAPI
to query the available targets, but before we can use QAPI we need to
select a target ?

We could make QAPI only expose the pieces of schema that are *not*
specific to a target initially, and then refine the schema once a
target is selected. That would solve the immediate problem, but I
think that kind of cleverness is just postponing the pain

Consider that we eventually want a QEMU binary that can run CPUs from
multiple targets concurrently.  That implies that a dynamic QAPI schema
needs to turn on the target specific pieces of multiple targets at once.
At that point a mgmt app fundamentally has to understand ahead of time
that some features are target specific - the existance/absence of
features in the QAPI schema can no longer express what is target
specific, since we must expose the union of features all targets
that are in use.

So as we look to the future of QEMU, IMHO, the concept of using the QAPI
schema definitions to dynamically express what is available per-target
is unhelpful and likely even to be harmful.

Assuming a schema that is fixed at build time ought to simplify QEMU
maintainers' life, and while mgmt apps loose a little info, they will
gain in other ways such as not having to query QEMU capabilities
30 times over.

> Overall, if I had to choose, I think I would pick your approach instead of
> the "runtime conditional" route, it's just simpler and saner overall.
> I would be happy to see you take ownership on that, as long as we can
> deliver something in the next weeks. I can help on anything as well.

I'm fine with you driving this forward, not least because I can't commit
to having any predictable amount of time to write code myself for this.

> @Markus, @Michael, which approach would you prefer to be followed for QAPI?

With regards,
Daniel
-- 
|: https://berrange.com      -o-    https://www.flickr.com/photos/dberrange :|
|: https://libvirt.org         -o-            https://fstop138.berrange.com :|
|: https://entangle-photo.org    -o-    https://www.instagram.com/dberrange :|

Re: [PATCH RFC 00/10] qapi: remove all TARGET_* conditionals from the schema

Reply via email to