On 10.10.23 16:35, Alex Bennée wrote:
Hanna Czenczek <hre...@redhat.com> writes:

(adding Viresh to CC for Xen Vhost questions)

On 10.10.23 12:36, Alex Bennée wrote:
Hanna Czenczek <hre...@redhat.com> writes:

On 10.10.23 06:00, Yajun Wu wrote:
On 10/9/2023 5:13 PM, Hanna Czenczek wrote:
External email: Use caution opening links or attachments


On 09.10.23 11:07, Hanna Czenczek wrote:
On 09.10.23 10:21, Hanna Czenczek wrote:
On 07.10.23 04:22, Yajun Wu wrote:
[...]

<snip>
So as far as I understand, the feature is supposed to rely on
implementation-specific behavior between specifically qemu as a
front-end and dpdk as a back-end, nothing else.  Honestly, that to me
is a very good reason to deprecate it.  That would make it clear that
any implementation that implements it does so because it relies on
implementation-specific behavior from other implementations.

Option 2 is to fix it.  It is not right to use this broadly defined
feature with its clear protocol as given in the virtio specification
just to set and clear a single bit (DRIVER_OK).  The vhost-user
specification points to that virtio protocol.  We must adhere to the
protocol.  And note that we must not reset devices just because the VM
is paused/resumed.  (That is why I wanted to deprecate SET_STATUS, so
that Stefan’s series would introduce RESET_DEVICE where we need it,
and we can (for now) ignore the SET_STATUS 0 in vhost_dev_stop().)

Option 3 would be to just be honest in the specification, and limit
the scope of F_STATUS to say the only bit that matters is DRIVER_OK.
I would say this is not really different from deprecating, though it
wouldn’t affect your case.  However, I understand Alex relies on a
full status byte.  I’m still interested to know why that is.
For an F_TRANSPORT backend (or whatever the final name ends up being) we
need the backend to have full control of the status byte because all the
handling of VirtIO is deferred to it. Therefor it has to handle all the
feature negotiation and indicate when the device needs resetting.

(side note: feature negotiation is another slippery area when QEMU gets
involved in gating which feature bits may or may not be exposed to the
backend. The only one it should ever mask is F_UNUSED which is used
(sic) to trigger the vhost protocol negotiation)
That’s the thing, feature negotiation is done with GET_FEATURES and
SET_FEATURES.  Configuring F_REPLY_ACK lets SET_FEATURES return
errors.
OK but then what - QEMU fakes up FEATURES_OK in the Device Status field
on the behalf of the backend?

It does that right now.  When using qemu, vhost-user status byte is not exposed to the guest at all.  qemu makes it up completely, and effectively ignores the response from GET_STATUS completely.

(The only use of GET_STATUS is (right now): There is a function to set a flag in the status byte, and it calls GET_STATUS, ORs the flag in, and calls SET_STATUS with the result.)

I should point out QEMU doesn't exist in some of these use case. When
using the rust-vmm backends with Xen for example there is no VMM to talk
to so we have a Xen Vhost Frontend which is entirely concerned with
setup and then once connected up leaves the backend to do its thing. I'd
rather leave the frontend as dumb as possible rather than splitting
logic between the two.

Indicating that the device needs reset is a good point, there is no
other feature to do that.  (And something qemu currently ignores, just
like any value the device returns through GET_STATUS, but that’s
besides the point.)

Option 4 is of course not to do anything, and leave everything as-is,
waiting for the next person to stir the hornet’s nest.

Cc-ing Alex on this mail, because to me, this seems like an important
detail when he plans on using the byte in the future. If we need a
virtio status byte, I can’t see how we could use the existing F_STATUS
for it.
What would we use instead of F_STATUS to query the Device Status field?
We would emulate it in the front-end, just like we need to do for
back-ends without F_STATUS.  We can’t emulate the DEVICE_NEEDS_RESET
bit, though, that’s correct.

Given that qemu currently ignores DEVICE_NEEDS_RESET, I’m not 100 %
convinced that your use case has a hard dependency on F_STATUS.
However, this still does make a fair point in general that it would be
useful to keep it.
OK/

That still leaves us with the situation that currently, the only
implementations with F_STATUS support are qemu and dpdk, which both
handle it incorrectly.
I was going to say there is also the rust-vmm vhost-user-master crates
which we've imported:

   https://github.com/vireshk/vhost

for the Xen Vhost Frontend:

   https://github.com/vireshk/xen-vhost-frontend

but I can't actually see any handling for GET/SET_STATUS at all which
makes me wonder how we actually work. Viresh?

As far as I know the only back-end implementation of F_STATUS is in DPDK.  As I said, if anyone else implemented it right now, that would be dangerous, because qemu doesn’t adhere to the virtio protocol when it comes to the status byte.

Furthermore, the specification leaves much to
be desired, specifically in how F_STATUS interacts with other
vhost-user commands (which is something I cited as a reason for my
original patch), i.e. whether RESET_DEVICE and SET_STATUS 0 are
equivalent, and whether failures in feature negotiation must result in
both SET_FEATURES returning an error (with F_REPLY_ACK), and
FEATURES_OK being reset in the status byte, or whether either is
sufficient.  What happens when DEVICE_NEEDS_RESET is set, i.e. do we
just need RESET_DEVICE / SET_STATUS 0, or do we also need to reset
some protocol state?  (This is also connected to the fact that what
happens on RESET_DEVICE is largely undefined, which I said on Stefan’s
series.)
I'm all for strengthening the vhost-user protocol definitions. I'm just
wary of encoding QEMU<->backend implementation details.

In general, because we have our own transport, we should make a note
how it interacts with the status negotiation phases, i.e. that
GET_FEATURES must not be called before S_ACKNOWLEDGE | S_DRIVER are
set, that FEATURES_OK must be set after the SET_FEATURES call, and
that DRIVER_OK must not be set without FEATURES_OK set / SET_FEATURES
having returned success.  Here we would also answer the question about
the interaction of F_REPLY_ACK+SET_FEATURES with F_STATUS,
specifically whether an implementation with F_REPLY_ACK even needs to
read back the status byte after setting FEATURES_OK because it could
have got the feature negotiation result already as a result of the
SET_FEATURES call.
Some sequence diagrams would remove a lot of the ambiguity from parsing
the words. I wonder if there is a pretty way to do that to render nicely
in our published docs?

I’m sure some form of SVG will work.  Somehow.  If not, it should. :)

After migration, can you just set all flags immediately or do we need
to follow this step-by-step protocol?  I think we do need to do it
step-by-step, mostly for simplicity in the back-end, i.e. that it just
sees a normal device start-up.
Makes sense.

We should also clarify whether SET_STATUS can fail, i.e. whether
setting an invalid status (is setting FEATURES_OK when the device
doesn’t think so invalid?) has SET_STATUS fail (with F_REPLY_ACK)
and/or immediately gets the device into DEVICE_NEEDS_RESET.

We should clarify whether SET_STATUS can block.  The current use of
DRIVER_OK seems to indicate to me that dpdk does do time-consuming
operations when it sees DRIVER_OK (code looks like it, too) and only
returns when that’s done, but naïvely, I would expect SET_STATUS to be
just setting some value and doing whatever needs to be done in the
background, not actually launching and blocking on an operation.
Shouldn't the guest driver be reading the status bit until it flips? So
potentially there could be multiple GET_STATUS calls.

Ah, the device will only show DRIVER_OK set once the device is ready to serve the driver?

Hanna


Reply via email to