On Tue, Jun 27, 2023 at 11:17 AM Parav Pandit <[email protected]> wrote:
>
>
>
> > From: Jason Wang <[email protected]>
> > Sent: Monday, June 26, 2023 10:38 PM
>
> > > >
> > > Because there is no point of failing it later when actual rw occurs.
> >
> > Even with admin virtqueue, the hypervisor should be ready for any admin
> > command failures somehow.
> >
> Yes, device failures can happen regardless.
>
> > And with modern devices, the reset is not a must, the hypervisor can just
> > read
> > device_features to check if _MAC/_HDR is there.
>
> >
> I read the spec differently as,
> The driver MUST follow this sequence to initialize a device:
> 1. Reset the device.
It's not the initialization, it's a probing of supported features.
>
> > And what's more important,
> > the hypervisor can know if the feature negotiation succeeds or not. With
> > legacy, there's no such way.
> Again, we are not fixing the legacy here.
I mean, it's not worse than failing silently in the legacy case so we
don't need to care.
>
> > >
> > > > > 1. reset the device
> > > > > 2. set ACK and DRIVER bit
> > > > > 3. read features and make sure _MAC, _HDR are supported.
> > > >
> > > > Similar step is required even for AQ since the hypervisor needs to
> > > > know if legacy commands are supported there.
> > > >
> > > No. similar steps are not required on AQ because AQ support discovery is
> > done once per owner device. Not for each member device.
> >
> > This seems a topic beyond legacy. If both modern and transitional devices
> > are
> > supported, a per device probing or provisioning is still required. And if
> > provisioning is supported, there's no need to probe MAC/HDR.
> >
> That is possible to rely on the AQ.
> > >
> > > > > 4. if not abort.
> > > > > 5. reset the device 2nd time so that guest can have right view of
> > > > > things.
> > > > >
> > > > > On guest access of legacy area:
> > > > > 6. reset the device
> > > > > 7. set ACK and DRIVER bit on guest request 8. read features and
> > > > > make sure _MAC, _HDR are offered
> > > > > 8.1 Hope for the best that on two completely unrelated device
> > > > > reset on #1
> > > > and #6, the device offers exact same feature bits.
> > > >
> > > > Explained and discussed in another thread, this is a hard
> > > > requirement otherwise a lot of things would break.
> > > >
> > > That limitation is for the past or current.
> >
> > You need at least a new feature bit.
> >
> Not needed, current mechanism is just fine as there is member device driver
> to owner communication.
>
> > > As we move forward there is no need to continue with such limitation.
> > > There may be more work needed to overcome that limitation, but we don’t
> > want to build on top of that limitation.
> >
> > Let's open a dedicated thread for this if necessary.
> >
> Ok. We don’t want to have that assumption and build interface.
>
> > >
> > > > > And mention this in the spec 1.x for rest of the future.
> > > > >
> > > > > We shouldn't be adding such a limitation to the spec.
> > > > > We have seen this with mlx5 device where 5 years ago one thought
> > > > > this
> > > > cannot happen.
> > > > > And now with modern use case now features changes across two
> > > > > resets for
> > > > mlx5 device.
> > > > > Baking such limitation in current spec for past 1.x is sub-optimal.
> > > > >
> > > > > ok, so to avoid baking such reset flow nasty things in spec, lets
> > > > > avoid flow of
> > > > #1 to #5 in hypervisor.
> > > > > So, provision the device to support these new feature bits via AQ.
> > > > > So AQ is required for feature provisioning anyway.
> > > > >
> > > > > So, your point #4 is required in both methods and so it scores
> > > > > same as this
> > > > proposal.
> > > > > Hence feature bit is not of an advantage.
> > > > >
> > > > > With _MAC now we need writable mac on 1.x config space.
> > > > > for 1.x writable mac has two requirements.
> > > > > 1. It must be atomic (not for 0.9.5, but must for 1.x like CVQ)
> > > >
> > > > For LEGACY_MAC, we don't need to be atomic. We manage to survive for
> > > > several years.
> > > >
> > > It was cited as issue in past discussed when you and Michael both asked to
> > consider tvq.
> > > What changed now?
> >
> > Nothing changed, what I meant is: with admin virtqueue it could be atomic or
> > not.
> >
> I don’t know why you keep circling back on it.
>
> Why do you ask below "how atomicity is achieved if it is not important?"
>
> > >
> > > The issue is that device hw needs implement special RW registers now for
> > > the
> > mac for legacy.
> > > This is the unwanted burden not needed, where AQ is better.
> >
> > How can you solve the atomic issue in this case? The issue is the driver
> > where
> > legacy guests write to device configuration space byte by byte.
> You contradict yourself, above you say that we managed, here you say its
> issue.
>
> It does write byte by byte, a hypervisor can collect 6 bytes and write all
> together in one go.
>
> It seems that now the past issue of atomicity was raised and solved with AQ,
> you start to oppose that it...
Let me clarify:
Let's first look at the ABI solely:
+struct virtio_admin_cmd_ld_reg_wr_data {
+ u8 offset; /* Starting byte offset of the device register(s) to write */
+ u8 register[];
+};
For accessing mac, it could be atomic or not.
Hypervisor can try to collect 6 bytes for atomicy but again it's best
effort and may (for Linux) or may not work for all kinds of guests.
>
> > Modern guests will use cvq.
> >
> Sure.
>
> > >
> > > > > 2. should be synchronous to know success/failure to know when it
> > > > > is effective
> > > >
> > > > It's as simple as allowing mac in the device configuration space
> > > > writable.
> > > >
> > > Not simple for hw at scale.
> >
> > I don't understand, it uses modern device configuration space. How does it
> > matter with the scale of admin virtqueue?
> >
> The new feature bit demands mac address in config space to be read write.
> Modern device doesnt have it as RW.
> This burden is unnecessary to carry forward.
How much burden if we just make 6 byte RW per device?
> An AQ eliminates such heavy device side implementation.
>
> > > AQ is superior that communicates rarely used information.
> > >
> > > > >
> > > > > Both are present on the CVQ, so yet add another duplicate 1.x
> > > > > scheme that
> > > > fulfill above requirements.
> > > >
> > > > I'm not sure how to define duplication, the legacy device
> > > > configuration space access via admin virtqueue is kind of a duplication
> > > > in
> > this sense as well.
> > > >
> > > It only exists on AQ for legacy purpose.
> > > The feature bit defined is not for legacy, hence its duplicated.
> >
> > It's for legacy mediation and what's more, it allows mac to be configured
> > without cvq.
> >
> This fundamental thing is not a gain, but a negative aspect to complicate the
> device PCI.
Is this minor stuff more complicated than the entire admin virtqueue stuff?
The point is methodology, mediation on top of modern devices is
simpler, and a lot of other methods could be used if LEGACY_MAC
doesn't suffice.
> It even contradicts for current spec definition " Device configuration space
> is generally used for rarely-changing or initialization-time parameters".
I don't see how, mac is rarely changed, and for blocks we have
writeable stuffs like wce as well.
>
> > > I don’t see how that is avoided when one side has undefined corner case
> > > that
> > is translated to well defined behavior.
> >
> > Let me give you an example. The legacy device behaviour is basically what
> > Qemu codes did, by reusing the existing Qemu virtio device model codes, most
> > of the issues can be avoided.
> >
> You are welcome to document them in legacy interface section.
Well, you can't describe the behaviour of several kilo lines of code precisely.
>
> > > It only means that piece will be in endless errata in sw.
> >
> > Errata in software is much more simpler/cheaper than hardware, isn't it?
> >
> AQ has ability to implement things in less hw centric way.
> And sw when necessary can always decide which AQ commands to use.
> So it is more flexible.
A lot of examples have been demonstrated, software erratas is much
more flexible. It's really common to workaround hardware errata in
kernel or BIOS.
>
> > > It still scores same or less.
> > >
> > > > >
> > > > > For legacy there is no extra/special documentation.
> > > > > All the behavior listed in Legacy interfaces section for
> > > > > configuration register
> > > > present applies here.
> > > >
> > > > That's not true, the legacy section is the best effort since it's
> > > > too late to define (codes came before the spec).
> > > >
> > > One can added the missing documentation...
> >
> > It will be too late for the legacy, and it would be too hard since it
> > basically tries
> > to describe the exact code behaviour. And there could be a lot of very hard
> > questions that need to be answered. E.g what happens if a driver tries to
> > use
> > both modern and legacy interfaces?
> >
> It is no different than what is defined already in transitional device.
Can you show me where the spec answers this question?
Thanks
> No point repeating same questions again, please.
>
>
> > >
> > >
> > > > >
> > > > > So, I don't see mac support is trivial by any means compared to
> > > > > proposed
> > > > scheme for 1.x.
> > > > > Hence, comparing trivialness of two solutions seems same for your
> > > > > point #1
> > > > once you do mac plumbing.
> > > > >
> > > > > Regarding #2 on nesting. I won't claim I understand this as your
> > > > > level of
> > > > knowledge.
> > > > > If you are sauing only the VF is in VM as virtio PCI device to
> > > > > supporting nested
> > > > guest, that doesn't have AQ, hence it doesn't work due above issues.
> > > >
> > > > This is the most common setup, no?
> > > >
> > > Depends on who see it.
> > > We don’t see common setup for legacy or otherwise.
> > >
> > > > The major issues are the transport function is not self contained in
> > > > a device (function). This will cause a lot of trouble when trying to
> > > > implement nesting. E.g you need to assign or emulate SR-IOV from L1
> > > > to
> > > > L(N-1) in order to let L(N) to work.
> > > >
> > > Those who do nesting usually do assign the PF as well.
> >
> > This is not what I was told and it's not flexible.
> One food type does not solve hunger for whole world.
> So one can come up with more elaborate solution as time progresses, we are
> hoping that legacy doesnt reach to that point.
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]