On Tue, 04 Jun 2024 11:18:18 +0200 Markus Armbruster <arm...@redhat.com> wrote:
> I finally got around to read this slowly. Thank you, Fan and Jonathan! > > I'm getting some "incomplete" vibes: "if we ever implement", "patches > for this on list", "we aren't emulating this yet at all", and ... Absolutely. There is a bunch of stuff that we reject today but the interfaces allow you to specify it and align with the CXL specification Fabric Management API definition which is the spec used to control this stuff from a BMC etc. If that doesn't work we have a hardware errata problem, so hopefully that is fairly stable. I think I can publicly confirm there are some related errata in flight, seeing as we said we'd raise questions on some aspects in the kernel and QEMU series preceding this one (so no IP secrecy issues). These are as a result of this work from Fan, but we have carefully avoided implementing anything that 'may' change. > > Jonathan Cameron <jonathan.came...@huawei.com> writes: > > [...] > > > Only thing I'd add is that for now (because we don't need it for testing > > the kernel flows) is that this does not provide any way for external > > agents (e.g. our 'fabric manager' to find out what the state is - i.e. > > if the extents have been accepted by the host etc). That stuff is all > > defined by the spec, but not yet in the QMP interface. At somepoint > > we may want to add that as a state query type interface. > > ... this, too. > > In review of v5, I asked whether this interface needs to be stable. > > "Not stable" doesn't mean we change an interface without thought. It > merely means we can change it much, much faster, and with much less > overhead. > > I understand you want it chiefly for CXL development. Development aids > commonly don't need to be stable. Ok. If it makes sense to make this unstable for now I'm fine with that. I don't see why it would change beyond in backwards compatible fashion with new optional fields to cover future specification updates allowing for more information. However I've been wrong on such things before. So I'm fine adding a patch on top of v8 marking them unstable for now. > > If you're aiming for stable, you need to convince us the interface can > support the foreseeable purposes without incompatible changes. In > particular, I'd like to see how you intend to support "finding out what > the state is". I suspect that's related to my question in review of v8: > how to detect completion, and maybe track progress. There is a need for completion information but given it's completely asynchronous to the commands defined here (Can be out of order, lots of ongoing capacity add / remove flows in flight etc) and for the hardware definition the feedback occurs via an event record log I'm not expecting it to affect the interfaces added so far. > > There's infrastructure for background jobs: job.json. We might be > better off using it, unless completion is trivial and progress tracking > unnecessary. I'll take a look, but these are not conventional background commands (We do have those as well, but that's a whole different set of future problems!) The actual command itself completes synchronously but not the flow it kicked off which is not background as such because it may never finish and involves lots of moving parts. This is similar to any form of error injection. We inject the error synchronously and that creates a bunch of records in various registers / firmware etc but the actions the host OS takes are asynchronous and may only happen when it decides to poll some register or a driver loads much later. So I'm not sure if job.json approach fits. > > [...] > >