Jonathan Cameron <jonathan.came...@huawei.com> writes: > On Tue, 04 Jun 2024 11:18:18 +0200 > Markus Armbruster <arm...@redhat.com> wrote: > >> I finally got around to read this slowly. Thank you, Fan and Jonathan! >> >> I'm getting some "incomplete" vibes: "if we ever implement", "patches >> for this on list", "we aren't emulating this yet at all", and ... > > Absolutely. There is a bunch of stuff that we reject today but > the interfaces allow you to specify it and align with the CXL specification > Fabric Management API definition which is the spec used to control this > stuff from a BMC etc. If that doesn't work we have a hardware errata > problem, so hopefully that is fairly stable. > > I think I can publicly confirm there are some related errata in flight, > seeing as we said we'd raise questions on some aspects in the kernel and > QEMU series preceding this one (so no IP secrecy issues). These are as a > result of this work from Fan, but we have carefully avoided implementing > anything that 'may' change. > > >> >> Jonathan Cameron <jonathan.came...@huawei.com> writes: >> >> [...] >> >> > Only thing I'd add is that for now (because we don't need it for testing >> > the kernel flows) is that this does not provide any way for external >> > agents (e.g. our 'fabric manager' to find out what the state is - i.e. >> > if the extents have been accepted by the host etc). That stuff is all >> > defined by the spec, but not yet in the QMP interface. At somepoint >> > we may want to add that as a state query type interface. >> >> ... this, too. >> >> In review of v5, I asked whether this interface needs to be stable. >> >> "Not stable" doesn't mean we change an interface without thought. It >> merely means we can change it much, much faster, and with much less >> overhead. >> >> I understand you want it chiefly for CXL development. Development aids >> commonly don't need to be stable. > > Ok. If it makes sense to make this unstable for now I'm fine with that. > I don't see why it would change beyond in backwards compatible fashion > with new optional fields to cover future specification updates allowing > for more information. However I've been wrong on such things before. > > So I'm fine adding a patch on top of v8 marking them unstable for now.
I'd squash it into v8, but follow-up patch works for me, too. >> If you're aiming for stable, you need to convince us the interface can >> support the foreseeable purposes without incompatible changes. In >> particular, I'd like to see how you intend to support "finding out what >> the state is". I suspect that's related to my question in review of v8: >> how to detect completion, and maybe track progress. > > There is a need for completion information but given it's completely > asynchronous to the commands defined here (Can be out of order, lots of > ongoing capacity add / remove flows in flight etc) and for the hardware > definition the feedback occurs via an event record log I'm not expecting it > to affect the interfaces added so far. > >> >> There's infrastructure for background jobs: job.json. We might be >> better off using it, unless completion is trivial and progress tracking >> unnecessary. > > I'll take a look, but these are not conventional background commands > (We do have those as well, but that's a whole different set of future > problems!) > > The actual command itself completes synchronously but not the flow > it kicked off which is not background as such because it may never > finish and involves lots of moving parts. This is similar to any > form of error injection. We inject the error synchronously and that > creates a bunch of records in various registers / firmware etc but > the actions the host OS takes are asynchronous and may only happen > when it decides to poll some register or a driver loads much later. > > So I'm not sure if job.json approach fits. Neither am I, but I want you to be aware of it, so you can make an informed decision :) >> [...]