On 2021-10-29 17:51, Jerin Jacob wrote:
> On Fri, Oct 29, 2021 at 5:27 PM Mattias Rönnblom
> <mattias.ronnb...@ericsson.com> wrote:
>> On 2021-10-25 11:03, Jerin Jacob wrote:
>>> On Mon, Oct 25, 2021 at 1:05 PM Mattias Rönnblom
>>> <mattias.ronnb...@ericsson.com> wrote:
>>>> On 2021-10-19 20:14, jer...@marvell.com wrote:
>>>>> From: Jerin Jacob <jer...@marvell.com>
>>>>>
>>>>> Dataplane Workload Accelerator library
>>>>> ======================================
>>>>>
>>>>> Definition of Dataplane Workload Accelerator
>>>>> --------------------------------------------
>>>>> A Dataplane Workload Accelerator (DWA) typically contains a set of
>>>>> CPUs, network controllers, and programmable data acceleration
>>>>> engines for packet processing, cryptography, regex engines, baseband
>>>>> processing, etc. This allows the DWA to offload compute/packet
>>>>> processing/baseband/cryptography-related workloads from the host CPU
>>>>> to save cost and power, and enables scaling the workload by adding
>>>>> DWAs to the host CPU as needed.
>>>>>
>>>>> Unlike other devices in DPDK, the DWA device is not fixed-function,
>>>>> due to the fact that it has CPUs and programmable HW accelerators.
>>>> There are already several instances of DPDK devices with pure-software
>>>> implementations. In this regard, a DPU/SmartNIC represents nothing
>>>> new. What's new, it seems to me, is a much-increased need to
>>>> configure/arrange the processing in complex manners, to avoid bouncing
>>>> everything to the host CPU.
>>> Yes and no. It will be based on the profile. The TLV type
>>> TYPE_USER_PLANE will carry user plane traffic from/to the host. For
>>> example, when offloading an ORAN split 7.2 baseband profile, transport
>>> blocks are sent to/from the host as TYPE_USER_PLANE.
>>>
>>>> Something like P4 or rte_flow-based hooks or
>>>> some other kind of extension. The eventdev adapters solve the same
>>>> problem (where on some systems packets go through the host CPU on
>>>> their way to the event device, and others do not) - although on a
>>>> *much* smaller scale.
>>> Yes. Eventdev adapters are only for event device plumbing.
>>>
>>>> "Not-fixed function" seems to call for more hot plug support in the
>>>> device APIs. Such functionality could then be reused by anything that
>>>> can be reconfigured dynamically (FPGAs, firmware-programmed
>>>> accelerators, etc.),
>>> Yes.
>>>
>>>> but which may not be able to serve as an RPC
>>>> endpoint, like a SmartNIC.
>>> It can. That's the reason for choosing TLVs: any higher-level language
>>> can use a TLV library like https://github.com/ustropo/uttlv to
>>> communicate with the accelerator. TLVs follow a request and response
>>> scheme like RPC, so the application can wrap them if needed.
>>>
>>>> DWA could be some kind of DPDK-internal framework for managing certain
>>>> types of DPUs, but should it be exposed to the user application?
>>> Could you clarify a bit more?
>>> The offload is represented as a set of TLVs in a generic fashion; there
>>> is no DPU-specific bit in the offload representation. See the
>>> rte_dwa_profile_l3fwd.h header file.
>>
>> It seems a bit cumbersome to work with TLVs on the user application
>> side. Would it be an alternative to have the profile API as a set of C
>> APIs instead of a TLV-based messaging interface?
>> The underlying implementation could still - in many or all cases - be
>> TLVs sent over some appropriate transport.

> The reason to pick TLVs is as follows:
>
> 1) Very easy to enable ABI compatibility. (Learned from rte_flow.)

Do you include the TLV-defined profile interface in "ABI"? Or do you, by
ABI, only mean the C ABI used to send/receive TLVs? To me, the former
makes the most sense, since changing the profile will break binary
compatibility with then-existing applications.

> 2) If it needs to be transported over a network etc., it needs to be
> packed, and that is easy for an implementation to do with TLVs. It also
> gives better performance in such cases, by avoiding reformatting or
> possibly avoiding memcpy etc.

My question was not "why TLVs", but the more specific "why are TLVs
exposed to the user application?". I find it likely that user
applications are going to wrap the TLV serialization and
de-serialization into their own functions.

> 3) It is easy to plug in another high-level programming language, as it
> is just one API.

Makes sense. One note though: the transport is just one API, but then
each profile makes up an API as well, although it's not C, but
TLV-based.

> 4) Easy to decouple the DWA core library functionality from the
> profiles.
> 5) Easy to enable an asynchronous scheme using request and response
> TLVs.
> 6) Most importantly, we could introduce a type notion with TLVs
> (connected with the type of message; see TYPE_ATTACHED, TYPE_STOPPED,
> TYPE_USER_PLANE etc.). That way, we can have a uniform outlook across
> profiles, instead of each profile coming with a set of its own APIs and
> __rules__ on the state machine. I think that, for a framework to
> leverage communication mechanisms and other aspects between profiles,
> it's important to have some synergy between profiles.
>
> Yes, I agree that a bit more logic is required on the application side
> to use TLVs, but I think we can have a wrapper function taking request
> and response structures.

Do you think ethdev, eventdev, cryptodev and the other DPDK APIs would
have been better off as TLV-based messaging interfaces as well? From a
user point of view, I'm not sure I see what's so special about talking
to a SmartNIC compared to functions implemented in a GPU, an FPGA, a
fixed-function ASIC, a large array of garden gnomes, or in some other
manner. More functionality and more need for asynchronicity (if that's
a word), maybe.

>> Such a C API could still be asynchronous, and still be a profile API
>> (rather than a set of new DPDK device types).
>>
>> What I tried to ask during the meeting, but where I didn't get an
>> answer (or at least one that I could understand), was how the profiles
>> were to be specified and/or documented. Maybe the above is what you had
>> in mind already.

> Yes. Documentation is easy; please check the RFC header file for the
> Doxygen meta used to express all the attributes of a TLV.
>
> +enum rte_dwa_port_host_ethernet {
> +	/**
> +	 * Attribute | Value
> +	 * ----------|--------
> +	 * Tag       | RTE_DWA_TAG_PORT_HOST_ETHERNET
> +	 * Stag      | RTE_DWA_STAG_PORT_HOST_ETHERNET_H2D_INFO
> +	 * Direction | H2D
> +	 * Type      | TYPE_ATTACHED
> +	 * Payload   | NA
> +	 * Pair TLV  | RTE_DWA_STAG_PORT_HOST_ETHERNET_D2H_INFO
> +	 *
> +	 * Request DWA host ethernet port information.
> +	 */
> +	RTE_DWA_STAG_PORT_HOST_ETHERNET_H2D_INFO,
> +	/**
> +	 * Attribute | Value
> +	 * ----------|---------
> +	 * Tag       | RTE_DWA_TAG_PORT_HOST_ETHERNET
> +	 * Stag      | RTE_DWA_STAG_PORT_HOST_ETHERNET_D2H_INFO
> +	 * Direction | D2H
> +	 * Type      | TYPE_ATTACHED
> +	 * Payload   | struct rte_dwa_port_host_ethernet_d2h_info
> +	 * Pair TLV  | RTE_DWA_STAG_PORT_HOST_ETHERNET_H2D_INFO
> +	 *
> +	 * Response for DWA host ethernet port information.
> +	 */
> +	RTE_DWA_STAG_PORT_HOST_ETHERNET_D2H_INFO,

Thanks for the pointer.
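
To make the wrapping point concrete, below is a rough sketch of the kind
of helper I would expect an application to end up writing around this
particular request/response pair. The TLV container struct and the
synchronous exchange call are made-up placeholders for whatever the
library actually provides; only the tag/stag names and the
struct rte_dwa_port_host_ethernet_d2h_info payload are taken from the
header quoted above.

/*
 * Illustrative sketch only. The RFC's profile header (providing the
 * RTE_DWA_TAG_* and RTE_DWA_STAG_* values and
 * struct rte_dwa_port_host_ethernet_d2h_info) is assumed to be
 * included.
 */
#include <stdint.h>
#include <stddef.h>

/* Placeholder TLV container; not the RFC's struct. */
struct app_dwa_tlv {
	uint32_t tag;     /* e.g. RTE_DWA_TAG_PORT_HOST_ETHERNET */
	uint32_t stag;    /* e.g. RTE_DWA_STAG_PORT_HOST_ETHERNET_H2D_INFO */
	uint32_t len;     /* payload length in bytes */
	void *payload;    /* NULL for a payload-less request */
};

/* Placeholder for a synchronous "send request, wait for the paired
 * response" call; the real transport may well be asynchronous. */
int app_dwa_xchg(void *dwa, const struct app_dwa_tlv *req,
		 struct app_dwa_tlv *rsp);

/* The plain C wrapper the application would likely write anyway. */
static int
app_host_ethernet_info(void *dwa,
		       struct rte_dwa_port_host_ethernet_d2h_info *info)
{
	struct app_dwa_tlv req = {
		.tag = RTE_DWA_TAG_PORT_HOST_ETHERNET,
		.stag = RTE_DWA_STAG_PORT_HOST_ETHERNET_H2D_INFO,
		.len = 0,         /* the request TLV has no payload */
		.payload = NULL,
	};
	struct app_dwa_tlv rsp = {
		.tag = RTE_DWA_TAG_PORT_HOST_ETHERNET,
		.stag = RTE_DWA_STAG_PORT_HOST_ETHERNET_D2H_INFO,
		.len = sizeof(*info),
		.payload = info,
	};

	return app_dwa_xchg(dwa, &req, &rsp);
}

If every consumer ends up carrying code like this, the framework might
as well provide it.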

It would make sense to have a machine-readable schema, so you can
generate the (in my view) inevitable wrapper code, much like gRPC does
with protobuf, or Sun RPC with XDR. Why not use protobuf and its IDL to
specify the interface?
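
To illustrate: the attribute tables above already carry everything such
a generator would need. For each TLV pair it could emit something along
the lines of the prototype below, keeping the TLV encoding behind a
profile-level C API. The handle type and function name here are purely
hypothetical, just to show the shape of the generated code.

/*
 * Hypothetical generator output for the PORT_HOST_ETHERNET info TLV
 * pair; the handle type and function name are made up for the sake of
 * the example.
 */
int
rte_dwa_port_host_ethernet_info(void *dwa_handle,
		struct rte_dwa_port_host_ethernet_d2h_info *info);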