Re: [dpdk-dev] [RFC PATCH 0/1] Dataplane Workload Accelerator library

Mattias Rönnblom Fri, 29 Oct 2021 04:57:26 -0700

On 2021-10-25 11:03, Jerin Jacob wrote:
> On Mon, Oct 25, 2021 at 1:05 PM Mattias Rönnblom
> <mattias.ronnb...@ericsson.com> wrote:
>> On 2021-10-19 20:14, jer...@marvell.com wrote:
>>> From: Jerin Jacob <jer...@marvell.com>
>>>
>>>
>>> Dataplane Workload Accelerator library
>>> ======================================
>>>
>>> Definition of Dataplane Workload Accelerator
>>> --------------------------------------------
>>> A Dataplane Workload Accelerator (DWA) typically contains a set of CPUs,
>>> network controllers, and programmable data acceleration engines for
>>> packet processing, cryptography, regex matching, baseband processing, etc.
>>> This allows a DWA to offload compute, packet processing, baseband, and
>>> cryptography-related workloads from the host CPU to save cost and power.
>>> It also allows scaling the workload by adding DWAs to the host CPU as
>>> needed.
>>>
>>> Unlike other devices in DPDK, the DWA device is not fixed-function
>>> due to the fact that it has CPUs and programmable HW accelerators.
>>
>> There are already several instances of DPDK devices with pure-software
>> implementation. In this regard, a DPU/SmartNIC represents nothing new.
>> What's new, it seems to me, is a much-increased need to
>> configure/arrange the processing in complex manners, to avoid bouncing
>> everything to the host CPU.
> Yes and No. It will be based on the profile. The TLV type TYPE_USER_PLANE will
> have user plane traffic from/to host. For example, offloading ORAN split 7.2
> baseband profile. Transport blocks sent to/from host as TYPE_USER_PLANE.
>
>> Something like P4 or rte_flow-based hooks or
>> some other kind of extension. The eventdev adapters solve the same
>> problem (where on some systems packets go through the host CPU on their
>> way to the event device, and others do not) - although on a *much*
>> smaller scale.
> Yes. Eventdev adapters are only for event device plumbing.
>
>
>>
>> "Not-fixed function" seems to call for more hot plug support in the
>> device APIs. Such functionality could then be reused by anything that
>> can be reconfigured dynamically (FPGAs, firmware-programmed
>> accelerators, etc.),
> Yes.
>
>> but which may not be able to serve as a RPC
>> endpoint, like a SmartNIC.
> It can. That's the reason for choosing TLVs, so that
> any higher-level language can use a TLV library such as
> https://github.com/ustropo/uttlv
> to communicate with the accelerator. TLVs follow a request and
> response scheme like RPC, so it can be wrapped under the application if needed.
>
>>
>> DWA could be some kind of DPDK-internal framework for managing certain
>> type of DPUs, but should it be exposed to the user application?
>
> Could you clarify a bit more.
> The offload is represented as a set of TLVs in generic fashion. There
> is no DPU specific bit in offload representation. See
> rte_dwa_profile_l3fwd.h header file.



It seems a bit cumbersome to work with TLVs on the user application 
side. Would it be an alternative to have the profile API as a set of C 
APIs instead of a TLV-based messaging interface? The underlying 
implementation could still, in many or all cases, be TLVs sent over 
some appropriate transport.


Such a C API could still be asynchronous, and still be a profile API 
(rather than a set of new DPDK device types).
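
As a rough sketch of that idea, a typed C function could hide the TLV
encoding from the application entirely. Everything below is made up for
illustration and is not part of the RFC: the function name, the tag
values, and the big-endian wire layout are all assumptions.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Made-up tag values and wire layout, for illustration only. */
enum { TAG_PROFILE_L3FWD = 0x0010, STAG_H2D_LOOKUP_ADD = 0x0003 };
#define TLV_HDR_SZ      8u  /* 16-bit tag, 16-bit subtag, 32-bit length */
#define ROUTE_V4_MSG_SZ 7u  /* 32-bit address, 8-bit depth, 16-bit port */

static void put_u16(uint8_t *p, uint16_t v)
{
    p[0] = (uint8_t)(v >> 8);
    p[1] = (uint8_t)(v & 0xff);
}

static void put_u32(uint8_t *p, uint32_t v)
{
    put_u16(p, (uint16_t)(v >> 16));
    put_u16(p + 2, (uint16_t)(v & 0xffff));
}

/*
 * Typed entry point: the caller passes plain C arguments and never sees a
 * TLV. The request is serialized big-endian into buf. Returns the number
 * of bytes written, or 0 if depth is out of range or buf is too small.
 */
static size_t
l3fwd_route_add_v4_encode(uint8_t *buf, size_t buf_sz, uint32_t ip_dst,
                          uint8_t depth, uint16_t port_dst)
{
    assert(buf != NULL);
    if (depth > 32 || buf_sz < TLV_HDR_SZ + ROUTE_V4_MSG_SZ)
        return 0;

    put_u16(buf, TAG_PROFILE_L3FWD);
    put_u16(buf + 2, STAG_H2D_LOOKUP_ADD);
    put_u32(buf + 4, ROUTE_V4_MSG_SZ);
    put_u32(buf + 8, ip_dst);
    buf[12] = depth;
    put_u16(buf + 13, port_dst);
    return TLV_HDR_SZ + ROUTE_V4_MSG_SZ;
}
```

With something along these lines, the application deals with addresses,
prefix depths, and ports only; whether the resulting bytes travel over
PCIe DMA, shared memory, or Ethernet stays an implementation detail.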


What I tried to ask during the meeting, but where I didn't get an answer 
(or at least not one that I could understand), was how the profiles were 
to be specified and/or documented. Maybe the above is what you had in 
mind already.


> TB hosted a meeting for this at Date: Wednesday, October 27th Time:
> 3pm UTC, https://meet.jit.si/DPDK
> Feel free to join.
>
>
>>
>>> This enables DWA personality/workload to be completely programmable.
>>> Typical examples of DWA offloads are Flow/Session management,
>>> Virtual switch, TLS offload, IPsec offload, l3fwd offload, etc.
>>>
>>>
>>> Motivation for the new library
>>> ------------------------------
>>> Even though many semiconductor vendors offer some form of DWA,
>>> such as a DPU (often called a SmartNIC), GPU, IPU, XPU, etc.,
>>> the lack of standard APIs to "define the workload" and to
>>> "communicate between host and DWA" makes it difficult for DPDK
>>> consumers to use them in a portable way across different DWA vendors
>>> and to enable them in cloud environments.
>>>
>>>
>>> Contents of RFC
>>> ---------------
>>> This RFC attempts to define standard APIs for:
>>>
>>> 1) Definition of profiles corresponding to well-defined workloads. A
>>>      profile includes a set of TLVs (messages) in a request and response
>>>      scheme that defines the contract between host and DWA to offload a
>>>      workload. (See lib/dwa/rte_dwa_profile_* header files)
>>> 2) Discovery of a DWA's capabilities (e.g. which specific workloads it
>>>      can support) in a vendor-independent fashion.
>>>      (See rte_dwa_dev_disc_profiles())
>>> 3) Attaching a set of profiles to a DWA device (See rte_dwa_dev_attach())
>>> 4) A communication framework between host and DWA (See rte_dwa_ctrl_op()
>>>      for control plane and rte_dwa_port_host_* for user plane)
>>> 5) Virtualization of DWA hardware and firmware (Use standard DPDK
>>>      device/bus model)
>>> 6) Enablement of administrative functions such as FW updates and
>>>      resource partitioning in a DWA, i.e. items that are global in
>>>      nature and apply to all DWA devices under the DWA.
>>>      (See rte_dwa_profile_admin.h)
>>>
>>> Also, this RFC defines the L3FWD profile to offload the L3FWD workload to
>>> a DWA, and an ethernet-style host port for host to DWA communication.
>>> Different host port types may be required to cover the large spectrum of
>>> DWA types, as transports like PCIe DMA, shared memory, or Ethernet are
>>> fundamentally different, and optimal performance needs host port specific
>>> APIs.
>>>
>>> The framework does not force an abstraction of the different transport
>>> interfaces into a single API. Instead, it decouples the TLVs from the
>>> transport interface, focuses on defining the TLVs, and leaves vendors to
>>> specify the host ports specific to their DWA architecture.
>>>
>>>
>>> Roadmap
>>> -------
>>> 1) Address the comments for this RFC and enable the common code
>>> 2) SW drivers/infrastructure for `DWA` and `DWA device`
>>> as two separate DPDK processes over `memif` DPDK ethdev driver for
>>> L3FWD offload. This is to enable the framework without any special HW.
>>> 3) Example DWA device application for L3FWD profile.
>>> 4) Marvell DWA Device drivers.
>>> 5) Based on community interest, new profiles can be added in the future.
>>>
>>>
>>> DWA library framework
>>> ---------------------
>>>
>>> DWA components:
>>>
>>>                                                     +--> rte_dwa_port_host_*()
>>>                                                     |  (User Plane traffic as TLV)
>>>                                                     |
>>>                    +----------------------+         |   +--------------------+
>>>                    |                      |         |   | DPDK DWA Device[0] |
>>>                    |  +----------------+  |  Host Port  | +----------------+ |
>>>                    |  |                |  |<========+==>| |                | |
>>>                    |  |   Profile 0    |  |             | |   Profile X    | |
>>>                    |  |                |  |             | |                | |
>>>     <=============>|  +----------------+  | Control Port| +----------------+ |
>>>       DWA Port0    |  +----------------+  |<========+==>|                    |
>>>                    |  |                |  |         |   +--------------------+
>>>                    |  |   Profile 1    |  |         |
>>>                    |  |                |  |         +--> rte_dwa_ctrl_op()
>>>                    |  +----------------+  |         (Control Plane traffic as TLV)
>>>     <=============>|      Dataplane       |
>>>       DWA Port1    |      Workload        |
>>>                    |      Accelerator     |             +--------------------+
>>>                    |      (HW/FW/SW)      |             | DPDK DWA Device[N] |
>>>                    |                      |  Host Port  | +----------------+ |
>>>     <=============>|  +----------------+  |<===========>| |                | |
>>>       DWA PortN    |  |                |  |             | |   Profile Y    | |
>>>                    |  |    Profile N   |  |             | |           ^    | |
>>>                    |  |                |  | Control Port| +-----------|----+ |
>>>                    |  +-------|--------+  |<===========>|             |      |
>>>                    |          |           |             +-------------|------+
>>>                    +----------|-----------+                           |
>>>                               |                                       |
>>>                               +---------------------------------------+
>>>                                                          ^
>>>                                                          |
>>>                                                          +--rte_dwa_dev_attach()
>>>
>>>
>>> Dataplane Workload Accelerator: It is an abstract model. The model is
>>> capable of offloading the dataplane workload from the application via
>>> the DPDK API over the host and control ports of a DWA device.
>>> A Dataplane Workload Accelerator (DWA) typically contains a set of CPUs,
>>> network controllers, and programmable data acceleration engines for
>>> packet processing, cryptography, regex matching, baseband processing, etc.
>>> This allows a DWA to offload compute, packet processing, baseband, and
>>> cryptography-related workloads from the host CPU to save cost and power,
>>> and to scale the workload by adding DWAs to the host CPU as needed.
>>>
>>> DWA device: A DWA can be sliced to N number of DPDK DWA device(s)
>>> based on the resources available in DWA.
>>> The DPDK API interface operates on the DPDK DWA device.
>>> It is a representation of a set of resources in DWA.
>>>
>>> TLV: A TLV (tag-length-value) encoded data stream contains the tag as
>>> message ID, followed by the message length, and finally the message payload.
>>> The 32-bit message ID consists of two parts, a 16-bit tag and a 16-bit subtag.
>>> The tag represents the ID of a group of similar messages,
>>> whereas the subtag represents a message ID within the group.
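
The tag/subtag split of the 32-bit message ID can be sketched as below.
This is only an illustration: the helper names are hypothetical (the RFC
example uses an RTE_DWA_TLV_MK_ID() macro whose definition is not shown
here), and placing the tag in the upper 16 bits is an assumption.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Assumed layout (not confirmed by the RFC text): tag in the upper 16 bits
 * of the message ID, subtag in the lower 16 bits.
 */
static inline uint32_t tlv_mk_id(uint16_t tag, uint16_t subtag)
{
    return ((uint32_t)tag << 16) | subtag;
}

static inline uint16_t tlv_id_tag(uint32_t id)
{
    return (uint16_t)(id >> 16);
}

static inline uint16_t tlv_id_subtag(uint32_t id)
{
    return (uint16_t)(id & 0xffff);
}

/* Two messages belong to the same group when their tags match. */
static inline int tlv_same_group(uint32_t a, uint32_t b)
{
    return tlv_id_tag(a) == tlv_id_tag(b);
}

/* Example: round-trip a hypothetical tag/subtag pair through the 32-bit ID. */
static inline void tlv_id_example(void)
{
    uint32_t id = tlv_mk_id(0x0010, 0x0003);

    assert(id == 0x00100003u);
    assert(tlv_id_tag(id) == 0x0010 && tlv_id_subtag(id) == 0x0003);
}
```

Keeping the group in its own field lets a receiver dispatch on the tag
first and only then interpret the subtag within that group.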
>>>
>>> Control Port: Used for transferring the control plane TLVs. Every DPDK
>>> DWA device must have a control port. Only one outstanding TLV can be
>>> processed via this port by a single DWA device. This makes the control
>>> port suitable for the control plane.
>>>
>>> Host Port: Used for transferring the user plane TLVs.
>>> Ethernet, PCIe DMA, shared memory, etc. are examples of
>>> different transport mechanisms abstracted under the host port.
>>> The primary purpose of the host port is to decouple the user plane TLVs
>>> from the differences in the underlying transport mechanisms.
>>> Unlike the control port, more than one outstanding TLV can be processed by
>>> a single DWA device via this port.
>>> This makes host port transfers asynchronous in nature,
>>> to support high-volume, low-latency user plane traffic.
>>>
>>> DWA Port: Used for transferring data between the external source and DWA.
>>> Ethernet, eCPRI are examples of DWA ports. Unlike host ports,
>>> the host CPU is not involved in transferring the data to/from DWA ports.
>>> These ports are typically connected to the network controller inside the
>>> DWA to transfer the traffic from the external source.
>>>
>>> TLV direction: `Host to DWA` and `DWA to Host` are the directions
>>> of TLV messages. The former is specified as H2D, and the latter is
>>> specified as D2H. The H2D control TLVs are used for requesting the DWA to
>>> perform a specific action, and the D2H control TLVs are used to respond to
>>> the requested actions. The H2D user plane messages are used for
>>> transferring data from the host to the DWA. The D2H user plane messages
>>> are used for transferring data from the DWA to the host.
>>>
>>> DWA device states: Following are the different states of a DWA device.
>>> - READY: DWA device is ready to attach the profile.
>>> See rte_dwa_dev_disc_profiles() API to discover the profile.
>>> - ATTACHED: DWA device is attached to one or more profiles.
>>> See rte_dwa_dev_attach() API to attach the profile(s).
>>> - STOPPED: Profile is in the stop state.
>>> TLV type `TYPE_ATTACHED` and `TYPE_STOPPED` messages are valid in this state.
>>> Calling rte_dwa_dev_attach() or explicitly invoking the rte_dwa_stop() API
>>> brings the device to this state.
>>> - RUNNING: Invoking rte_dwa_start() brings the device to this state.
>>> TLV type `TYPE_STARTED` and `TYPE_USER_PLANE` are valid in this state.
>>> - DETACHED: Invoking rte_dwa_dev_detach() brings the device to this state.
>>> The device and profile must be in the STOPPED state prior to
>>> invoking the rte_dwa_dev_detach().
>>> - CLOSED: A stopped/detached DWA device has been closed. The device cannot
>>> be restarted.
>>> Invoking rte_dwa_dev_close() brings the device to this state.
>>>
>>> TLV types: Following are the different TLV types.
>>> - TYPE_ATTACHED: Valid when the device is in the `ATTACHED`, `STOPPED`, or
>>> `RUNNING` state.
>>> - TYPE_STOPPED: Valid when the device is in the `STOPPED` state.
>>> - TYPE_STARTED: Valid when the device is in the `RUNNING` state.
>>> - TYPE_USER_PLANE: Valid when the device is in the `RUNNING` state and
>>> used to transfer only user plane traffic.
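
The relationship between device states and TLV types listed above can be
written down as a small validity check. This is a sketch only: the enum
and function names are hypothetical and do not come from the RFC headers;
the rules themselves are taken from the "TLV types" list.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical names mirroring the states and TLV types described above. */
enum dev_state {
    DEV_READY, DEV_ATTACHED, DEV_STOPPED, DEV_RUNNING, DEV_DETACHED, DEV_CLOSED
};
enum tlv_type { TYPE_ATTACHED, TYPE_STOPPED, TYPE_STARTED, TYPE_USER_PLANE };

/* Returns true if a TLV of type t may be sent while the device is in state s. */
static inline bool tlv_type_valid(enum dev_state s, enum tlv_type t)
{
    switch (t) {
    case TYPE_ATTACHED:
        return s == DEV_ATTACHED || s == DEV_STOPPED || s == DEV_RUNNING;
    case TYPE_STOPPED:
        return s == DEV_STOPPED;
    case TYPE_STARTED:
    case TYPE_USER_PLANE:
        return s == DEV_RUNNING;
    }
    return false;
}

/* Example: configuration TLVs are rejected once the profile is running. */
static inline void tlv_type_valid_example(void)
{
    assert(tlv_type_valid(DEV_STOPPED, TYPE_STOPPED));
    assert(!tlv_type_valid(DEV_RUNNING, TYPE_STOPPED));
    assert(tlv_type_valid(DEV_RUNNING, TYPE_USER_PLANE));
}
```

A check of this kind could sit in front of the control and host port
paths so that a TLV sent in the wrong state fails early on the host.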
>>>
>>> Profile: Specifies a workload that the dataplane workload accelerator
>>> processes on behalf of a DPDK application through a DPDK DWA device.
>>> A profile is expressed as a set of TLV messages for control plane and
>>> user plane functions. Each TLV message must have Tag, SubTag, Direction,
>>> Type, and Payload attributes.
>>>
>>> Programming model: A typical application programming sequence is as follows:
>>> 1) In the EAL initialization phase, the DWA devices shall be probed;
>>>      the application can query the number of available DWA devices with
>>>      the rte_dwa_dev_count() API.
>>> 2) Application discovers the available profile(s) in a DWA device using
>>>      rte_dwa_dev_disc_profiles() API.
>>> 3) Application attaches one or more profile(s) to a DWA device using
>>>      rte_dwa_dev_attach().
>>> 4) Once the profile is attached, the device shall be in the STOPPED state.
>>>      Configure the profile(s) with `TYPE_ATTACHED` and `TYPE_STOPPED`
>>>      type TLVs using rte_dwa_ctrl_op() API.
>>> 5) Once the profile is configured, move the profile to the `RUNNING` state
>>>      by invoking rte_dwa_start() API.
>>> 6) Once the profile is in the running state, and if it has user plane TLVs,
>>>      transfer those TLVs using the rte_dwa_port_host_*() API based on the
>>>      host port available for the attached profile.
>>> 7) The application can change the dynamic configuration aspects in the
>>>      `RUNNING` state using the rte_dwa_ctrl_op() API by issuing
>>>      `TYPE_STARTED` type TLV messages.
>>> 8) Finally, use rte_dwa_stop(), rte_dwa_dev_detach(), rte_dwa_dev_close()
>>>      sequence for tear-down.
>>>
>>>
>>> L3FWD profile
>>> -------------
>>>
>>>                                +-------------->--[1]--------------+
>>>                                |                                  |
>>>                    +-----------|----------+                       |
>>>                    |           |          |                       |
>>>                    |  +--------|-------+  |                       |
>>>                    |  |                |  |                       |
>>>                    |  | L3FWD Profile  |  |                       |
>>>         \          |  |                |  |                       |
>>>     <====\========>|  +----------------+  |                       |
>>>       DWA \Port0   |     Lookup Table     |             +---------|----------+
>>>            \       |  +----------------+  |             | DPDK DWA|Device[0] |
>>>             \      |  | IP    | Dport  |  |  Host Port  | +-------|--------+ |
>>>              \     |  +----------------+  |<===========>| |       |        | |
>>>               +~[3]~~~|~~~~~~~|~~~~~~~~|~~~~~~~~~~~~~~~~~>|->L3FWD Profile | |
>>>     <=============>|  +----------------+  |             | |                | |
>>>       DWA Port1    |  |       |        |  | Control Port| +-|---------|----+ |
>>>                    |  +----------------+  |<===========>|   |         |      |
>>>       ~~~>~~[5]~~~~|~~|~~~+   |        |  |             +---|---------|------+
>>>                    |  +---+------------+  |                 |         |
>>>       ~~~<~~~~~~~~~|~~|~~~+   |        |<-|------[2]--------+         |
>>>                    |  +----------------+<-|------[4]------------------+
>>>                    |    Dataplane         |
>>>     <=============>|    Workload          |
>>>       DWA PortN    |    Accelerator       |
>>>                    |    (HW/FW/SW)        |
>>>                    +----------------------+
>>>
>>>
>>> L3FWD profile offloads Layer-3 forwarding between the DWA Ethernet ports.
>>>
>>> The above diagram depicts the profile and application programming sequence.
>>> 1) The DWA device attaches the L3FWD profile using rte_dwa_dev_attach().
>>> 2) Configure the L3FWD profile:
>>> a) The application requests the L3FWD profile capabilities of the DWA
>>>      by using RTE_DWA_STAG_PROFILE_L3FWD_H2D_INFO. In response,
>>>      RTE_DWA_STAG_PROFILE_L3FWD_D2H_INFO returns the lookup modes
>>>      supported, max rules supported, and available host ports for this
>>>      profile.
>>> b) The application configures a set of DWA ports to use a
>>>      lookup mode (EM, LPM, or FIB) via RTE_DWA_STAG_PROFILE_L3FWD_H2D_CONFIG.
>>> c) The application configures a valid host port to receive exception
>>>      packets.
>>> 3) A packet that does not match any forwarding table entry is sent to
>>>      the host as a RTE_DWA_STAG_PROFILE_L3FWD_D2H_EXCEPTION_PACKETS TLV.
>>>      The DWA stores the exception packet and sends it to its destination
>>>      port after step (4) completes.
>>> 4) Parse the exception packet and add rules to the FWD table using
>>>      RTE_DWA_STAG_PROFILE_L3FWD_H2D_LOOKUP_ADD. If the application knows
>>>      the rules beforehand, it can add the rules in step 2.
>>> 5) When DWA ports receive flows matching the lookup table, the DWA
>>>      forwards them to DWA Ethernet ports without host CPU intervention.
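
The miss, exception, rule-add, hit cycle of steps 3 to 5 can be modelled
in plain C without any DWA hardware. This is a toy simulation under
stated assumptions: the table layout, the function names, the fixed /24
prefix, and the linear longest-prefix search are all hypothetical and say
nothing about how a real DWA implements its lookup.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define MAX_RULES 8

struct rule { uint32_t prefix; uint8_t depth; uint16_t port; };
struct fwd_table { struct rule rules[MAX_RULES]; size_t n; };

static inline uint32_t depth_mask(uint8_t depth)
{
    return depth == 0 ? 0 : (uint32_t)(~(uint32_t)0 << (32 - depth));
}

/* Linear longest-prefix match; returns true and sets *port on a hit. */
static inline bool
fwd_lookup(const struct fwd_table *t, uint32_t ip, uint16_t *port)
{
    int best = -1;

    assert(t != NULL && port != NULL);
    for (size_t i = 0; i < t->n; i++) {
        const struct rule *r = &t->rules[i];

        if ((ip & depth_mask(r->depth)) != r->prefix)
            continue;
        if (best < 0 || r->depth > t->rules[best].depth)
            best = (int)i;
    }
    if (best < 0)
        return false;
    *port = t->rules[best].port;
    return true;
}

static inline bool
fwd_add(struct fwd_table *t, uint32_t ip, uint8_t depth, uint16_t port)
{
    if (t->n == MAX_RULES || depth > 32)
        return false;
    t->rules[t->n++] = (struct rule){ ip & depth_mask(depth), depth, port };
    return true;
}

/*
 * Models one received packet. On a hit the "DWA" forwards on its own and
 * the function returns true (step 5). On a miss the packet is treated as
 * an exception (step 3): the "host" adds a /24 rule pointing at host_port
 * (step 4), the packet goes out on that port, and false is returned.
 */
static inline bool
dwa_rx_sim(struct fwd_table *t, uint32_t ip_dst, uint16_t host_port,
           uint16_t *out_port)
{
    if (fwd_lookup(t, ip_dst, out_port))
        return true;
    fwd_add(t, ip_dst, 24, host_port);
    *out_port = host_port;
    return false;
}
```

Only the first packet of each /24 reaches the host in this model; later
packets for the same prefix are forwarded from the table directly.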
>>>
>>>
>>> Example application usage with L3FWD profile
>>> --------------------------------------------
>>> This example application demonstrates the programming model of the DWA
>>> library.
>>> The example omits error checks to simplify the application.
>>>
>>> void
>>> dwa_profile_l3fwd_add_rule(rte_dwa_obj_t obj, struct rte_mbuf *mbuf)
>>> {
>>>        struct rte_dwa_profile_l3fwd_h2d_lookup_add *lookup;
>>>        struct rte_dwa_tlv *h2d, *d2h;
>>>        struct rte_ether_hdr *eth_hdr;
>>>        struct rte_ipv4_hdr *ipv4_hdr;
>>>        uint32_t id;
>>>        size_t len;
>>>
>>>        id = RTE_DWA_TLV_MK_ID(PROFILE_L3FWD, H2D_LOOKUP_ADD);
>>>        len = sizeof(struct rte_dwa_profile_l3fwd_h2d_lookup_add);
>>>        h2d = malloc(RTE_DWA_TLV_HDR_SZ + len);
>>>
>>>        lookup = (struct rte_dwa_profile_l3fwd_h2d_lookup_add *)h2d->msg;
>>>        /* Hardcode IPv4 instead of checking the packet type, to simplify
>>>         * the example
>>>         */
>>>        lookup->rule_type = RTE_DWA_PROFILE_L3FWD_RULE_TYPE_IPV4;
>>>        lookup->v4_rule.prefix.depth = 24;
>>>
>>>        eth_hdr = rte_pktmbuf_mtod(mbuf, struct rte_ether_hdr *);
>>>        ipv4_hdr = (struct rte_ipv4_hdr *)(eth_hdr + 1);
>>>        lookup->v4_rule.prefix.ip_dst = rte_be_to_cpu_32(ipv4_hdr->dst_addr);
>>>        lookup->eth_port_dst = mbuf->port;
>>>
>>>        rte_dwa_tlv_fill(h2d, id, len, h2d);
>>>        d2h = rte_dwa_ctrl_op(obj, h2d);
>>>        free(h2d);
>>>        free(d2h);
>>> }
>>>
>>> void
>>> dwa_profile_l3fwd_port_host_ethernet_worker(rte_dwa_obj_t obj, struct app_ctx *ctx)
>>> {
>>>        struct rte_dwa_profile_l3fwd_d2h_exception_pkts *msg;
>>>        struct rte_dwa_tlv *tlv;
>>>        uint16_t i, rc, nb_tlvs;
>>>        struct rte_mbuf *mbuf;
>>>
>>>        while (!ctx->done) {
>>>                rc = rte_dwa_port_host_ethernet_rx(obj, 0, &tlv, 1);
>>>                if (!rc)
>>>                        continue;
>>>
>>>                /* Since L3FWD profile has only one User Plane TLV, the
>>>                 * message must be a
>>>                 * RTE_DWA_STAG_PROFILE_L3FWD_D2H_EXCEPTION_PACKETS message
>>>                 */
>>>                msg = (struct rte_dwa_profile_l3fwd_d2h_exception_pkts *)tlv->msg;
>>>                for (i = 0; i < msg->nb_pkts; i++) {
>>>                        mbuf = msg->pkts[i];
>>>                        /* Got an exception pkt from DWA, handle it by
>>>                         * adding a new rule to the lookup table in DWA
>>>                         */
>>>                        dwa_profile_l3fwd_add_rule(obj, mbuf);
>>>                        /* Free the mbuf to pool */
>>>                        rte_pktmbuf_free(mbuf);
>>>                }
>>>
>>>                /* Done with TLV mbuf container, free it back */
>>>                rte_mempool_ops_enqueue_bulk(ctx->tlv_pool, tlv, 1);
>>>        }
>>> }
>>>
>>> bool
>>> dwa_port_host_ethernet_config(rte_dwa_obj_t obj, struct app_ctx *ctx)
>>> {
>>>        struct rte_dwa_tlv info_h2d, *info_d2h, *h2d = NULL, *d2h;
>>>        struct rte_dwa_port_host_ethernet_d2h_info *info;
>>>        int tlv_pool_element_sz;
>>>        bool rc = false;
>>>        uint32_t id;
>>>        size_t len;
>>>
>>>        /* Get the Ethernet host port info */
>>>        id = RTE_DWA_TLV_MK_ID(PORT_HOST_ETHERNET, H2D_INFO);
>>>        rte_dwa_tlv_fill(&info_h2d, id, 0, NULL);
>>>        info_d2h = rte_dwa_ctrl_op(obj, &info_h2d);
>>>
>>>        info = rte_dwa_tlv_d2h_to_msg(info_d2h);
>>>        if (info == NULL)
>>>                goto fail;
>>>        /* Need min one Rx queue to Receive exception traffic */
>>>        if (info->nb_rx_queues == 0)
>>>                goto fail;
>>>        /* Done with message from DWA. Free back to implementation */
>>>        free(obj, info_d2h);
>>>
>>>        /* Allocate exception packet pool */
>>>        ctx->pkt_pool = rte_pktmbuf_pool_create("exception pool", /* Name */
>>>                                   ctx->pkt_pool_depth, /* Number of elements */
>>>                                   512, /* Cache size */
>>>                                   0, /* Private data size */
>>>                                   RTE_MBUF_DEFAULT_BUF_SIZE,
>>>                                   ctx->socket_id);
>>>
>>>
>>>        tlv_pool_element_sz = DWA_EXCEPTION_PACKETS_PKT_BURST_MAX_SZ *
>>>                              sizeof(struct rte_mbuf *);
>>>        tlv_pool_element_sz += sizeof(struct rte_dwa_profile_l3fwd_d2h_exception_pkts);
>>>
>>>        /* Allocate TLV pool for the
>>>         * RTE_DWA_STAG_PROFILE_L3FWD_D2H_EXCEPTION_PACKETS tag
>>>         */
>>>        ctx->tlv_pool = rte_mempool_create("TLV pool", /* mempool name */
>>>                                   ctx->tlv_pool_depth, /* Number of elements */
>>>                                   tlv_pool_element_sz, /* Element size */
>>>                                   512, /* Cache size */
>>>                                   0, NULL, NULL, NULL /* Obj constructor */, NULL,
>>>                                   ctx->socket_id, 0 /* flags */);
>>>
>>>
>>>        /* Configure Ethernet host port */
>>>        id = RTE_DWA_TLV_MK_ID(PORT_HOST_ETHERNET, H2D_CONFIG);
>>>        len = sizeof(struct rte_dwa_port_host_ethernet_config);
>>>        h2d = malloc(RTE_DWA_TLV_HDR_SZ + len);
>>>
>>>        struct rte_dwa_port_host_ethernet_config *cfg =
>>>                (struct rte_dwa_port_host_ethernet_config *)h2d->msg;
>>>        /* Update the Ethernet configuration parameters */
>>>        cfg->nb_rx_queues = 1;
>>>        cfg->nb_tx_queues = 0;
>>>        cfg->max_burst = DWA_EXCEPTION_PACKETS_PKT_BURST_MAX_SZ;
>>>        cfg->pkt_pool = ctx->pkt_pool;
>>>        cfg->tlv_pool = ctx->tlv_pool;
>>>        rte_dwa_tlv_fill(h2d, id, len, h2d);
>>>        d2h = rte_dwa_ctrl_op(obj, h2d);
>>>        if (d2h == NULL)
>>>                goto fail;
>>>
>>>        free(h2d);
>>>
>>>        /* Configure Rx queue 0 to receive exception traffic */
>>>        id = RTE_DWA_TLV_MK_ID(PORT_HOST_ETHERNET, H2D_QUEUE_CONFIG);
>>>        len = sizeof(struct rte_dwa_port_host_ethernet_queue_config);
>>>        h2d = malloc(RTE_DWA_TLV_HDR_SZ + len);
>>>
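>>>        struct rte_dwa_port_host_ethernet_queue_config *qcfg =
>>>                (struct rte_dwa_port_host_ethernet_queue_config *)h2d->msg;
>>>        qcfg->id = 0; /* 0th Queue */
>>>        qcfg->enable = 1;
>>>        qcfg->is_tx = 0; /* Rx queue */
>>>        qcfg->depth = ctx->rx_queue_depth;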
>>>        rte_dwa_tlv_fill(h2d, id, len, h2d);
>>>        d2h = rte_dwa_ctrl_op(obj, h2d);
>>>        if (d2h == NULL)
>>>                goto fail;
>>>
>>>        free(h2d);
>>>
>>>        return true;
>>> fail:
>>>        if (h2d)
>>>                free(h2d);
>>>        return rc;
>>> }
>>>
>>> bool
>>> dwa_profile_l3fwd_config(rte_dwa_obj_t obj, struct app_ctx *ctx)
>>> {
>>>        struct rte_dwa_tlv info_h2d, *info_d2h = NULL, *h2d, *d2h = NULL;
>>>        struct rte_dwa_port_dwa_ethernet_d2h_info *info;
>>>        struct rte_dwa_profile_l3fwd_h2d_config *cfg;
>>>        bool rc = false;
>>>        uint32_t id;
>>>        size_t len;
>>>
>>>        /* Get DWA Ethernet port info */
>>>        id = RTE_DWA_TLV_MK_ID(PORT_DWA_ETHERNET, H2D_INFO);
>>>        rte_dwa_tlv_fill(&info_h2d, id, 0, NULL);
>>>        info_d2h = rte_dwa_ctrl_op(obj, &info_h2d);
>>>
>>>        info = rte_dwa_tlv_d2h_to_msg(info_d2h);
>>>        if (info == NULL)
>>>                goto fail;
>>>
>>>        /* Not found any DWA ethernet ports */
>>>        if (info->nb_ports == 0)
>>>                goto fail;
>>>
>>>        /* Configure L3FWD profile */
>>>        id = RTE_DWA_TLV_MK_ID(PROFILE_L3FWD, H2D_CONFIG);
>>>        len = sizeof(struct rte_dwa_profile_l3fwd_h2d_config) +
>>>              (sizeof(uint16_t) * info->nb_ports);
>>>        h2d = malloc(RTE_DWA_TLV_HDR_SZ + len);
>>>
>>>        cfg = h2d->msg;
>>>        /* Update the L3FWD configuration parameters */
>>>        cfg->mode = ctx->mode;
>>>        /* Attach all DWA Ethernet ports onto L3FWD profile */
>>>        cfg->nb_eth_ports = info->nb_ports;
>>>        memcpy(cfg->eth_ports, info->avail_ports,
>>>               sizeof(uint16_t) * info->nb_ports);
>>>
>>>        rte_dwa_tlv_fill(h2d, id, len, h2d);
>>>        d2h = rte_dwa_ctrl_op(obj, h2d);
>>>        free(h2d);
>>>
>>>        /* All good */
>>>        rc = true;
>>> fail:
>>>        if (info_d2h)
>>>                free(obj, info_d2h);
>>>        if (d2h)
>>>                free(obj, d2h);
>>>
>>>        return rc;
>>> }
>>>
>>> bool
>>> dwa_profile_l3fwd_has_capa(rte_dwa_obj_t obj, struct app_ctx *ctx)
>>> {
>>>        struct rte_dwa_profile_l3fwd_d2h_info *info;
>>>        struct rte_dwa_tlv h2d, *d2h;
>>>        bool found = false;
>>>        uint32_t id;
>>>        uint16_t i;
>>>
>>>        /* Get L3FWD profile info */
>>>        id = RTE_DWA_TLV_MK_ID(PROFILE_L3FWD, H2D_INFO);
>>>        rte_dwa_tlv_fill(&h2d, id, 0, NULL);
>>>        d2h = rte_dwa_ctrl_op(obj, &h2d);
>>>
>>>        info = rte_dwa_tlv_d2h_to_msg(d2h);
>>>        /* Request failed */
>>>        if (info == NULL)
>>>                goto fail;
>>>        /* Required lookup modes is not supported */
>>>        if (!(info->modes_supported & ctx->mode))
>>>                goto fail;
>>>
>>>        /* Check profile supports HOST_ETHERNET port as this application
>>>         * supports only host port as Ethernet
>>>         */
>>>        for (i = 0; i < info->nb_host_ports; i++) {
>>>                if (info->host_ports[i] == RTE_DWA_TAG_PORT_HOST_ETHERNET) {
>>>                        found = true;
>>>                }
>>>        }
>>>
>>>        /* Done with response, free the d2h memory allocated by
>>>         * implementation
>>>         */
>>>        free(obj, d2h);
>>> fail:
>>>        return found;
>>> }
>>>
>>>
>>> bool
>>> dwa_has_profile(enum rte_dwa_tag_profile pf)
>>> {
>>>        enum rte_dwa_tag_profile *pfs = NULL;
>>>        bool found = false;
>>>        int nb_pfs, i;
>>>
>>>        /* Get the number of profiles on the DWA device */
>>>        nb_pfs = rte_dwa_dev_disc_profiles(0, NULL);
>>>        pfs = malloc(sizeof(enum rte_dwa_tag_profile)  * nb_pfs);
>>>        /* Fetch all the profiles */
>>>        nb_pfs = rte_dwa_dev_disc_profiles(0, pfs);
>>>
>>>        /* Check the list has requested profile */
>>>        for (i = 0; i < nb_pfs; i++) {
>>>                if (pfs[i] == pf)
>>>                        found = true;
>>>        }
>>>        free(pfs);
>>>
>>>
>>>        return found;
>>> }
>>>
>>>
>>> #include <rte_dwa.h>
>>>
>>> #define DWA_EXCEPTION_PACKETS_PKT_BURST_MAX_SZ                32
>>>
>>> struct app_ctx {
>>>        bool done;
>>>        struct rte_mempool *pkt_pool;
>>>        struct rte_mempool *tlv_pool;
>>>        enum rte_dwa_profile_l3fwd_lookup_mode mode;
>>>        int socket_id;
>>>        int pkt_pool_depth;
>>>        int tlv_pool_depth;
>>>        int rx_queue_depth;
>>> } __rte_cache_aligned;
>>>
>>> int
>>> main(int argc, char **argv)
>>> {
>>>        rte_dwa_obj_t obj = NULL;
>>>        struct app_ctx ctx;
>>>        int rc;
>>>
>>>        /* Initialize EAL */
>>>        rc = rte_eal_init(argc, argv);
>>>        if (rc < 0)
>>>                rte_exit(EXIT_FAILURE, "Invalid EAL arguments\n");
>>>        argc -= rc;
>>>        argv += rc;
>>>
>>>
>>>        memset(&ctx, 0, sizeof(ctx));
>>>        /* Set application default values */
>>>        ctx.mode = RTE_DWA_PROFILE_L3FWD_MODE_LPM;
>>>        ctx.socket_id = SOCKET_ID_ANY;
>>>        ctx.pkt_pool_depth = 10000;
>>>        ctx.tlv_pool_depth = 10000;
>>>        ctx.rx_queue_depth = 10000;
>>>
>>>        /* Step 1: Check any DWA devices present  */
>>>        rc = rte_dwa_dev_count();
>>>        if (rc <= 0)
>>>                rte_exit(EXIT_FAILURE, "Failed to find DWA devices\n");
>>>
>>>        /* Step 2: Check DWA device has L3FWD profile or not */
>>>        if (!dwa_has_profile(RTE_DWA_TAG_PROFILE_L3FWD))
>>>                rte_exit(EXIT_FAILURE, "L3FWD profile not found\n");
>>>
>>>        /*
>>>         * Step 3: Now that, workload accelerator has L3FWD profile,
>>>         * offload L3FWD workload to accelerator by attaching the profile
>>>         * to accelerator.
>>>         */
>>>        enum rte_dwa_tag_profile profile[] = {RTE_DWA_TAG_PROFILE_L3FWD};
>>>        obj = rte_dwa_dev_attach(0, "my_custom_accelerator_device", profile, 1);
>>>
>>>        /* Step 4: Check attached L3FWD profile has required capability to
>>>         * proceed
>>>         */
>>>        if (!dwa_profile_l3fwd_has_capa(obj, &ctx))
>>>                rte_exit(EXIT_FAILURE, "L3FWD profile does not have enough capability\n");
>>>
>>>        /* Step 5: Configure l3fwd profile */
>>>        if (!dwa_profile_l3fwd_config(obj, &ctx))
>>>                rte_exit(EXIT_FAILURE, "L3FWD profile configure failed \n");
>>>
>>>        /* Step 6: Configure ethernet host port to receive exception packets */
>>>        if (!dwa_port_host_ethernet_config(obj, &ctx))
>>>                rte_exit(EXIT_FAILURE, "Ethernet host port configure failed\n");
>>>
>>>        /* Step 7 : Move DWA profiles to start state */
>>>        rte_dwa_start(obj);
>>>
>>>        /* Step 8: Handle exception packets and add lookup rules for them */
>>>        dwa_profile_l3fwd_port_host_ethernet_worker(obj, &ctx);
>>>
>>>        /* Step 9: Clean up */
>>>        rte_dwa_stop(obj);
>>>        rte_dwa_dev_detach(0, obj);
>>>        rte_dwa_dev_close(0);
>>>
>>>        return 0;
>>> }
>>>
>>>
>>> Jerin Jacob (1):
>>>     dwa: introduce dataplane workload accelerator subsystem
>>>
>>>    doc/api/doxy-api-index.md            |  13 +
>>>    doc/api/doxy-api.conf.in             |   1 +
>>>    lib/dwa/dwa.c                        |   7 +
>>>    lib/dwa/meson.build                  |  17 ++
>>>    lib/dwa/rte_dwa.h                    | 184 +++++++++++++
>>>    lib/dwa/rte_dwa_core.h               | 264 +++++++++++++++++++
>>>    lib/dwa/rte_dwa_dev.h                | 154 +++++++++++
>>>    lib/dwa/rte_dwa_port_dwa_ethernet.h  |  68 +++++
>>>    lib/dwa/rte_dwa_port_host_ethernet.h | 178 +++++++++++++
>>>    lib/dwa/rte_dwa_profile_admin.h      |  85 ++++++
>>>    lib/dwa/rte_dwa_profile_l3fwd.h      | 378 +++++++++++++++++++++++++++
>>>    lib/dwa/version.map                  |   3 +
>>>    lib/meson.build                      |   1 +
>>>    13 files changed, 1353 insertions(+)
>>>    create mode 100644 lib/dwa/dwa.c
>>>    create mode 100644 lib/dwa/meson.build
>>>    create mode 100644 lib/dwa/rte_dwa.h
>>>    create mode 100644 lib/dwa/rte_dwa_core.h
>>>    create mode 100644 lib/dwa/rte_dwa_dev.h
>>>    create mode 100644 lib/dwa/rte_dwa_port_dwa_ethernet.h
>>>    create mode 100644 lib/dwa/rte_dwa_port_host_ethernet.h
>>>    create mode 100644 lib/dwa/rte_dwa_profile_admin.h
>>>    create mode 100644 lib/dwa/rte_dwa_profile_l3fwd.h
>>>    create mode 100644 lib/dwa/version.map
>>>
