[dpdk-dev] [RFC] Accelerator API to chain packet processing functions

David Coyle Tue, 04 Feb 2020 06:45:55 -0800

Introduction
============

This RFC introduces a new DPDK library, rte_accelerator.


The main aim of this library is to provide a flexible and extensible way of 
combining one or more packet-processing functions into a single operation, 
thereby allowing these to be performed in parallel in optimized software 
libraries or in a hardware accelerator. These functions can include 
cryptography, compression and CRC/checksum calculation, while others can 
potentially be added in the future. Performing these functions in parallel as a 
single operation can enable a significant performance improvement.


Background
==========

There are a number of byte-wise operations which are present and common across 
many access network data-plane pipelines, such as Cipher, Authentication, CRC, 
Bit-Interleaved-Parity (BIP), other checksums etc. Some prototyping has been 
done at Intel in relation to the 01.org access-network-dataplanes project to 
prove that a significant performance improvement is possible when such 
byte-wise operations are combined into a single pass of packet data processing. 
This performance boost has been prototyped for both XGS-PON MAC data-plane and 
DOCSIS MAC data-plane pipelines.

The prototypes used some protocol-specific modifications to the DPDK cryptodev 
library. In order to make this performance improvement consumable by network 
access equipment vendors, a more extensible and correct solution is required 
that can be upstreamed into DPDK.

Hence, the introduction of rte_accelerator.


Use Cases
=========

The primary use cases for this new library have already been mentioned. These 
are:

- DOCSIS MAC: Crypto-CRC
        - Order:
                - Downstream: CRC, Encrypt
                - Upstream: Decrypt, CRC
        - Specifications:
                - Crypto: 128-bit AES-CFB encryption variant for DOCSIS as
                  described in section 11.1 of DOCSIS 3.1 Security Specification
                  (https://apps.cablelabs.com/specification/CM-SP-SECv3.1)
                - CRC: Ethernet 32-bit CRC as defined in Ethernet/[ISO/IEC
                  8802-3]

- XGS-PON MAC: Crypto-CRC-BIP
        - Order:
                - Downstream: CRC, Encrypt, BIP
                - Upstream: BIP, Decrypt, CRC
        - Specifications:
                - Crypto: AES-128 [NIST FIPS-197] cipher, used in counter mode
                  (AES-CTR), as described in [NIST SP800-38A].
                - CRC: Ethernet 32-bit CRC as defined in Ethernet/[ISO/IEC
                  8802-3]
                - BIP: 4-byte bit-interleaved even parity (BIP) field computed
                  over the entire FS frame, refer to ITU-T G.989.3, sections
                  8.1.1.5 and 8.1.2.3
                  (https://www.itu.int/rec/dologin_pub.asp?lang=e&id=T-REC-G.989.3-201510-I!!PDF-E)

Note that support for both these chained operations is already available in the 
Intel IPSec Multi-Buffer library.
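
To make the single-pass idea concrete, the following is a minimal sketch that
computes an Ethernet CRC-32 and a 4-byte BIP while reading each byte of a
buffer only once. It is not taken from the prototypes or from the Intel IPSec
Multi-Buffer library. The names crc_bip_result and crc_bip_single_pass are
hypothetical, the cipher stage is left out, and the BIP word alignment (byte i
folded into byte lane i % 4, first byte most significant) is an assumption of
this sketch rather than a statement of what G.989.3 mandates.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Results of one combined pass over a buffer (hypothetical type). */
struct crc_bip_result {
	uint32_t crc32; /* Ethernet CRC-32, reflected polynomial 0xEDB88320 */
	uint32_t bip32; /* XOR of successive 4-byte words of the buffer */
};

/* Advance a reflected CRC-32 state by one byte, one bit at a time. */
static uint32_t
crc32_update_byte(uint32_t crc, uint8_t byte)
{
	crc ^= byte;
	for (int bit = 0; bit < 8; bit++)
		crc = (crc >> 1) ^ (0xEDB88320u & (0u - (crc & 1u)));
	return crc;
}

/*
 * Compute CRC-32 and BIP-32 together, touching each byte once.
 * Assumption: byte i lands in BIP byte lane (i % 4), most significant lane
 * first, which is equivalent to zero-padding a short final word.
 */
static struct crc_bip_result
crc_bip_single_pass(const uint8_t *data, size_t len)
{
	uint32_t crc = 0xFFFFFFFFu;
	uint32_t bip = 0;

	for (size_t i = 0; i < len; i++) {
		crc = crc32_update_byte(crc, data[i]);
		bip ^= (uint32_t)data[i] << (8 * (3 - (i % 4)));
	}

	struct crc_bip_result res = { crc ^ 0xFFFFFFFFu, bip };
	return res;
}
```

An optimized library would typically use table-driven or carry-less-multiply
CRC code rather than the bit-wise loop shown here; the point of the sketch is
only that both results come out of the same traversal of the data.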

However, the library is not limited to these. The following are some of the
other possible use cases which rte_accelerator will allow for:

- Storage:
        - Compression followed by Encryption
- IPSec over UDP:
        - UDP Checksum calculation followed by Encryption

While DPDK's rte_cryptodev and rte_compressdev allow many cryptographic and
compression algorithms to be chained together in one operation, there is no way
to chain these with any error detection or checksum algorithms, nor is there a
way to chain crypto and compression algorithms together. rte_accelerator will
allow these chains to be created, and will also allow any future type of
operation to be easily added.


Architecture
============

The following diagram shows where rte_accelerator fits in an overall 
application architecture.

As can be seen from the diagram, the rte_accelerator API will depend on 
existing DPDK device libraries (i.e. rte_cryptodev and rte_compressdev) for 
existing features like crypto and compression. However, any new 
services/functions not covered in the existing libraries, such as CRC and BIP, 
will be provided directly by the rte_accelerator API:

    
    +-------------------------------------------------------------------------------------+
    |                                                                                     |
    |                                   Application                                       |
    |                     (e.g. vCMTS (DOCSIS), vOLT (XGS-PON), etc.)                     |
    |                                                                                     |
    +-------------------------------------------------------------------------------------+
                      |                |                                 |
    +-----------------|----------------|---------------------------------|----------------+
    |                 |                |   DPDK                          |                |
    |                 |                |                                 |                |
    |                 |                | Session creation/deletion       |                |
    |                 |                ~ Operation creation/deletion     |                |
    |    Device       |                | Operation enqueue/dequeue       |    Device      |
    | initialization, |                |                                 | initialization,|
    | configuration,  ~     +--------------------------------------+     ~ configuration, |
    |    reset        |     |  rte_accelerator                     |     |    reset       |
    |                 |     |   |-- err_detect xform/op            |     |                |
    |                 |     |   |-- Session creation/deletion fns  |     |                |
    |                 |     |   `-- Operation enqueue/dequeue fns  |     |                |
    |                 |     +--------------------------------------+     |                |
    |                 |         /       /      \             \           |                |
    |                 |        /       /        \             \          |                |
    |                 |       /       |          |             \         |                |
    |  +------------------------+     |          |     +------------------------+         |
    |  |  rte_cryptodev         |     |          |     |  rte_compressdev       |         |
    |  |   |-- sym xforms/ops   |     |          |     |   `-- comp xforms/ops  |   ...   |
    |  |   `-- asym xforms/ops  |     |          |     |                        |         |
    |  +------------------------+     |          |     +------------------------+         |
    |                  \     \        /          |             /                          |
    |                   \     \______/_________   \      _____/                           |
    |                    \          /          \   \    /                                 |
    |                   +------------+       +------------+                               |
    |                   |            |       |            |                               |
    |                   |  AESNI-MB  |       |    QAT     |      ...                      |
    |                   |    PMD     |       |    PMD     |                               |
    |                   |            |       |            |                               |
    |                   +------------+       +------------+                               |
    |                          |                    |                                     |
    +--------------------------|--------------------|-------------------------------------+
                               |                    |
                        +------------+       +------------+
                        |            |       |            |
                        |  AESNI-MB  |       |   QAT HW   |
                        |   SW LIB   |       |            |
                        |            |       |            |
                        +------------+       +------------+

Some key points are:
1) rte_accelerator will use xform and operation related definitions (i.e.
structs, enums, defines) from the existing device libraries, such as
rte_cryptodev and rte_compressdev, to create accelerator xform chains, sessions
and operation chains
   a) this allows as much re-use of existing definitions as possible
2) The application code will use the existing device library functions to 
initialize, configure, start, stop and reset the device
3) The application code will call rte_accelerator to create/delete accelerator 
sessions and enqueue/dequeue accelerator operations
   a) Each device PMD will register functions with rte_accelerator to perform 
these tasks
4) rte_accelerator will provide definitions/structs/etc. for the error 
detection xform and operation
   a) the error detection 'algorithms' initially supported will include CRC32 
and BIP32, but others can be added in the future as needed
   b) other xform and operation types which do not fit in any existing device 
libraries, such as rte_cryptodev or rte_compressdev, but need to be accelerated 
can also be added under rte_accelerator in the future
   c) note, however, that error detection and other xform and operation types 
under rte_accelerator could be moved to an independent library in the future, 
if required
5) rte_accelerator will not provide a capability check feature. Instead, 
rte_accelerator will return an error code on session creation if the xform 
chain specified is not supported by the underlying device
6) rte_accelerator does not support session-less mode of operation
   a) sessions MUST always be created and attached to the operations being 
enqueued
7) Initially, support will be added to existing AESNI-MB and QAT PMDs for 
rte_accelerator functionality. Support can be added to other PMDs in the future 
as needed
   a) To add support for rte_accelerator functionality, the PMD must implement 
and register the required callback functions mentioned in 3a) above


Key API Definitions
===================

The full proposed rte_accelerator API is provided at the end of the RFC.
Here, some of the key structures and functions are described.

The following structure defines an accelerator xform
- Accelerator xforms are chained together through the next field
- The accelerator xform can contain a crypto (symmetric or asymmetric), 
compression or error detection xform. Others can be added in the future as 
needed
- The order of xforms in the chain specifies the order in which those 
operations should be performed on the packet data

/**
 * Accelerator transform setup data
 *
 * This structure is used to specify the accelerator transforms required.
 * Multiple transforms can be chained together to specify a chain of transforms
 * such as symmetric crypto followed by error detection, or compression followed
 * by symmetric crypto. Each transform structure holds a single transform, with
 * the type field specifying which transform is contained within the union.
 */
struct rte_accelerator_xform {
        struct rte_accelerator_xform *next;
        /**<
         * Next transform in the chain
         * - the last transform in the chain MUST set this to NULL
         */
        enum rte_accelerator_xform_type type;
        /**< Transform type */

        RTE_STD_C11
        union {
                struct rte_crypto_sym_xform crypto_sym;
                /**< Symmetric crypto transform */
                struct rte_crypto_asym_xform crypto_asym;
                /**< Asymmetric crypto transform */
                struct rte_comp_xform comp;
                /**< Compression transform */
                struct rte_err_detect_xform err_detect;
                /**< Error detection transform */
        };
};
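
As an illustration of how such a chain is linked, here is a minimal sketch of
the DOCSIS downstream order (CRC, then Encrypt) from the use cases above. To
stay self-contained it does not include any DPDK headers: demo_xform and
demo_xform_type are hypothetical stand-ins that mirror only the next and type
fields of struct rte_accelerator_xform, and the union of per-type settings is
omitted.

```c
#include <assert.h>
#include <stddef.h>

/* Stand-in for enum rte_accelerator_xform_type (hypothetical names). */
enum demo_xform_type {
	DEMO_XFORM_CRYPTO_SYM,
	DEMO_XFORM_CRYPTO_ASYM,
	DEMO_XFORM_COMP,
	DEMO_XFORM_ERR_DETECT,
};

/* Stand-in for struct rte_accelerator_xform, payload union omitted. */
struct demo_xform {
	struct demo_xform *next; /* NULL terminates the chain */
	enum demo_xform_type type;
};

/* Link an array of xforms in array order and terminate the chain. */
static struct demo_xform *
demo_xform_chain_link(struct demo_xform *xforms, size_t n)
{
	if (n == 0)
		return NULL;
	for (size_t i = 0; i + 1 < n; i++)
		xforms[i].next = &xforms[i + 1];
	xforms[n - 1].next = NULL;
	return &xforms[0];
}

/* Count the elements reachable from the head of a chain. */
static size_t
demo_xform_chain_len(const struct demo_xform *head)
{
	size_t len = 0;

	for (; head != NULL; head = head->next)
		len++;
	return len;
}

/* DOCSIS downstream: error detection (CRC) first, then symmetric crypto. */
static struct demo_xform *
demo_docsis_downstream(struct demo_xform storage[2])
{
	storage[0].type = DEMO_XFORM_ERR_DETECT;
	storage[1].type = DEMO_XFORM_CRYPTO_SYM;
	return demo_xform_chain_link(storage, 2);
}
```

The upstream direction would simply reverse the two types, since the order of
the chain is what tells the device the order of processing.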

The following structure defines an accelerator operation
- Accelerator operations are chained together through the next field
- The order of operations in this chain MUST match the order of xforms in the 
session's xform chain
- Any additional operation data (e.g. IV data) must follow immediately after 
this struct in memory
- The following fields MUST be set in the FIRST operation of the chain before 
enqueuing. These fields are ignored in the inner op structures and any 
subsequent rte_accelerator_op chain elements:
        - sess
        - m_src
        - m_dst
- The following fields MUST be set in ALL operations in a chain before 
enqueuing:
        - next
        - mempool
        - type
- After dequeuing, only the first operation in the chain holds the overall
status, in the overall_status field. Each chain element, including the first,
holds its own individual status in the op_status field

/**
 * Accelerator operation data
 *
 * This structure is used to specify the operations for a particular session.
 * This includes specifying the source and, if required, destination mbufs and
 * the lengths and offsets of the data within these mbufs on which the
 * operations should be done. Multiple operations are chained together to
 * specify the full set of operations to be performed
 */
struct rte_accelerator_op {
        struct rte_accelerator_op *next;
        /**<
         * Next operation in the chain
         * - the last operation in the chain MUST set this to NULL
         */
        struct rte_accelerator_session *sess;
        /**< Handle for the associated accelerator session */

        struct rte_mempool *mempool;
        /**< Mempool from which the operation is allocated */

        struct rte_mbuf *m_src; /**< Source mbuf */
        struct rte_mbuf *m_dst; /**< Destination mbuf */

        enum rte_accelerator_op_status overall_status;
        /**<
         * Overall operation status
         * - indicates if all the operations in the chain succeeded or if any
         *   one of them failed
         */

        uint8_t op_status;
        /**<
         * Individual operation status
         * - indicates the status of the individual operation in the chain
         */

        enum rte_accelerator_op_type type;
        /**< Operation type */

        RTE_STD_C11
        union {
                struct rte_crypto_sym_op crypto_sym;
                /**< Symmetric crypto operation */
                struct rte_crypto_asym_op crypto_asym;
                /**< Asymmetric crypto operation */
                struct rte_comp_op comp;
                /**< Compression operation */
                struct rte_err_detect_op err_detect;
                /**< Error detection operation */
        };
};
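
The rules listed before the structure can be expressed as a small pre-enqueue
check. The sketch below is an assumption about how an application or PMD might
validate a chain, not part of the proposed API. It uses hypothetical stand-in
types (demo_op, demo_xform) and, for brevity, a single demo_kind enum for both
xform and operation types, whereas the proposal uses two separate enums. It
also assumes that an op_status of 0 means success.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* One shared kind enum for brevity (the proposal has separate enums). */
enum demo_kind { DEMO_CRYPTO_SYM, DEMO_COMP, DEMO_ERR_DETECT };

struct demo_xform {
	struct demo_xform *next;
	enum demo_kind kind;
};

/* Stand-in for struct rte_accelerator_op with only the fields used here. */
struct demo_op {
	struct demo_op *next;
	void *sess;        /* only meaningful in the first op */
	void *m_src;       /* only meaningful in the first op */
	enum demo_kind kind;
	uint8_t op_status; /* assumption: 0 means success */
};

/*
 * Check that the first op carries sess and m_src, and that the op chain
 * has the same length and the same order of kinds as the xform chain.
 */
static bool
demo_op_chain_valid(const struct demo_op *op, const struct demo_xform *xf)
{
	if (op == NULL || op->sess == NULL || op->m_src == NULL)
		return false;
	while (op != NULL && xf != NULL) {
		if (op->kind != xf->kind)
			return false;
		op = op->next;
		xf = xf->next;
	}
	return op == NULL && xf == NULL;
}

/* Derive an overall result: the chain succeeds only if every element did. */
static bool
demo_chain_succeeded(const struct demo_op *op)
{
	for (; op != NULL; op = op->next)
		if (op->op_status != 0)
			return false;
	return true;
}
```

After dequeue, an application would normally read overall_status from the first
operation and walk the chain only when it needs to find which element failed.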

The accelerator API is mainly a device-based API, with existing device PMDs
extended to provide support for the API. When a device that supports the
accelerator API initialises, it must set up its accelerator context in its
private device data. The following structure defines an accelerator context. It
contains a pointer to the device itself, a pointer to a structure containing
the device's callback functions for session creation/deletion, operation
enqueue/dequeue etc., and a count of the number of sessions attached to this
context.

/**
 * Accelerator context for a device
 *
 * Accelerator instance for each driver to register their accelerator
 * operations. The application can get the accelerator context from the
 * underlying device using the API functions provided by the device's main API
 * library (e.g. rte_cryptodev_get_accelerator_ctx() for crypto devices,
 * rte_compressdev_get_accelerator_ctx() for compress devices, etc.).
 */
struct rte_accelerator_ctx {
        void *device;
        /**< Pointer to the device */
        const struct rte_accelerator_ops *ops;
        /**< Pointer to accelerator ops for the device */
        uint16_t sess_cnt;
        /**< Number of sessions attached to this context */
};
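
To show how the context ties a PMD's callbacks to the library entry points,
here is a minimal, self-contained sketch of session creation dispatching
through the ops table. It also illustrates key point 5: there is no capability
query, so an unsupported chain simply makes session creation fail. The types
are hypothetical stand-ins, the field names inside demo_ops are assumptions
(the members of struct rte_accelerator_ops are not described in this section),
and calloc()/free() stand in for whatever session allocation scheme the library
finally uses.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

struct demo_xform { int type; };               /* payload omitted */
struct demo_session { void *sess_private_data; };

/* Hypothetical callback table; member names are assumptions. */
struct demo_ops {
	int (*session_create)(void *device, struct demo_xform *xform,
			      struct demo_session *sess, int socket_id);
	void (*session_destroy)(void *device, struct demo_session *sess);
};

/* Stand-in for struct rte_accelerator_ctx. */
struct demo_ctx {
	void *device;
	const struct demo_ops *ops;
	uint16_t sess_cnt;
};

/* Library side: allocate a session and let the PMD accept or reject it. */
static struct demo_session *
demo_session_create(struct demo_ctx *ctx, struct demo_xform *xform,
		    int socket_id)
{
	if (ctx == NULL || ctx->ops == NULL ||
	    ctx->ops->session_create == NULL || xform == NULL)
		return NULL;

	struct demo_session *sess = calloc(1, sizeof(*sess));
	if (sess == NULL)
		return NULL;

	/* e.g. -ENOTSUP when the device cannot run this xform chain */
	if (ctx->ops->session_create(ctx->device, xform, sess, socket_id) != 0) {
		free(sess);
		return NULL;
	}
	ctx->sess_cnt++;
	return sess;
}

static int
demo_session_destroy(struct demo_ctx *ctx, struct demo_session *sess)
{
	if (ctx == NULL || sess == NULL)
		return -EINVAL;
	if (ctx->ops != NULL && ctx->ops->session_destroy != NULL)
		ctx->ops->session_destroy(ctx->device, sess);
	free(sess);
	if (ctx->sess_cnt > 0)
		ctx->sess_cnt--;
	return 0;
}

/* Two toy PMD callbacks: one supports every chain, one supports none. */
static int
demo_pmd_accept(void *device, struct demo_xform *xform,
		struct demo_session *sess, int socket_id)
{
	(void)device; (void)xform; (void)socket_id;
	sess->sess_private_data = NULL; /* a real PMD would store state here */
	return 0;
}

static int
demo_pmd_reject(void *device, struct demo_xform *xform,
		struct demo_session *sess, int socket_id)
{
	(void)device; (void)xform; (void)sess; (void)socket_id;
	return -ENOTSUP;
}
```

Because the public create function returns only a pointer, the specific error
code from the PMD is not visible to the caller in this sketch; how that code is
reported is left to the real API.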

As the comment on struct rte_accelerator_ctx mentions, rte_cryptodev, 
rte_compressdev and any other device library which supports the rte_accelerator 
API in the future must provide API functions to get the accelerator context for 
a device based on a device id. In the cases of rte_cryptodev and 
rte_compressdev, these functions will have the following prototypes:

void *
rte_cryptodev_get_accelerator_ctx(uint8_t dev_id);

void *
rte_compressdev_get_accelerator_ctx(uint8_t dev_id);

An application will call these functions to get the device accelerator context, 
and then use this context when it wants to interact with the device for 
accelerator functionality. The following are some of the main accelerator API 
functions provided. For full details of the parameters and return values, see
the full API at the end of the RFC.

        - The following functions are used to create and destroy an accelerator
          session on a device:

                struct rte_accelerator_session *
                rte_accelerator_session_create(struct rte_accelerator_ctx *ctx,
                                               struct rte_accelerator_xform *xform,
                                               int socket_id);

                int
                rte_accelerator_session_destroy(struct rte_accelerator_ctx *ctx,
                                                struct rte_accelerator_session *sess);

        - The following functions are used to enqueue and dequeue accelerator
          operations to/from a device:
                - The qp_id parameter specifies the queue pair of the device on
                  which to enqueue/dequeue
                - It is the responsibility of the application to manage the
                  queue pair assignments within the application (e.g. the same
                  queue pair should not be used for accelerator enqueue/dequeue
                  and cryptodev enqueue/dequeue)

                uint16_t
                rte_accelerator_ops_enqueue(struct rte_accelerator_ctx *ctx,
                                            uint16_t qp_id,
                                            struct rte_accelerator_op **ops,
                                            uint16_t nb_ops);

                uint16_t
                rte_accelerator_ops_dequeue(struct rte_accelerator_ctx *ctx,
                                            uint16_t qp_id,
                                            struct rte_accelerator_op **ops,
                                            uint16_t nb_ops);
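
Since the enqueue function may accept fewer operations than requested, callers
need to handle the unsent tail of a burst. The following minimal sketch shows
one way to do that against a toy queue pair. Everything here is hypothetical:
demo_qp is a 4-entry FIFO standing in for a device queue pair,
demo_ops_enqueue()/demo_ops_dequeue() only mimic the burst semantics of the
proposed functions, and operations "complete" as soon as they are queued. The
queue pair is assumed to be empty when demo_process_burst() is called.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define DEMO_QP_DEPTH 4

struct demo_op { int id; };

/* Toy queue pair: a small FIFO of op pointers (hypothetical). */
struct demo_qp {
	struct demo_op *ring[DEMO_QP_DEPTH];
	uint16_t head;  /* next slot to dequeue from */
	uint16_t count; /* ops currently queued */
};

/* Accept as many ops as fit; return how many were taken. */
static uint16_t
demo_ops_enqueue(struct demo_qp *qp, struct demo_op **ops, uint16_t nb_ops)
{
	uint16_t n = 0;

	while (n < nb_ops && qp->count < DEMO_QP_DEPTH) {
		qp->ring[(qp->head + qp->count) % DEMO_QP_DEPTH] = ops[n++];
		qp->count++;
	}
	return n;
}

/* Hand back up to nb_ops queued ops in FIFO order. */
static uint16_t
demo_ops_dequeue(struct demo_qp *qp, struct demo_op **ops, uint16_t nb_ops)
{
	uint16_t n = 0;

	while (n < nb_ops && qp->count > 0) {
		ops[n++] = qp->ring[qp->head];
		qp->head = (uint16_t)((qp->head + 1) % DEMO_QP_DEPTH);
		qp->count--;
	}
	return n;
}

/*
 * Push a whole burst through the queue pair: enqueue what fits, dequeue
 * what has completed, and repeat with the unsent tail until all are done.
 */
static uint16_t
demo_process_burst(struct demo_qp *qp, struct demo_op **in,
		   struct demo_op **out, uint16_t nb_ops)
{
	uint16_t sent = 0, done = 0;

	while (done < nb_ops) {
		sent += demo_ops_enqueue(qp, &in[sent],
					 (uint16_t)(nb_ops - sent));
		done += demo_ops_dequeue(qp, &out[done],
					 (uint16_t)(nb_ops - done));
	}
	return done;
}
```

With a real device, dequeue can legitimately return 0 while work is still in
flight, so an application loop would usually bound its retries or interleave
other work rather than spin as this sketch does.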


Full API
========

The following is the full proposed API for the rte_accelerator library. There 
are some minor updates to the rte_cryptodev and rte_compressdev API which are 
also listed.

diff --git a/lib/librte_accelerator/rte_accelerator.h b/lib/librte_accelerator/rte_accelerator.h
new file mode 100644
index 0000000..dcf292b
--- /dev/null
+++ b/lib/librte_accelerator/rte_accelerator.h
@@ -0,0 +1,336 @@
+/* SPDX-License-Identifier: BSD-3-Clause
+ * Copyright(c) 2020 Intel Corporation.
+ */
+
+#ifndef _RTE_ACCELERATOR_H_
+#define _RTE_ACCELERATOR_H_
+
+/**
+ * @file rte_accelerator.h
+ *
+ * RTE Accelerator Common Definitions
+ */
+
+#ifdef __cplusplus
+extern "C" {
+#endif
+
+#include <rte_compat.h>
+#include <rte_common.h>
+#include <rte_mbuf.h>
+#include <rte_memory.h>
+#include <rte_mempool.h>
+#include <rte_comp.h>
+#include <rte_crypto.h>
+
+#include "rte_err_detect.h"
+
+/**
+ * Accelerator transform types
+ */
+enum rte_accelerator_xform_type {
+       RTE_ACC_XFORM_TYPE_CRYPTO_SYM,
+       /**< Symmetric crypto transform type */
+       RTE_ACC_XFORM_TYPE_CRYPTO_ASYM,
+       /**< Asymmetric crypto transform type */
+       RTE_ACC_XFORM_TYPE_COMP,
+       /**< Compression transform type */
+       RTE_ACC_XFORM_TYPE_ERR_DETECT,
+       /**< Error detection transform type */
+};
+
+/**
+ * Accelerator transform setup data
+ *
+ * This structure is used to specify the accelerator transforms required.
+ * Multiple transforms can be chained together to specify a chain of transforms
+ * such as symmetric crypto followed by error detection, or compression followed
+ * by symmetric crypto. Each transform structure holds a single transform, with
+ * the type field specifying which transform is contained within the union.
+ */
+struct rte_accelerator_xform {
+       struct rte_accelerator_xform *next;
+       /**<
+        * Next transform in the chain
+        * - the last transform in the chain MUST set this to NULL
+        */
+       enum rte_accelerator_xform_type type;
+       /**< Transform type */
+
+       RTE_STD_C11
+       union {
+               struct rte_crypto_sym_xform crypto_sym;
+               /**< Symmetric crypto transform */
+               struct rte_crypto_asym_xform crypto_asym;
+               /**< Asymmetric crypto transform */
+               struct rte_comp_xform comp;
+               /**< Compression transform */
+               struct rte_err_detect_xform err_detect;
+               /**< Error detection transform */
+       };
+};
+
+/**
+ * Accelerator operation types
+ *
+ * Each value must be a power of 2 so that the operations can be combined into
+ * a bitmask (see rte_accelerator_op_pool_create())
+ */
+enum rte_accelerator_op_type {
+       RTE_ACCELERATOR_OP_TYPE_CRYPTO_SYM  = (0x1 << 0),
+       /**< Symmetric crypto operation type */
+       RTE_ACCELERATOR_OP_TYPE_CRYPTO_ASYM = (0x1 << 1),
+       /**< Asymmetric crypto operation type */
+       RTE_ACCELERATOR_OP_TYPE_COMP        = (0x1 << 2),
+       /**< Compression operation type */
+       RTE_ACCELERATOR_OP_TYPE_ERR_DETECT  = (0x1 << 3),
+       /**< Error detection operation type */
+};
+
+/**
+ * Accelerator operation status
+ */
+enum rte_accelerator_op_status {
+       RTE_ACCELERATOR_OP_STATUS_NOT_PROCESSED = 0,
+       /**< Operation has not yet been processed by a device */
+       RTE_ACCELERATOR_OP_STATUS_SUCCESS,
+       /**< Operation completed successfully */
+       RTE_ACCELERATOR_OP_STATUS_FAILURE,
+       /**< Operation completed with failure */
+};
+
+/**
+ * Accelerator operation data
+ *
+ * This structure is used to specify the operations for a particular session.
+ * This includes specifying the source and, if required, destination mbufs and
+ * the lengths and offsets of the data within these mbufs on which the
+ * operations should be done. Multiple operations are chained together to
+ * specify the full set of operations to be performed
+ *
+ * @note The rte_accelerator_op chain MUST match the session's xform
+ * chain exactly
+ * @note The first rte_accelerator_op element in the chain is the parent
+ * operation. The following fields MUST be set in this first operation before
+ * enqueuing and are ignored in the inner operations and any subsequent
+ * rte_accelerator_op chain elements:
+ * - *sess*
+ * - *m_src*
+ * - *m_dst* (if required)
+ * @note If *sess* or *m_src* is not set in the first rte_accelerator_op, this
+ * operation is invalid and will cause an error when attempting to enqueue.
+ * @note The following fields MUST be set in ALL rte_accelerator_op chain
+ * elements:
+ * - *next*
+ * - *mempool*
+ * - *type*
+ * @note After the operation has been dequeued, only the FIRST (i.e. the parent)
+ * rte_accelerator_op in the chain will contain the *overall_status*. Each
+ * chain element will contain its individual *op_status*, the value of which is
+ * specific to the operation type (e.g. an ::rte_crypto_op_status,
+ * ::rte_comp_op_status or ::rte_err_detect_op_status)
+ */
+struct rte_accelerator_op {
+       struct rte_accelerator_op *next;
+       /**<
+        * Next operation in the chain
+        * - the last operation in the chain MUST set this to NULL
+        */
+       struct rte_accelerator_session *sess;
+       /**< Handle for the associated accelerator session */
+
+       struct rte_mempool *mempool;
+       /**< Mempool from which the operation is allocated */
+
+       struct rte_mbuf *m_src; /**< Source mbuf */
+       struct rte_mbuf *m_dst; /**< Destination mbuf */
+
+       enum rte_accelerator_op_status overall_status;
+       /**<
+        * Overall operation status
+        * - indicates if all the operations in the chain succeeded or if any
+        *   one of them failed
+        */
+
+       uint8_t op_status;
+       /**<
+        * Individual operation status
+        * - indicates the status of the individual operation in the chain
+        */
+
+       enum rte_accelerator_op_type type;
+       /**< Operation type */
+
+       RTE_STD_C11
+       union {
+               struct rte_crypto_sym_op crypto_sym;
+               /**< Symmetric crypto operation */
+               struct rte_crypto_asym_op crypto_asym;
+               /**< Asymmetric crypto operation */
+               struct rte_comp_op comp;
+               /**< Compression operation */
+               struct rte_err_detect_op err_detect;
+               /**< Error detection operation */
+       };
+};
+
+/**
+ * Accelerator context for a device
+ *
+ * Accelerator instance for each device driver to register their accelerator
+ * operations. The application can get the accelerator context from the
+ * underlying device using the API functions provided by the device's main API
+ * library (e.g. rte_cryptodev_get_accelerator_ctx() for crypto devices,
+ * rte_compressdev_get_accelerator_ctx() for compress devices, etc.).
+ */
+struct rte_accelerator_ctx;
+
+/**
+ * Accelerator session data
+ */
+struct rte_accelerator_session;
+
+/**
+ * Create accelerator session as specified by the transform chain
+ *
+ * @param   ctx                Accelerator device instance
+ * @param   xform      Pointer to the first element of the session transform
+ *                     chain
+ * @param   socket_id  Socket to allocate the session on
+ *
+ * @return
+ *  - Pointer to session, if successful
+ *  - NULL, on failure
+ */
+__rte_experimental
+struct rte_accelerator_session *
+rte_accelerator_session_create(struct rte_accelerator_ctx *ctx,
+                              struct rte_accelerator_xform *xform,
+                              int socket_id);
+
+/**
+ * Free accelerator session header and the session private data and return it
+ * to its original mempool
+ *
+ * @param   ctx                Accelerator device instance
+ * @param   sess       Accelerator session to be freed
+ *
+ * @return
+ *  - 0, if successful
+ *  - -EINVAL, if session is NULL
+ *  - -EBUSY, if not all device private data has been freed
+ */
+__rte_experimental
+int
+rte_accelerator_session_destroy(struct rte_accelerator_ctx *ctx,
+                               struct rte_accelerator_session *sess);
+
+/**
+ * Creates an accelerator operation pool
+ *
+ * @param   name       Pool name
+ * @param   op_types   Bitmask of operations which this pool must support. This
+ *                     bitmask allows this function to determine the maximum size
+ *                     of operation which must be accommodated in the mempool
+ *                     elements. See ::rte_accelerator_op_type for possible
+ *                     bitmask values
+ * @param   nb_elts    Number of elements in the pool
+ * @param   cache_size Number of elements to cache on lcore, see
+ *                     rte_mempool_create() for further details about cache
+ *                     size
+ * @param   priv_size  Size of private data to allocate with each operation
+ * @param   socket_id  Socket to allocate the mempool on
+ *
+ * @return
+ *  - Pointer to mempool, if successful
+ *  - NULL, on failure
+ */
+__rte_experimental
+struct rte_mempool *
+rte_accelerator_op_pool_create(const char *name,
+                              uint32_t op_types,
+                              unsigned nb_elts,
+                              unsigned cache_size,
+                              uint16_t priv_size,
+                              int socket_id);
+
+/**
+ * Bulk allocate accelerator operations from a mempool
+ *
+ * @param   mempool    Accelerator operation mempool
+ * @param   ops                Array in which to place allocated accelerator operations
+ * @param   nb_ops     Number of accelerator operations to allocate
+ *
+ * @returns
+ *  - *nb_ops*, if the number of operations requested were allocated
+ *  - 0, if the requested number of ops is not available. None are allocated in
+ *    this case
+ */
+__rte_experimental
+uint16_t
+rte_accelerator_op_bulk_alloc(struct rte_mempool *mempool,
+                             struct rte_accelerator_op **ops,
+                             uint16_t nb_ops);
+
+/**
+ * Free accelerator operation back to its mempool
+ *
+ * @param   op         Accelerator operation
+ */
+__rte_experimental
+void
+rte_accelerator_op_free(struct rte_accelerator_op *op);
+
+/**
+ * Enqueue a burst of operations for processing on the specified queue pair
+ * of a device
+ *
+ * @param   ctx                Accelerator device instance
+ * @param   qp_id      Index of the device's queue pair to which operations
+ *                     are to be enqueued for processing
+ * @param   ops                Array of *nb_ops* pointers to accelerator operations
+ *                     to be processed
+ * @param   nb_ops     Number of operations to process
+ *
+ * @return
+ *  The number of operations actually enqueued on the device. A return value
+ *  equal to *nb_ops* means that all operations have been enqueued. The return
+ *  value can be less than *nb_ops* when the device's queue is full or if
+ *  invalid parameters are specified in an rte_accelerator_op
+ */
+__rte_experimental
+uint16_t
+rte_accelerator_ops_enqueue(struct rte_accelerator_ctx *ctx,
+                           uint16_t qp_id,
+                           struct rte_accelerator_op **ops,
+                           uint16_t nb_ops);
+
+/**
+ * Dequeue a burst of processed operations from a queue on the specified device.
+ * The dequeued operations are stored in rte_accelerator_op structures whose
+ * pointers are supplied in the *ops* array
+ *
+ * @param   ctx                Accelerator device instance
+ * @param   qp_id      Index of the device's queue pair from which processed
+ *                     operations should be retrieved
+ * @param   ops                Array of pointers to rte_accelerator_op structures
+ *                     that must be large enough to store *nb_ops* pointers in
+ *                     it
+ * @param   nb_ops     Maximum number of operations to dequeue
+ *
+ * @return
+ *  The number of operations actually dequeued, which is the number of pointers
+ *  to rte_accelerator_op structures effectively supplied in the *ops* array
+ */
+__rte_experimental
+uint16_t
+rte_accelerator_ops_dequeue(struct rte_accelerator_ctx *ctx,
+                           uint16_t qp_id,
+                           struct rte_accelerator_op **ops,
+                           uint16_t nb_ops);
+
+#ifdef __cplusplus
+}
+#endif
+
+#endif /* _RTE_ACCELERATOR_H_ */
diff --git a/lib/librte_accelerator/rte_accelerator_driver.h b/lib/librte_accelerator/rte_accelerator_driver.h
new file mode 100644
index 0000000..49cb902
--- /dev/null
+++ b/lib/librte_accelerator/rte_accelerator_driver.h
@@ -0,0 +1,146 @@
+/* SPDX-License-Identifier: BSD-3-Clause
+ * Copyright(c) 2020 Intel Corporation.
+ */
+
+#ifndef _RTE_ACCELERATOR_DRIVER_H_
+#define _RTE_ACCELERATOR_DRIVER_H_
+
+/**
+ * @file rte_accelerator_driver.h
+ *
+ * RTE Accelerator Driver Definitions
+ *
+ */
+
+#ifdef __cplusplus
+extern "C" {
+#endif
+
+#include "rte_accelerator.h"
+
+/**
+ * Accelerator context for a device
+ *
+ * Accelerator instance for each driver to register their accelerator
+ * operations. The application can get the accelerator context from the
+ * underlying device using the API functions provided by the device's main API
+ * library (e.g. rte_cryptodev_get_accelerator_ctx() for crypto devices,
+ * rte_compressdev_get_accelerator_ctx() for compress devices, etc.).
+ */
+struct rte_accelerator_ctx {
+       void *device;
+       /**< Pointer to the device */
+       const struct rte_accelerator_ops *ops;
+       /**< Pointer to accelerator ops for the device */
+       uint16_t sess_cnt;
+       /**< Number of sessions attached to this context */
+};
+
+/**
+ * Accelerator session data
+ */
+struct rte_accelerator_session {
+       void *sess_private_data;
+       /**< Device's private session data */
+};
+
+/**
+ * Configure private accelerator session data on a device
+ *
+ * @param   device     Device pointer
+ * @param   xform      Pointer to the first element of the session transform
+ *                     chain
+ * @param   sess       Accelerator session structure
+ * @param   socket_id  Socket to allocate the session on
+ *
+ * @return
+ *  - 0, if the private session structure has been created successfully
+ *  - -EINVAL, if the input parameters are invalid
+ *  - -ENOTSUP, if the device does not support the session configuration
+ *  - -ENOMEM, if memory for the session could not be allocated
+ */
+typedef int
+(*accelerator_session_create_t)(void *device,
+                               struct rte_accelerator_xform *xform,
+                               struct rte_accelerator_session *sess,
+                               int socket_id);
+
+/**
+ * Free private accelerator session data
+ *
+ * @param   device     Device pointer
+ * @param   sess       Accelerator session structure
+ */
+typedef void
+(*accelerator_session_destroy_t)(void *device,
+                                struct rte_accelerator_session *sess);
+
+/**
+ * Get the size of private accelerator session data
+ *
+ * @param   device     Device pointer
+ *
+ * @return
+ *  - Size of the private session structure for device, if successful
+ *  - 0, if failure
+ */
+typedef unsigned int
+(*accelerator_session_private_size_get_t)(void *device);
+
+/**
+ * Enqueue operations on queue pair of a device for processing
+ *
+ * @param   device     Device pointer
+ * @param   qp_id      Index of the device's queue pair to which operations
+ *                     are to be enqueued for processing
+ * @param   ops        Array of *nb_ops* pointers to operations to be enqueued
+ * @param   nb_ops     Number of operations to enqueue
+ *
+ * @return
+ *  The number of operations actually enqueued on the device
+ */
+typedef uint16_t
+(*accelerator_ops_enqueue_t)(void *device,
+                            uint16_t qp_id,
+                            struct rte_accelerator_op **ops,
+                            uint16_t nb_ops);
+
+/**
+ * Dequeue processed operations from a queue pair of a device
+ *
+ * @param   device     Device pointer
+ * @param   qp_id      Index of the device's queue pair from which processed
+ *                     operations should be dequeued
+ * @param   ops        Array of pointers to rte_accelerator_op structures
+ *                     that must be large enough to store *nb_ops* pointers in
+ *                     it
+ * @param   nb_ops     Maximum number of operations to dequeue
+ *
+ * @return
+ *  The number of operations actually dequeued from the device
+ */
+typedef uint16_t
+(*accelerator_ops_dequeue_t)(void *device,
+                            uint16_t qp_id,
+                            struct rte_accelerator_op **ops,
+                            uint16_t nb_ops);
+
+/** Accelerator device operations function pointer table */
+struct rte_accelerator_ops {
+       accelerator_session_create_t session_create;
+       /**< Configure an accelerator device's private session data */
+       accelerator_session_destroy_t session_destroy;
+       /**< Free an accelerator device's private session data */
+       accelerator_session_private_size_get_t session_private_size_get;
+       /**< Get the size of an accelerator device's private session data */
+       accelerator_ops_enqueue_t ops_enqueue;
+       /**< Enqueue a burst of operations to a device */
+       accelerator_ops_dequeue_t ops_dequeue;
+       /**< Dequeue a burst of operations from a device */
+};
+
+#ifdef __cplusplus
+}
+#endif
+
+#endif /* _RTE_ACCELERATOR_DRIVER_H_ */
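
To illustrate how the function pointer table above is meant to be used, here is
a minimal, self-contained sketch. It is not part of the patch. The types are
trimmed stand-ins for the ones in rte_accelerator_driver.h (session callbacks
omitted, and the op reduced to an id), and toy_dev, toy_enqueue(),
toy_dequeue(), sketch_ops_enqueue(), sketch_ops_dequeue() and toy_demo() are
hypothetical names. It assumes the public enqueue/dequeue API simply forwards
to the callbacks registered in the context, which this part of the patch does
not show.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Stand-in for the real op structure defined in rte_accelerator.h */
struct rte_accelerator_op {
	int id;
};

typedef uint16_t (*accelerator_ops_enqueue_t)(void *device, uint16_t qp_id,
		struct rte_accelerator_op **ops, uint16_t nb_ops);
typedef uint16_t (*accelerator_ops_dequeue_t)(void *device, uint16_t qp_id,
		struct rte_accelerator_op **ops, uint16_t nb_ops);

/* Trimmed copy of the ops table: session callbacks omitted for brevity */
struct rte_accelerator_ops {
	accelerator_ops_enqueue_t ops_enqueue;
	accelerator_ops_dequeue_t ops_dequeue;
};

struct rte_accelerator_ctx {
	void *device;
	const struct rte_accelerator_ops *ops;
	uint16_t sess_cnt;
};

/* Hypothetical device: a single queue pair modelled as a small FIFO */
#define TOY_QUEUE_DEPTH 4
struct toy_dev {
	struct rte_accelerator_op *fifo[TOY_QUEUE_DEPTH];
	uint16_t count;
};

/* Driver callback: accept ops until the FIFO is full */
uint16_t toy_enqueue(void *device, uint16_t qp_id,
		struct rte_accelerator_op **ops, uint16_t nb_ops)
{
	struct toy_dev *dev = device;
	uint16_t n = 0;

	(void)qp_id; /* the toy device has only one queue pair */
	while (n < nb_ops && dev->count < TOY_QUEUE_DEPTH)
		dev->fifo[dev->count++] = ops[n++];
	return n;
}

/* Driver callback: hand back up to nb_ops ops in FIFO order */
uint16_t toy_dequeue(void *device, uint16_t qp_id,
		struct rte_accelerator_op **ops, uint16_t nb_ops)
{
	struct toy_dev *dev = device;
	uint16_t n = dev->count < nb_ops ? dev->count : nb_ops;
	uint16_t i;

	(void)qp_id;
	for (i = 0; i < n; i++)
		ops[i] = dev->fifo[i];
	for (i = n; i < dev->count; i++) /* shift the remaining entries */
		dev->fifo[i - n] = dev->fifo[i];
	dev->count -= n;
	return n;
}

const struct rte_accelerator_ops toy_ops = {
	.ops_enqueue = toy_enqueue,
	.ops_dequeue = toy_dequeue,
};

/* Assumed library-side forwarding: look up the callback in the context */
uint16_t sketch_ops_enqueue(struct rte_accelerator_ctx *ctx, uint16_t qp_id,
		struct rte_accelerator_op **ops, uint16_t nb_ops)
{
	return ctx->ops->ops_enqueue(ctx->device, qp_id, ops, nb_ops);
}

uint16_t sketch_ops_dequeue(struct rte_accelerator_ctx *ctx, uint16_t qp_id,
		struct rte_accelerator_op **ops, uint16_t nb_ops)
{
	return ctx->ops->ops_dequeue(ctx->device, qp_id, ops, nb_ops);
}

/* Enqueue five ops on a four-deep queue, then drain it */
uint16_t toy_demo(void)
{
	struct toy_dev dev = { .count = 0 };
	struct rte_accelerator_ctx ctx = { &dev, &toy_ops, 0 };
	struct rte_accelerator_op storage[5];
	struct rte_accelerator_op *in[5], *out[5];
	uint16_t i, n;

	for (i = 0; i < 5; i++) {
		storage[i].id = i;
		in[i] = &storage[i];
	}
	n = sketch_ops_enqueue(&ctx, 0, in, 5);
	assert(n == TOY_QUEUE_DEPTH); /* the fifth op was not accepted */
	return sketch_ops_dequeue(&ctx, 0, out, 5);
}
```

Calling toy_demo() from a main() exercises the whole path. As with the other
DPDK burst APIs, the caller has to check the return value of the enqueue call
and retry or drop any ops that were not accepted.
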
diff --git a/lib/librte_accelerator/rte_err_detect.h b/lib/librte_accelerator/rte_err_detect.h
new file mode 100644
index 0000000..f54ebfb
--- /dev/null
+++ b/lib/librte_accelerator/rte_err_detect.h
@@ -0,0 +1,109 @@
+/* SPDX-License-Identifier: BSD-3-Clause
+ * Copyright(c) 2020 Intel Corporation.
+ */
+
+#ifndef _RTE_ERR_DETECT_H_
+#define _RTE_ERR_DETECT_H_
+
+/**
+ * @file rte_err_detect.h
+ *
+ * RTE Error Detection Definitions
+ */
+
+#ifdef __cplusplus
+extern "C" {
+#endif
+
+#include <stdint.h>
+
+/** Error Detection Algorithms */
+enum rte_err_detect_algorithm {
+       RTE_ERR_DETECT_CRC32_ETH,
+       /**< CRC32 Ethernet */
+       RTE_ERR_DETECT_BIP32
+       /**< BIP32 */
+};
+
+/** Error Detection Operation Types */
+enum rte_err_detect_operation {
+       RTE_ERR_DETECT_OP_VERIFY,
+       /**< Verify error detection result */
+       RTE_ERR_DETECT_OP_GENERATE
+       /**< Generate error detection result */
+};
+
+/** Error Detection Status */
+enum rte_err_detect_op_status {
+       RTE_ERR_DETECT_OP_STATUS_NOT_PROCESSED,
+       /**< Operation has not yet been processed by a device */
+       RTE_ERR_DETECT_OP_STATUS_SUCCESS,
+       /**< Operation completed successfully */
+       RTE_ERR_DETECT_OP_STATUS_VERIFY_FAILED,
+       /**< Verification failed */
+       RTE_ERR_DETECT_OP_STATUS_ERROR
+       /**< Error handling operation */
+};
+
+/**
+ * Error Detection Transform Data
+ *
+ * This structure contains data relating to an error detection transform. The
+ * fields *op* and *algo* are common to all error detection transforms and
+ * MUST be set
+ */
+struct rte_err_detect_xform {
+       enum rte_err_detect_operation op;
+       /**< Error detection operation type */
+       enum rte_err_detect_algorithm algo;
+       /**< Error detection algorithm */
+};
+
+/** Error Detection Operation */
+struct rte_err_detect_op {
+       struct rte_mbuf *m_src; /**< Source mbuf */
+
+       enum rte_err_detect_op_status status; /**< Operation status */
+
+       struct {
+               uint16_t offset;
+               /**<
+                * Starting point for error detection processing, specified
+                * as the number of bytes from start of the packet in the
+                * source mbuf
+                */
+               uint16_t length;
+               /**<
+                * The length, in bytes, of the source mbuf data on which the
+                * error detection operation will be computed
+                */
+       } data; /**< Data offset and length for error detection */
+
+       struct {
+               uint8_t *data;
+               /**<
+                * This points to the location where the error detection
+                * result should be written (in the case of generation) or
+                * where the purported result exists (in the case of
+                * verification)
+                *
+                * The caller must ensure the required length of physically
+                * contiguous memory is available at this address
+                *
+                * For a CRC, this may point into the mbuf packet data. For
+                * an operation such as a BIP, this may point to a memory
+                * location after the op
+                *
+                * For generation, the result will overwrite any data at this
+                * location
+                */
+               rte_iova_t phys_addr;
+               /**< Physical address of output data */
+       } output; /**< Output location */
+};
+
+#ifdef __cplusplus
+}
+#endif
+
+#endif /* _RTE_ERR_DETECT_H_ */
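
As a rough illustration of the generate/verify semantics described for the
error detection op, the following self-contained sketch computes a CRC32
(Ethernet polynomial) over the offset/length region of a packet and either
writes the result to the output location or compares it with what is already
there. It is not part of the patch. All toy_* names are hypothetical stand-ins
that mirror the fields of rte_err_detect_op, a plain byte buffer takes the
place of the source mbuf, and storing the result least significant byte first
is an assumption.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

enum toy_op_type { TOY_OP_VERIFY, TOY_OP_GENERATE };

enum toy_status {
	TOY_STATUS_NOT_PROCESSED,
	TOY_STATUS_SUCCESS,
	TOY_STATUS_VERIFY_FAILED,
	TOY_STATUS_ERROR
};

/* Mirrors rte_err_detect_op, with a flat buffer instead of an mbuf */
struct toy_err_detect_op {
	uint8_t *pkt;      /* stands in for the packet data of m_src */
	uint16_t pkt_len;  /* total bytes available in pkt */
	enum toy_status status;
	struct {
		uint16_t offset;
		uint16_t length;
	} data;
	struct {
		uint8_t *data;
	} output;
};

/* Bitwise CRC32: reflected polynomial 0xEDB88320, init and final XOR ~0 */
uint32_t toy_crc32_eth(const uint8_t *buf, size_t len)
{
	uint32_t crc = 0xFFFFFFFFu;
	size_t i;
	int bit;

	for (i = 0; i < len; i++) {
		crc ^= buf[i];
		for (bit = 0; bit < 8; bit++)
			crc = (crc >> 1) ^ (0xEDB88320u & (0u - (crc & 1u)));
	}
	return crc ^ 0xFFFFFFFFu;
}

/* Process one op in software and record the outcome in op->status */
void toy_err_detect_process(struct toy_err_detect_op *op,
		enum toy_op_type type)
{
	uint8_t result[4];
	uint32_t crc;

	if (op->pkt == NULL || op->output.data == NULL ||
	    (uint32_t)op->data.offset + op->data.length > op->pkt_len) {
		op->status = TOY_STATUS_ERROR;
		return;
	}

	crc = toy_crc32_eth(op->pkt + op->data.offset, op->data.length);

	/* Assumption: result is stored least significant byte first */
	result[0] = (uint8_t)(crc & 0xff);
	result[1] = (uint8_t)((crc >> 8) & 0xff);
	result[2] = (uint8_t)((crc >> 16) & 0xff);
	result[3] = (uint8_t)((crc >> 24) & 0xff);

	if (type == TOY_OP_GENERATE) {
		/* Generation overwrites whatever is at the output location */
		memcpy(op->output.data, result, sizeof(result));
		op->status = TOY_STATUS_SUCCESS;
	} else {
		op->status = memcmp(op->output.data, result,
				sizeof(result)) == 0 ?
			TOY_STATUS_SUCCESS : TOY_STATUS_VERIFY_FAILED;
	}
}

/* Generate a CRC into the tail of a frame, then verify it in place */
enum toy_status toy_crc_roundtrip(void)
{
	uint8_t frame[13] = { '1', '2', '3', '4', '5', '6', '7', '8', '9' };
	struct toy_err_detect_op op = {
		.pkt = frame,
		.pkt_len = sizeof(frame),
		.status = TOY_STATUS_NOT_PROCESSED,
		.data = { .offset = 0, .length = 9 },
		.output = { .data = frame + 9 }, /* points into the packet */
	};

	toy_err_detect_process(&op, TOY_OP_GENERATE);
	assert(op.status == TOY_STATUS_SUCCESS);
	toy_err_detect_process(&op, TOY_OP_VERIFY);
	return op.status;
}
```

In toy_crc_roundtrip() the output pointer refers to the bytes immediately after
the protected region, which corresponds to the "may point into the mbuf packet
data" case in the comment on the output field.
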
diff --git a/lib/librte_compressdev/rte_compressdev.h b/lib/librte_compressdev/rte_compressdev.h
index 8052efe..9e81bb9 100644
--- a/lib/librte_compressdev/rte_compressdev.h
+++ b/lib/librte_compressdev/rte_compressdev.h
@@ -568,6 +568,21 @@ __rte_experimental
 int
 rte_compressdev_private_xform_free(uint8_t dev_id, void *private_xform);
 
+/**
+ * Get accelerator context for a device.
+ *
+ * @param dev_id
+ *   Compress device identifier
+ *
+ * @return
+ *  - Pointer to the device's accelerator context, if the device supports
+ *    the accelerator API
+ *  - NULL, otherwise
+ */
+__rte_experimental
+void *
+rte_compressdev_get_accelerator_ctx(uint8_t dev_id);
+
 #ifdef __cplusplus
 }
 #endif
diff --git a/lib/librte_compressdev/rte_compressdev_internal.h b/lib/librte_compressdev/rte_compressdev_internal.h
index 22ceac6..549526e 100644
--- a/lib/librte_compressdev/rte_compressdev_internal.h
+++ b/lib/librte_compressdev/rte_compressdev_internal.h
@@ -79,6 +79,9 @@ struct rte_compressdev {
        struct rte_device *device;
        /**< Backing device */
 
+       void *accelerator_ctx;
+       /**< Context for accelerator ops */
+
        __extension__
        uint8_t attached : 1;
        /**< Flag indicating the device is attached */
diff --git a/lib/librte_cryptodev/rte_cryptodev.h b/lib/librte_cryptodev/rte_cryptodev.h
index c6ffa3b..7279f12 100644
--- a/lib/librte_cryptodev/rte_cryptodev.h
+++ b/lib/librte_cryptodev/rte_cryptodev.h
@@ -838,6 +838,9 @@ struct rte_cryptodev {
        void *security_ctx;
        /**< Context for security ops */
 
+       void *accelerator_ctx;
+       /**< Context for accelerator ops */
+
        __extension__
        uint8_t attached : 1;
        /**< Flag indicating the device is attached */
@@ -847,6 +850,20 @@ void *
 rte_cryptodev_get_sec_ctx(uint8_t dev_id);
 
 /**
+ * Get accelerator context for a device
+ *
+ * @param      dev_id          Device id.
+ *
+ * @return
+ *  - Pointer to the device's accelerator context, if the device supports
+ *    the accelerator API
+ *  - NULL, otherwise
+ */
+__rte_experimental
+void *
+rte_cryptodev_get_accelerator_ctx(uint8_t dev_id);
+
+/**
  *
  * The data part, with no function pointers, associated with each device.
  *

