On Fri, Jul 2, 2021 at 6:51 PM Chengwen Feng <fengcheng...@huawei.com> wrote:
>
> This patch introduces 'dmadevice' which is a generic type of DMA
> device.
>
> The APIs of dmadev library exposes some generic operations which can
> enable configuration and I/O with the DMA devices.
>
> Signed-off-by: Chengwen Feng <fengcheng...@huawei.com>
Thanks for v1. I would suggest finalizing lib/dmadev/rte_dmadev.h
before doing the implementation so that you don't waste time reworking
the implementation.

Comments inline.

> ---
>  MAINTAINERS                  |    4 +
>  config/rte_config.h          |    3 +
>  lib/dmadev/meson.build       |    6 +
>  lib/dmadev/rte_dmadev.c      |  438 +++++++++++++++++++++
>  lib/dmadev/rte_dmadev.h      |  919 +++++++++++++++++++++++++++++++++++++++++++
>  lib/dmadev/rte_dmadev_core.h |   98 +++++
>  lib/dmadev/rte_dmadev_pmd.h  |  210 ++++++++++
>  lib/dmadev/version.map       |   32 ++

You missed updating doxygen. See doc/api/doxy-api.conf.in.
Use meson -Denable_docs=true to verify the generated doxygen doc.

>  lib/meson.build              |    1 +
>  9 files changed, 1711 insertions(+)
>  create mode 100644 lib/dmadev/meson.build
>  create mode 100644 lib/dmadev/rte_dmadev.c
>  create mode 100644 lib/dmadev/rte_dmadev.h
>  create mode 100644 lib/dmadev/rte_dmadev_core.h
>  create mode 100644 lib/dmadev/rte_dmadev_pmd.h
>  create mode 100644 lib/dmadev/version.map
>
> diff --git a/MAINTAINERS b/MAINTAINERS
> index 4347555..2019783 100644
> --- a/MAINTAINERS
> +++ b/MAINTAINERS
> @@ -496,6 +496,10 @@ F: drivers/raw/skeleton/
>  F: app/test/test_rawdev.c
>  F: doc/guides/prog_guide/rawdev.rst
>

Add EXPERIMENTAL

> +Dma device API
> +M: Chengwen Feng <fengcheng...@huawei.com>
> +F: lib/dmadev/
> +
>
> new file mode 100644
> index 0000000..a94e839
> --- /dev/null
> +++ b/lib/dmadev/rte_dmadev.c
> @@ -0,0 +1,438 @@
> +/* SPDX-License-Identifier: BSD-3-Clause
> + * Copyright 2021 HiSilicon Limited.
> + */
> +
> +#include <ctype.h>
> +#include <stdlib.h>
> +#include <string.h>
> +#include <stdint.h>
> +
> +#include <rte_log.h>
> +#include <rte_debug.h>
> +#include <rte_dev.h>
> +#include <rte_memory.h>
> +#include <rte_memzone.h>
> +#include <rte_malloc.h>
> +#include <rte_errno.h>
> +#include <rte_string_fns.h>

Sort in alphabetical order.

> +
> +#include "rte_dmadev.h"
> +#include "rte_dmadev_pmd.h"
> +
> +struct rte_dmadev rte_dmadevices[RTE_DMADEV_MAX_DEVS];

# Please check whether you have missed any multiprocess angle.
lib/regexdev/rte_regexdev.c is the latest device class implemented in
DPDK; please check the *rte_regexdev_shared_data scheme.
# Dynamic logging is missing for this library.

> diff --git a/lib/dmadev/rte_dmadev.h b/lib/dmadev/rte_dmadev.h
> new file mode 100644
> index 0000000..f74fc6a
> --- /dev/null
> +++ b/lib/dmadev/rte_dmadev.h
> @@ -0,0 +1,919 @@
> +/* SPDX-License-Identifier: BSD-3-Clause
> + * Copyright 2021 HiSilicon Limited.

It would be nice to add the names of the other companies who have
contributed to the specification.

> + */
> +
> +#ifndef _RTE_DMADEV_H_
> +#define _RTE_DMADEV_H_
> +
> +/**
> + * @file rte_dmadev.h
> + *
> + * RTE DMA (Direct Memory Access) device APIs.
> + *
> + * The generic DMA device diagram:
> + *
> + *            ------------     ------------
> + *            | HW-queue |     | HW-queue |
> + *            ------------     ------------
> + *                   \            /
> + *                    \          /
> + *                     \        /
> + *                  ----------------
> + *                  |dma-controller|
> + *                  ----------------
> + *
> + * The DMA could have multiple HW-queues, each HW-queue could have multiple
> + * capabilities, e.g. whether to support fill operation, supported DMA
> + * transfter direction and etc.
typo

> + *
> + * The DMA framework is built on the following abstraction model:
> + *
> + *     ------------    ------------
> + *     |virt-queue|    |virt-queue|
> + *     ------------    ------------
> + *            \           /
> + *             \         /
> + *              \       /
> + *     ------------    ------------
> + *     | HW-queue |    | HW-queue |
> + *     ------------    ------------
> + *            \           /
> + *             \         /
> + *              \       /
> + *             ----------
> + *             | dmadev |
> + *             ----------

Continuing the discussion with @Morten Brørup, I think we need to
finalize the model.

> + * a) The DMA operation request must be submitted to the virt queue, virt
> + *    queues must be created based on HW queues, the DMA device could have
> + *    multiple HW queues.
> + * b) The virt queues on the same HW-queue could represent different contexts,
> + *    e.g. user could create virt-queue-0 on HW-queue-0 for mem-to-mem
> + *    transfer scenario, and create virt-queue-1 on the same HW-queue for
> + *    mem-to-dev transfer scenario.
> + *    NOTE: user could also create multiple virt queues for mem-to-mem
> + *    transfer scenario as long as the corresponding driver supports.
> + *
> + * The control plane APIs include configure/queue_setup/queue_release/start/
> + * stop/reset/close, in order to start device work, the call sequence must be
> + * as follows:
> + *     - rte_dmadev_configure()
> + *     - rte_dmadev_queue_setup()
> + *     - rte_dmadev_start()

Please add the reconfigure behaviour etc. Please check the
lib/regexdev/rte_regexdev.h introduction; I have added similar ones
there, so you could reuse as much as possible.

> + * The dataplane APIs include two parts:
> + *   a) The first part is the submission of operation requests:
> + *        - rte_dmadev_copy()
> + *        - rte_dmadev_copy_sg() - scatter-gather form of copy
> + *        - rte_dmadev_fill()
> + *        - rte_dmadev_fill_sg() - scatter-gather form of fill
> + *        - rte_dmadev_fence()   - add a fence force ordering between operations
> + *        - rte_dmadev_perform() - issue doorbell to hardware
> + *      These APIs could work with different virt queues which have different
> + *      contexts.
> + *      The first four APIs are used to submit the operation request to the
> + *      virt queue, if the submission is successful, a cookie (as type
> + *      'dma_cookie_t') is returned, otherwise a negative number is returned.
> + *   b) The second part is to obtain the result of requests:
> + *        - rte_dmadev_completed()
> + *            - return the number of operation requests completed successfully.
> + *        - rte_dmadev_completed_fails()
> + *            - return the number of operation requests failed to complete.
> + *
> + * The misc APIs include info_get/queue_info_get/stats/xstats/selftest, provide
> + * information query and self-test capabilities.
> + *
> + * About the dataplane APIs MT-safe, there are two dimensions:
> + *   a) For one virt queue, the submit/completion API could be MT-safe,
> + *      e.g. one thread do submit operation, another thread do completion
> + *      operation.
> + *      If driver support it, then declare RTE_DMA_DEV_CAPA_MT_VQ.
> + *      If driver don't support it, it's up to the application to guarantee
> + *      MT-safe.
> + *   b) For multiple virt queues on the same HW queue, e.g. one thread do
> + *      operation on virt-queue-0, another thread do operation on virt-queue-1.
> + *      If driver support it, then declare RTE_DMA_DEV_CAPA_MT_MVQ.
> + *      If driver don't support it, it's up to the application to guarantee
> + *      MT-safe.

From an application PoV, this makes it hard to write portable
applications. Please check the latest thread with @Morten Brørup.
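To make the expected call sequence concrete, here is a minimal usage
sketch against this v1 API. The configure/queue_setup/start prototypes
are not visible in this hunk, so those signatures (and queue_setup
returning status for a fixed vq_id) are assumptions; only the datapath
calls match the patch:

#include <stdbool.h>
#include <rte_dmadev.h>

static int
dma_copy_one(uint16_t dev_id, void *src, void *dst, uint32_t len)
{
	struct rte_dmadev_conf conf = {
		.addr_type = DMA_ADDRESS_TYPE_VA,
		.nb_hw_queues = 1,
		.max_vqs = 1,
	};
	struct rte_dmadev_queue_conf qconf = {
		.direction = DMA_MEM_TO_MEM,
		.hw_queue_id = 0,
		.nb_desc = 128,
	};
	const uint16_t vq_id = 0;
	dma_cookie_t cookie;
	bool has_error;

	/* Control plane: assumed signatures, not shown in this hunk. */
	if (rte_dmadev_configure(dev_id, &conf) < 0 ||
	    rte_dmadev_queue_setup(dev_id, &qconf) < 0 ||
	    rte_dmadev_start(dev_id) < 0)
		return -1;

	/* Data plane: enqueue, ring the doorbell, then poll. */
	cookie = rte_dmadev_copy(dev_id, vq_id, src, dst, len, 0);
	if (cookie < 0)
		return -1;
	rte_dmadev_perform(dev_id, vq_id);
	while (rte_dmadev_completed(dev_id, vq_id, 1, &cookie, &has_error) == 0)
		;
	return has_error ? -1 : 0;
}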
> + */
> +
> +#ifdef __cplusplus
> +extern "C" {
> +#endif
> +
> +#include <rte_common.h>
> +#include <rte_memory.h>
> +#include <rte_errno.h>
> +#include <rte_compat.h>

Sort in alphabetical order.

> +
> +/**
> + * dma_cookie_t - an opaque DMA cookie

Since we are defining the behaviour, it is not opaque any more. I
think it is better to call it ring_idx or so.

> +#define RTE_DMA_DEV_CAPA_MT_MVQ (1ull << 11) /**< Support MT-safe of multiple virt queues */

Please add lots of @see references for all symbols, pointing to where
each one is used, so that one can understand the full scope of a
symbol. See the example below.

#define RTE_REGEXDEV_CAPA_RUNTIME_COMPILATION_F (1ULL << 0)
/**< RegEx device does support compiling the rules at runtime unlike
 * loading only the pre-built rule database using
 * struct rte_regexdev_config::rule_db in rte_regexdev_configure()
 *
 * @see struct rte_regexdev_config::rule_db, rte_regexdev_configure()
 * @see struct rte_regexdev_info::regexdev_capa
 */

> + *
> + * If dma_cookie_t is >=0 it's a DMA operation request cookie, <0 it's a error
> + * code.
> + * When using cookies, comply with the following rules:
> + * a) Cookies for each virtual queue are independent.
> + * b) For a virt queue, the cookie are monotonically incremented, when it reach
> + *    the INT_MAX, it wraps back to zero.
> + * c) The initial cookie of a virt queue is zero, after the device is stopped or
> + *    reset, the virt queue's cookie needs to be reset to zero.
> + * Example:
> + *    step-1: start one dmadev
> + *    step-2: enqueue a copy operation, the cookie return is 0
> + *    step-3: enqueue a copy operation again, the cookie return is 1
> + *    ...
> + *    step-101: stop the dmadev
> + *    step-102: start the dmadev
> + *    step-103: enqueue a copy operation, the cookie return is 0
> + *    ...
> + */

Good explanation.
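One practical consequence of rule (b): applications tracking
outstanding requests need a wrap-aware comparison, since the cookie
wraps at INT_MAX rather than at the natural integer width. A
hypothetical application-side helper (not part of the proposed API)
could look like:

#include <stdbool.h>
#include <stdint.h>
#include <rte_dmadev.h>

/* Wrap-aware "has cookie 'target' completed?" check, assuming cookies
 * increment monotonically, wrap from INT_MAX back to 0, and fewer
 * than 2^30 requests are outstanding at any time. */
static inline bool
dma_cookie_reached(dma_cookie_t last_completed, dma_cookie_t target)
{
	/* Distance modulo INT_MAX + 1. */
	uint32_t d = ((uint32_t)last_completed - (uint32_t)target) &
		     0x7fffffffu;
	return d < 0x40000000u;
}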
> +typedef int32_t dma_cookie_t;
> +
> +/**
> + * dma_scatterlist - can hold scatter DMA operation request
> + */
> +struct dma_scatterlist {

I prefer to change scatterlist -> sg, i.e. rte_dma_sg.

> +	void *src;
> +	void *dst;
> +	uint32_t length;
> +};
> +
> +
> +/**
> + * A structure used to retrieve the contextual information of
> + * an DMA device
> + */
> +struct rte_dmadev_info {
> +	/**
> +	 * Fields filled by framewok

typo

> +	 */
> +	struct rte_device *device; /**< Generic Device information */
> +	const char *driver_name; /**< Device driver name */
> +	int socket_id; /**< Socket ID where memory is allocated */
> +
> +	/**
> +	 * Specification fields filled by driver
> +	 */
> +	uint64_t dev_capa; /**< Device capabilities (RTE_DMA_DEV_CAPA_) */
> +	uint16_t max_hw_queues; /**< Maximum number of HW queues. */
> +	uint16_t max_vqs_per_hw_queue;
> +	/**< Maximum number of virt queues to allocate per HW queue */
> +	uint16_t max_desc;
> +	/**< Maximum allowed number of virt queue descriptors */
> +	uint16_t min_desc;
> +	/**< Minimum allowed number of virt queue descriptors */

Please add max_nb_segs, i.e. the maximum number of segments supported.

> +
> +	/**
> +	 * Status fields filled by driver
> +	 */
> +	uint16_t nb_hw_queues; /**< Number of HW queues configured */
> +	uint16_t nb_vqs; /**< Number of virt queues configured */
> +};
> +
> +/**
> + * dma_address_type
> + */
> +enum dma_address_type {
> +	DMA_ADDRESS_TYPE_IOVA, /**< Use IOVA as dma address */
> +	DMA_ADDRESS_TYPE_VA, /**< Use VA as dma address */
> +};
> +
> +/**
> + * A structure used to configure a DMA device.
> + */
> +struct rte_dmadev_conf {
> +	enum dma_address_type addr_type; /**< Address type to used */

I think there are 3 kinds of limitations/capabilities here.

When the system is configured as IOVA as VA:
1) Device supports any VA address, like memory from rte_malloc(),
   rte_memzone(), malloc, stack memory.
2) Device supports only VA addresses from rte_malloc(), rte_memzone(),
   i.e. memory backed by hugepages and added to the DMA map.

When the system is configured as IOVA as PA:
1) Devices support only PA addresses.

IMO, the above needs to be advertised as a capability and the
application needs to align with that; I don't think the application
requests the driver to work in any of the modes.

> +	uint16_t nb_hw_queues; /**< Number of HW-queues enable to use */
> +	uint16_t max_vqs; /**< Maximum number of virt queues to use */

You need to document what the max value allowed is etc., i.e. that it
is based on info_get(), and mention the corresponding field in the
info structure.

> +
> +/**
> + * dma_transfer_direction
> + */
> +enum dma_transfer_direction {

rte_dma_transfer_direction

> +	DMA_MEM_TO_MEM,
> +	DMA_MEM_TO_DEV,
> +	DMA_DEV_TO_MEM,
> +	DMA_DEV_TO_DEV,
> +};
> +
> +/**
> + * A structure used to configure a DMA virt queue.
> + */
> +struct rte_dmadev_queue_conf {
> +	enum dma_transfer_direction direction;
> +	/**< Associated transfer direction */
> +	uint16_t hw_queue_id; /**< The HW queue on which to create virt queue */
> +	uint16_t nb_desc; /**< Number of descriptor for this virt queue */
> +	uint64_t dev_flags; /**< Device specific flags */

What is the use of this? It needs more comments. Since it is in the
slowpath, we can have non-opaque names here based on each driver
capability.

> +	void *dev_ctx; /**< Device specific context */

What is the use of this? It needs more comments.

Please add a good amount of reserved bits and an API to init this
structure for future ABI stability, say rte_dmadev_queue_config_init()
or so.

> +
> +/**
> + * A structure used to retrieve information of a DMA virt queue.
> + */
> +struct rte_dmadev_queue_info {
> +	enum dma_transfer_direction direction;

A queue may support all directions, so I think it should be a
bitfield.

> +	/**< Associated transfer direction */
> +	uint16_t hw_queue_id; /**< The HW queue on which to create virt queue */
> +	uint16_t nb_desc; /**< Number of descriptor for this virt queue */
> +	uint64_t dev_flags; /**< Device specific flags */
> +};
> +
> +__rte_experimental
> +static inline dma_cookie_t
> +rte_dmadev_copy_sg(uint16_t dev_id, uint16_t vq_id,
> +		   const struct dma_scatterlist *sg,
> +		   uint32_t sg_len, uint64_t flags)

I would like to change this to:

rte_dmadev_copy_sg(uint16_t dev_id, uint16_t vq_id,
		   const struct rte_dma_sg *src, uint32_t nb_src,
		   const struct rte_dma_sg *dst, uint32_t nb_dst)

or so, to allow use cases like a 30 MB src copy being split up and
written as 1 MB x 30 dst segments.
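For illustration, that use case would look like the sketch below. The
rte_dma_sg {addr, length} layout is an assumption (the struct is only
being proposed here; the patch's dma_scatterlist packs src/dst
together):

/* Proposed per-segment descriptor; layout assumed. */
struct rte_dma_sg {
	void *addr;
	uint32_t length;
};

#define MB (1024u * 1024u)

/* Hypothetical use of the suggested signature: one 30 MB source
 * segment fanned out into 30 x 1 MB destination segments. */
static dma_cookie_t
copy_30mb_split(uint16_t dev_id, uint16_t vq_id,
		void *src_buf, void *dst_bufs[30])
{
	struct rte_dma_sg src = { .addr = src_buf, .length = 30 * MB };
	struct rte_dma_sg dst[30];
	uint32_t i;

	for (i = 0; i < 30; i++) {
		dst[i].addr = dst_bufs[i];
		dst[i].length = MB;
	}
	return rte_dmadev_copy_sg(dev_id, vq_id, &src, 1, dst, 30);
}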
> +{
> +	struct rte_dmadev *dev = &rte_dmadevices[dev_id];
> +	return (*dev->copy_sg)(dev, vq_id, sg, sg_len, flags);
> +}
> +
> +/**
> + * @warning
> + * @b EXPERIMENTAL: this API may change without prior notice.
> + *
> + * Enqueue a fill operation onto the DMA virt queue
> + *
> + * This queues up a fill operation to be performed by hardware, but does not
> + * trigger hardware to begin that operation.
> + *
> + * @param dev_id
> + *   The identifier of the device.
> + * @param vq_id
> + *   The identifier of virt queue.
> + * @param pattern
> + *   The pattern to populate the destination buffer with.
> + * @param dst
> + *   The address of the destination buffer.
> + * @param length
> + *   The length of the destination buffer.
> + * @param flags
> + *   An opaque flags for this operation.

PLEASE REMOVE the opaque stuff from the fastpath; it will be a pain
for application writers, as they would need to write multiple
combinations of the fastpath. Flags are OK if we have a valid generic
flag now to control the transfer behavior.

> +/**
> + * @warning
> + * @b EXPERIMENTAL: this API may change without prior notice.
> + *
> + * Add a fence to force ordering between operations
> + *
> + * This adds a fence to a sequence of operations to enforce ordering, such that
> + * all operations enqueued before the fence must be completed before operations
> + * after the fence.
> + * NOTE: Since this fence may be added as a flag to the last operation enqueued,
> + * this API may not function correctly when called immediately after an
> + * "rte_dmadev_perform" call i.e. before any new operations are enqueued.
> + *
> + * @param dev_id
> + *   The identifier of the device.
> + * @param vq_id
> + *   The identifier of virt queue.
> + *
> + * @return
> + *   - =0: Successful add fence.
> + *   - <0: Failure to add fence.
> + *
> + * NOTE: The caller must ensure that the input parameter is valid and the
> + *       corresponding device supports the operation.
> + */
> +__rte_experimental
> +static inline int
> +rte_dmadev_fence(uint16_t dev_id, uint16_t vq_id)
> +{
> +	struct rte_dmadev *dev = &rte_dmadevices[dev_id];
> +	return (*dev->fence)(dev, vq_id);
> +}

Since HW submission is in a queue (FIFO), the ordering is always
maintained, right? Could you share more details and the use case of
fence() from a driver/application PoV?

> +
> +/**
> + * @warning
> + * @b EXPERIMENTAL: this API may change without prior notice.
> + *
> + * Trigger hardware to begin performing enqueued operations
> + *
> + * This API is used to write the "doorbell" to the hardware to trigger it
> + * to begin the operations previously enqueued by rte_dmadev_copy/fill()
> + *
> + * @param dev_id
> + *   The identifier of the device.
> + * @param vq_id
> + *   The identifier of virt queue.
> + *
> + * @return
> + *   - =0: Successful trigger hardware.
> + *   - <0: Failure to trigger hardware.
> + *
> + * NOTE: The caller must ensure that the input parameter is valid and the
> + *       corresponding device supports the operation.
> + */
> +__rte_experimental
> +static inline int
> +rte_dmadev_perform(uint16_t dev_id, uint16_t vq_id)
> +{
> +	struct rte_dmadev *dev = &rte_dmadevices[dev_id];
> +	return (*dev->perform)(dev, vq_id);
> +}

Since this scheme adds function call overhead in all the applications,
I would like to understand the benefit of doing it this way vs. the
enqueue ringing the doorbell implicitly, from a driver/application
PoV.
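For the record, the usual case for a separate doorbell is batching: N
enqueues can share one rte_dmadev_perform() call, amortizing what is
typically an MMIO write. A sketch of that pattern, using only the
calls defined by this patch (helper name is hypothetical):

/* Enqueue a burst of copies and ring the doorbell once for all of
 * them, instead of an implicit doorbell per enqueue. */
static int
dma_copy_burst(uint16_t dev_id, uint16_t vq_id,
	       void *src[], void *dst[], uint32_t len, uint16_t n)
{
	uint16_t i;

	for (i = 0; i < n; i++)
		if (rte_dmadev_copy(dev_id, vq_id, src[i], dst[i],
				    len, 0) < 0)
			break;
	if (i == 0)
		return -1;
	return rte_dmadev_perform(dev_id, vq_id); /* one doorbell, i ops */
}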
> +
> +/**
> + * @warning
> + * @b EXPERIMENTAL: this API may change without prior notice.
> + *
> + * Returns the number of operations that have been successful completed.
> + *
> + * @param dev_id
> + *   The identifier of the device.
> + * @param vq_id
> + *   The identifier of virt queue.
> + * @param nb_cpls
> + *   The maximum number of completed operations that can be processed.
> + * @param[out] cookie
> + *   The last completed operation's cookie.
> + * @param[out] has_error
> + *   Indicates if there are transfer error.
> + *
> + * @return
> + *   The number of operations that successful completed.

successfully

> + *
> + * NOTE: The caller must ensure that the input parameter is valid and the
> + *       corresponding device supports the operation.
> + */
> +__rte_experimental
> +static inline uint16_t
> +rte_dmadev_completed(uint16_t dev_id, uint16_t vq_id, const uint16_t nb_cpls,
> +		     dma_cookie_t *cookie, bool *has_error)
> +{
> +	struct rte_dmadev *dev = &rte_dmadevices[dev_id];
> +	has_error = false;
> +	return (*dev->completed)(dev, vq_id, nb_cpls, cookie, has_error);

It may be better to have cookie/ring_idx as the third argument.

> +}
> +
> +/**
> + * @warning
> + * @b EXPERIMENTAL: this API may change without prior notice.
> + *
> + * Returns the number of operations that failed to complete.
> + * NOTE: This API was used when rte_dmadev_completed has_error was set.
> + *
> + * @param dev_id
> + *   The identifier of the device.
> + * @param vq_id
> + *   The identifier of virt queue.
> + * @param nb_status
> + *   Indicates the size of status array.
> + * @param[out] status
> + *   The error code of operations that failed to complete.
> + * @param[out] cookie
> + *   The last failed completed operation's cookie.
> + *
> + * @return
> + *   The number of operations that failed to complete.
> + *
> + * NOTE: The caller must ensure that the input parameter is valid and the
> + *       corresponding device supports the operation.
> + */
> +__rte_experimental
> +static inline uint16_t
> +rte_dmadev_completed_fails(uint16_t dev_id, uint16_t vq_id,
> +			   const uint16_t nb_status, uint32_t *status,
> +			   dma_cookie_t *cookie)

IMO, it is better to move cookie/ring_idx to the third argument.

Why would it return an array of errors, given that it is called after
rte_dmadev_completed() has set has_error? Is it better to change it to

rte_dmadev_error_status(uint16_t dev_id, uint16_t vq_id,
			dma_cookie_t *cookie, uint32_t *status)

I also think we may need to define status as a bitmask and enumerate
all the combinations of error codes across all the drivers, and return
a string from the driver, like the existing rte_flow_error. See

struct rte_flow_error {
	enum rte_flow_error_type type; /**< Cause field and error types. */
	const void *cause; /**< Object responsible for the error. */
	const char *message; /**< Human-readable error message. */
};
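Put together, the suggested alternative could look like the sketch
below. All names and the struct layout are proposals for discussion,
not part of this patch:

/* Hypothetical replacement for rte_dmadev_completed_fails(): queried
 * once after rte_dmadev_completed() reports has_error, it returns the
 * failing cookie plus a driver-filled error description, in the style
 * of rte_flow_error. */
struct rte_dmadev_error {
	uint32_t status;      /* bitmask of standardized error causes */
	const char *message;  /* human-readable message from the driver */
};

__rte_experimental
int
rte_dmadev_error_status(uint16_t dev_id, uint16_t vq_id,
			dma_cookie_t *cookie,
			struct rte_dmadev_error *error);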
> +{
> +	struct rte_dmadev *dev = &rte_dmadevices[dev_id];
> +	return (*dev->completed_fails)(dev, vq_id, nb_status, status, cookie);
> +}
> +
> +struct rte_dmadev_stats {
> +	uint64_t enqueue_fail_count;
> +	/**< Conut of all operations which failed enqueued */
> +	uint64_t enqueued_count;
> +	/**< Count of all operations which successful enqueued */
> +	uint64_t completed_fail_count;
> +	/**< Count of all operations which failed to complete */
> +	uint64_t completed_count;
> +	/**< Count of all operations which successful complete */
> +};

We need to have a capability API to tell which items are
updated/supported by the driver.

> diff --git a/lib/dmadev/rte_dmadev_core.h b/lib/dmadev/rte_dmadev_core.h
> new file mode 100644
> index 0000000..a3afea2
> --- /dev/null
> +++ b/lib/dmadev/rte_dmadev_core.h
> @@ -0,0 +1,98 @@
> +/* SPDX-License-Identifier: BSD-3-Clause
> + * Copyright 2021 HiSilicon Limited.
> + */
> +
> +#ifndef _RTE_DMADEV_CORE_H_
> +#define _RTE_DMADEV_CORE_H_
> +
> +/**
> + * @file
> + *
> + * RTE DMA Device internal header.
> + *
> + * This header contains internal data types. But they are still part of the
> + * public API because they are used by inline public functions.
> + */
> +
> +struct rte_dmadev;
> +
> +typedef dma_cookie_t (*dmadev_copy_t)(struct rte_dmadev *dev, uint16_t vq_id,
> +				      void *src, void *dst,
> +				      uint32_t length, uint64_t flags);
> +/**< @internal Function used to enqueue a copy operation. */

To avoid namespace conflict (as it is a public API), use the rte_
prefix.

> +
> +/**
> + * The data structure associated with each DMA device.
> + */
> +struct rte_dmadev {
> +	/**< Enqueue a copy operation onto the DMA device. */
> +	dmadev_copy_t copy;
> +	/**< Enqueue a scatter list copy operation onto the DMA device. */
> +	dmadev_copy_sg_t copy_sg;
> +	/**< Enqueue a fill operation onto the DMA device. */
> +	dmadev_fill_t fill;
> +	/**< Enqueue a scatter list fill operation onto the DMA device. */
> +	dmadev_fill_sg_t fill_sg;
> +	/**< Add a fence to force ordering between operations. */
> +	dmadev_fence_t fence;
> +	/**< Trigger hardware to begin performing enqueued operations. */
> +	dmadev_perform_t perform;
> +	/**< Returns the number of operations that successful completed. */
> +	dmadev_completed_t completed;
> +	/**< Returns the number of operations that failed to complete. */
> +	dmadev_completed_fails_t completed_fails;

We need to limit the fastpath items to one cache line.

> +
> +	void *dev_private; /**< PMD-specific private data */
> +	const struct rte_dmadev_ops *dev_ops; /**< Functions exported by PMD */
> +
> +	uint16_t dev_id; /**< Device ID for this instance */
> +	int socket_id; /**< Socket ID where memory is allocated */
> +	struct rte_device *device;
> +	/**< Device info. supplied during device initialization */
> +	const char *driver_name; /**< Driver info. supplied by probing */
> +	char name[RTE_DMADEV_NAME_MAX_LEN]; /**< Device name */
> +
> +	RTE_STD_C11
> +	uint8_t attached : 1; /**< Flag indicating the device is attached */
> +	uint8_t started : 1; /**< Device state: STARTED(1)/STOPPED(0) */

Add a couple of reserved fields for future ABI stability.

> +
> +} __rte_cache_aligned;
> +
> +extern struct rte_dmadev rte_dmadevices[];
> +
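As a concrete way to enforce the one-cache-line constraint mentioned
above, the fastpath pointers could be grouped and size-checked at
build time. A sketch only, not the patch's layout (the struct name is
hypothetical; the eight function-pointer typedefs are the ones from
the quoted header):

#include <rte_common.h>

/* Keep only the 8 fastpath function pointers in the leading cache
 * line; on 64-bit targets, 8 x 8-byte pointers exactly fill a 64-byte
 * cache line. */
struct rte_dmadev_fastpath {
	dmadev_copy_t copy;
	dmadev_copy_sg_t copy_sg;
	dmadev_fill_t fill;
	dmadev_fill_sg_t fill_sg;
	dmadev_fence_t fence;
	dmadev_perform_t perform;
	dmadev_completed_t completed;
	dmadev_completed_fails_t completed_fails;
} __rte_cache_aligned;

/* Let the compiler enforce the constraint. */
_Static_assert(sizeof(struct rte_dmadev_fastpath) <= RTE_CACHE_LINE_SIZE,
	       "fastpath ops must fit in one cache line");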