Hi Tim,

>  doc/guides/eventdevs/dlb.rst | 497 ++++++++++++++++++++++++++++++++++++++++++
>  1 file changed, 497 insertions(+)
>  create mode 100644 doc/guides/eventdevs/dlb.rst
>
> diff --git a/doc/guides/eventdevs/dlb.rst b/doc/guides/eventdevs/dlb.rst
> new file mode 100644
> index 0000000..21e48fe
> --- /dev/null
> +++ b/doc/guides/eventdevs/dlb.rst
> @@ -0,0 +1,497 @@
> +.. SPDX-License-Identifier: BSD-3-Clause
> +   Copyright(c) 2020 Intel Corporation.
> +
> +Driver for the Intel® Dynamic Load Balancer (DLB)
> +==================================================
> +
> +The DPDK dlb poll mode driver supports the Intel® Dynamic Load Balancer.
> +
> +.. note::
> +
> +   This PMD is disabled by default in the build configuration files, owing to
> +   an external dependency on the `Netlink Protocol Library Suite
> +   <http://www.infradead.org/~tgr/libnl/>`_ (libnl-3 and libnl-genl-3) which
> +   must be installed on the board. Once the Netlink libraries are installed,
> +   the PMD can be enabled by setting CONFIG_RTE_LIBRTE_PMD_DLB_QM=y and
> +   recompiling the DPDK.
> +

This description appears to be out-of-date.

> +Prerequisites
> +-------------
> +
> +- Follow the DPDK :ref:`Getting Started Guide for Linux <linux_gsg>` to setup
> +  the basic DPDK environment.
> +
> +- Learn about the DLB device and its capabilities at `Intel Support
> +  <http://www.intel.com/support>`_. FIXME: Add real link when documentation
> +  becomes available.

Leftover FIXME

> +
> +- The DLB kernel module. If it is not included in the machine's OS
> +  distribution, download it from <FIXME: Add 01.org link when available> and
> +  follow the build instructions.
> +

Leftover FIXME

<snip>

> +The hybrid timeout data structures are currently located in
> +drivers/event/dlb/dlb_timeout.h:
> +
> +.. code-block:: c
> +
> +   struct rte_hybrid_timeout_ticks_64 {
> +       RTE_STD_C11
> +       union {
> +           uint64_t val64;
> +           struct {
> +               uint64_t poll_ticks:62;
> +               uint64_t umonitor_wait:1;
> +               uint64_t interrupt_wait:1;
> +           };
> +       };
> +   };
> +   struct rte_hybrid_timeout_ns_32 {
> +       RTE_STD_C11
> +       union {
> +           uint32_t val32;
> +           struct {
> +               uint32_t poll_ns:30;
> +               uint32_t umonitor_wait:1;
> +               uint32_t interrupt_wait:1;
> +           };
> +       };
> +   };

Is this description correct? dlb_timeout.h isn't introduced in this patchset.

> +
> +VAS Configuration
> +~~~~~~~~~~~~~~~~~
> +
> +A VAS is a scheduling domain, of which there are 32 in the DLB.
> +(Producer ports in one VAS cannot enqueue events to a different VAS,
> +except through the `Data Mover`_.) When a VAS is configured, it

I believe this cross-VAS comment is out-of-date.

> +allocates load-balanced and directed queues, ports, credits, and other
> +hardware resources. Some VAS resource allocations are user-controlled --
> +the number of queues, for example -- and others, like credit pools (one
> +directed and one load-balanced pool per VAS), are not.
> +
> +The dlb PMD creates a single VAS per DLB device. Supporting multiple
> +VASes per DLB device is a planned feature, where each VAS will be
> +represented as a separate event device.

Is this comment up-to-date? Patch 16 ("event/dlb: add infos_get and configure")
indicates that multiple event devices are supported.

<snip>

> +Hardware Credits
> +~~~~~~~~~~~~~~~~
> +
> +DLB uses a hardware credit scheme to prevent software from overflowing
> +hardware event storage, with each unit of storage represented by a
> +credit. A port spends a credit to enqueue an event, and hardware
> +refills the ports with credits as the events are scheduled to ports.
> +Refills come from credit pools, and each port is a member of a
> +load-balanced credit pool and a directed credit pool. The load-balanced
> +credits are used to enqueue to load-balanced queues, and directed
> +credits are used for directed queues.
> +
> +An dlb eventdev contains one load-balanced and one directed credit

"An dlb" -> "A dlb"

> +pool. These pools' sizes are controlled by the nb_events_limit field in
> +struct rte_event_dev_config. The load-balanced pool is sized to contain
> +nb_events_limit credits, and the directed pool is sized to contain
> +nb_events_limit/4 credits. The directed pool size can be overriden with
> +the num_dir_credits vdev argument, like so:
> +
> +    .. code-block:: console
> +
> +       --vdev=dlb1_event,num_dir_credits=<value>
> +
> +This can be used if the default allocation is too low or too high for
> +the specific application needs. The PMD also supports a vdev arg that
> +limits the max_num_events reported by rte_event_dev_info_get():
> +
> +    .. code-block:: console
> +
> +       --vdev=dlb1_event,max_num_events=<value>
> +
> +By default, max_num_events is reported as the total available
> +load-balanced credits. If multiple DLB-based applications are being
> +used, it may be desirable to control how many load-balanced credits
> +each application uses, particularly when application(s) are written to
> +configure nb_events_limit equal to the reported max_num_events.
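
Minor suggestion: since the pool sizing is otherwise invisible to the API user,
it might help to show how an application picks these values up. Something along
these lines, perhaps -- just a rough, untested sketch using the generic eventdev
calls; dev_id and the queue/port counts are placeholders, nothing DLB-specific:

#include <rte_eventdev.h>

/* Rough sketch: let the reported max_num_events size the load-balanced
 * credit pool. With nb_events_limit = 4096, the text above implies a
 * 4096-credit load-balanced pool and a 4096/4 = 1024-credit directed
 * pool (unless overridden with the num_dir_credits vdev arg).
 */
static int
setup_dlb_evdev(uint8_t dev_id)
{
        struct rte_event_dev_info info;
        struct rte_event_dev_config cfg = {0};
        int ret;

        ret = rte_event_dev_info_get(dev_id, &info);
        if (ret < 0)
                return ret;

        cfg.nb_events_limit = info.max_num_events;
        cfg.nb_event_queues = 4;                /* placeholder */
        cfg.nb_event_ports = 4;                 /* placeholder */
        cfg.nb_event_queue_flows = info.max_event_queue_flows;
        cfg.nb_event_port_dequeue_depth = info.max_event_port_dequeue_depth;
        cfg.nb_event_port_enqueue_depth = info.max_event_port_enqueue_depth;
        cfg.dequeue_timeout_ns = info.min_dequeue_timeout_ns;

        return rte_event_dev_configure(dev_id, &cfg);
}

Feel free to ignore if you think the existing vdev-arg examples are enough.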
> +
> +Each port is a member of both credit pools. A port's credit allocation
> +is defined by its low watermark, high watermark, and refill quanta.
> +These three parameters are calculated by the dlb PMD like so:
> +
> +- The load-balanced high watermark is set to the port's enqueue_depth.
> +  The directed high watermark is set to the minimum of the enqueue_depth and
> +  the directed pool size divided by the total number of ports.
> +- The refill quanta is set to half the high watermark.
> +- The low watermark is set to the minimum of 8 and the refill quanta.

From patch 19 ("event/dlb: add port_setup"), this should be 16 instead of 8:

+	cfg.ldb_credit_quantum = cfg.ldb_credit_high_watermark / 2;
+	cfg.ldb_credit_low_watermark = RTE_MIN(16, cfg.ldb_credit_quantum);
+
+	cfg.dir_credit_quantum = cfg.dir_credit_high_watermark / 2;
+	cfg.dir_credit_low_watermark = RTE_MIN(16, cfg.dir_credit_quantum);

> +
> +When the eventdev is started, each port is pre-allocated a high
> +watermark's worth of credits. For example, if an eventdev contains four
> +ports with enqueue depths of 32 and a load-balanced credit pool size of
> +4096, each port will start with 32 load-balanced credits, and there
> +will be 3968 credits available to replenish the ports. Thus, a single
> +port is not capable of enqueueing up to the nb_events_limit (without
> +any events being dequeued), since the other ports are retaining their
> +initial credit allocation; in short, all ports must enqueue in order to
> +reach the limit.
> +
> +If a port attempts to enqueue and has no credits available, the enqueue
> +operation will fail and the application must retry the enqueue. Credits
> +are replenished asynchronously by the DLB hardware.
> +
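Since this retry requirement tends to trip people up, maybe show the pattern
explicitly? A minimal, untested sketch (dev_id, port_id, and the event array
are placeholders):

#include <rte_eventdev.h>

/* If the port is out of credits, rte_event_enqueue_burst() enqueues fewer
 * events than requested (possibly zero). Credits are replenished
 * asynchronously by the hardware, so just retry the remainder.
 */
static void
enqueue_with_retry(uint8_t dev_id, uint8_t port_id,
                   struct rte_event *events, uint16_t nb_events)
{
        uint16_t enq = 0;

        while (enq < nb_events)
                enq += rte_event_enqueue_burst(dev_id, port_id,
                                               &events[enq],
                                               nb_events - enq);
}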

<snip>

> +
> +Ordered Fragments
> +~~~~~~~~~~~~~~~~~
> +
> +The DLB has a fourth enqueue type: partial enqueue. When a thread is
> +processing an ordered event, it can perform up to 16 "partial" enqueues,
> +which allows a single received ordered event to result in multiple
> +reordered events.
> +
> +For example, consider the case where three events (A, then B, then C)
> +are enqueued with ordered scheduling and are received by three different
> +ports. The ports that receive A and C forward events A' and C', while
> +the port that receives B generates three partial enqueues -- B1', B2',
> +and B3' -- followed by a release operation. The DLB will reorder the
> +events in the following order:
> +
> +A', B1', B2', B3', C'
> +
> +This functionality is not available explicitly through the eventdev
> +API, but the dlb PMD provides it through an additional (DLB-specific)
> +event operation, RTE_EVENT_DLB_OP_FRAG.

I don't believe this OP type appears in this patchset.

> +
> +Deferred Scheduling
> +~~~~~~~~~~~~~~~~~~~
> +
> +The DLB PMD's default behavior for managing a CQ is to "pop" the CQ
> +once per dequeued event before returning from
> +rte_event_dequeue_burst(). This frees the corresponding entries in the
> +CQ, which enables the DLB to schedule more events to it.
> +
> +To support applications seeking finer-grained scheduling control -- for
> +example deferring scheduling to get the best possible priority
> +scheduling and load-balancing -- the PMD supports a deferred scheduling
> +mode. In this mode, the CQ entry is not popped until the *subsequent*
> +rte_event_dequeue_burst() call. This mode only applies to load-balanced
> +event ports with dequeue depth of 1.
> +
> +To enable deferred scheduling, use the defer_sched vdev argument like so:
> +
> +    .. code-block:: console
> +
> +       --vdev=dlb1_event,defer_sched=on
> +
> +Atomic Inflights Allocation
> +~~~~~~~~~~~~~~~~~~~~~~~~~~~
> +
> +In the last stage prior to scheduling an atomic event to a CQ, DLB
> +holds the inflight event in a temporary buffer that is divided among
> +load-balanced queues. If a queue's atomic buffer storage fills up, this
> +can result in head-of-line-blocking. For example:
> +- An LDB queue allocated N atomic buffer entries
> +- All N entries are filled with events from flow X, which is pinned to CQ 0.
> +
> +Until CQ 0 releases 1+ events, no other atomic flows for that LDB queue
> +can be scheduled. The likelihood of this case depends on the eventdev
> +configuration, traffic behavior, event processing latency, potential
> +for a worker to be interrupted or otherwise delayed, etc.
> +
> +By default, the PMD allocates 16 buffer entries for each load-balanced
> +queue, which provides an even division across all 128 queues but
> +potentially wastes buffer space (e.g. if not all queues are used, or
> +aren't used for atomic scheduling).
> +
> +The PMD provides a dev arg to override the default per-queue
> +allocation. To increase a vdev's per-queue atomic-inflight allocation
> +to (for example) 64:
> +
> +    .. code-block:: console
> +
> +       --vdev=dlb1_event,atm_inflights=64
> +

This section is duplicated below.

> +Atomic Inflights Allocation
> +~~~~~~~~~~~~~~~~~~~~~~~~~~~
> +
> +In the last stage prior to scheduling an atomic event to a CQ, DLB
> +holds the inflight event in a temporary buffer that is divided among
> +load-balanced queues. If a queue's atomic buffer storage fills up, this
> +can result in head-of-line-blocking. For example:
> +- An LDB queue allocated N atomic buffer entries
> +- All N entries are filled with events from flow X, which is pinned to CQ 0.
> +
> +Until CQ 0 releases 1+ events, no other atomic flows for that LDB queue
> +can be scheduled. The likelihood of this case depends on the eventdev
> +configuration, traffic behavior, event processing latency, potential
> +for a worker to be interrupted or otherwise delayed, etc.
> +
> +By default, the PMD allocates 16 buffer entries for each load-balanced
> +queue, which provides an even division across all 128 queues but
> +potentially wastes buffer space (e.g. if not all queues are used, or
> +aren't used for atomic scheduling).
> +
> +The PMD provides a dev arg to override the default per-queue
> +allocation. To increase a vdev's per-queue atomic-inflight allocation
> +to (for example) 64:
> +
> +    .. code-block:: console
> +
> +       --vdev=dlb1_event,atm_inflights=64
> --
> 1.7.10

Thanks,
Gage