06/10/2022 13:00, Dariusz Sosnowski:
> Hairpin queues are used to transmit packets received on the wire back to the wire.
> How hairpin queues are implemented and configured is decided internally by the PMD, and
> applications have no control over the configuration of Rx and Tx hairpin queues.
> This patchset addresses that by:
>
> - Extending hairpin queue capabilities reported by PMDs.
> - Exposing new configuration options for Rx and Tx hairpin queues.
>
> The main goal of this patchset is to allow applications to provide configuration hints
> regarding memory placement of hairpin queues.
> These hints specify whether buffers of hairpin queues should be placed in host memory
> or in dedicated device memory.
>
> For example, in the context of NVIDIA ConnectX and BlueField devices,
> this distinction is important for several reasons:
>
> - By default, data buffers and packet descriptors are placed in a device memory region
>   which is shared with other resources (e.g. flow rules).
>   This results in memory contention on the device,
>   which may lead to degraded performance under heavy load.
> - Placing hairpin queues in dedicated device memory can decrease latency of hairpinned traffic,
>   since hairpin queue processing will not be memory starved by other operations.
>   A side effect of this memory configuration is that it leaves less memory for other resources,
>   possibly causing memory contention in non-hairpin traffic.
> - Placing hairpin queues in host memory can increase throughput of hairpinned traffic
>   at the cost of increased latency.
>   Each packet processed by hairpin queues will incur additional PCI transactions (an increase in latency),
>   but memory contention on the device is avoided.
>
> Depending on the workload and whether throughput or latency has the higher priority for developers,
> it would be beneficial if developers could choose the best hairpin configuration for their use case.
>
> To address that, this patchset adds the following configuration options (in the rte_eth_hairpin_conf struct):
>
> - use_locked_device_memory - If set, the PMD will allocate specialized on-device memory for the queue.
> - use_rte_memory - If set, the PMD will use DPDK-managed memory for the queue.
> - force_memory - If set, the PMD will be forced to use the provided memory configuration.
>   If no appropriate resources are available, the queue allocation will fail.
>   If unset and no appropriate resources are available, the PMD will fall back to its default behavior.
>
> Implementing support for these flags is optional and applications should be allowed to not set any of these new flags.
> This will result in the default memory configuration provided by the PMD.
> Application developers should consult the PMD documentation in that case.
>
> These changes were originally proposed in
> http://patches.dpdk.org/project/dpdk/patch/20220811120530.191683-1-dsosnow...@nvidia.com/.
>
> Dariusz Sosnowski (8):
>   ethdev: introduce hairpin memory capabilities
>   common/mlx5: add hairpin SQ buffer type capabilities
>   common/mlx5: add hairpin RQ buffer type capabilities
>   net/mlx5: allow hairpin Tx queue in RTE memory
>   net/mlx5: allow hairpin Rx queue in locked memory
>   doc: add notes for hairpin to mlx5 documentation
>   app/testpmd: add hairpin queues memory modes
>   app/flow-perf: add hairpin queue memory config
Doc squashed in mlx5 commits. Applied, thanks.
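For readers who want to see how the new options fit into the existing ethdev hairpin API, below is a minimal sketch of an application requesting DPDK-managed host memory for an Rx hairpin queue. It assumes the capability bits are exposed as rx_cap.rte_memory / rx_cap.locked_device_memory in struct rte_eth_hairpin_cap, as described by the ethdev patch of this series; the helper function name and the peer wiring are illustrative only, not part of the patchset.

#include <rte_ethdev.h>

static int
setup_rx_hairpin_host_mem(uint16_t port_id, uint16_t queue_id,
			  uint16_t peer_port, uint16_t peer_queue)
{
	struct rte_eth_hairpin_cap cap;
	struct rte_eth_hairpin_conf conf = {
		.peer_count = 1,
		.peers[0] = { .port = peer_port, .queue = peer_queue },
	};
	int ret;

	/* Ask the PMD which hairpin memory types it supports. */
	ret = rte_eth_dev_hairpin_capability_get(port_id, &cap);
	if (ret != 0)
		return ret;

	if (cap.rx_cap.rte_memory) {
		/* Place Rx hairpin buffers in DPDK-managed host memory. */
		conf.use_rte_memory = 1;
		/* Fail the setup rather than fall back to the PMD default. */
		conf.force_memory = 1;
	}

	/* nb_desc == 0 lets the PMD pick its default ring size. */
	return rte_eth_rx_hairpin_queue_setup(port_id, queue_id, 0, &conf);
}

A Tx hairpin queue would be configured the same way through rte_eth_tx_hairpin_queue_setup(), checking tx_cap instead of rx_cap. If none of the new flags is set, the PMD keeps its default memory placement, as noted in the cover letter.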