This series adds per-queue Tx data-rate limiting to the mlx5 PMD using
hardware packet pacing (PP), and a symmetric rte_eth_get_queue_rate_limit()
ethdev API to read back the configured rate.
Each Tx queue can be assigned an individual rate (in Mbps) at runtime via
rte_eth_set_queue_rate_limit(). The mlx5 implementation allocates a PP
context per queue from the HW rate table, programs the PP index into the
SQ via modify_sq, and relies on the kernel to share identical rates
across PP contexts to conserve table entries. A PMD-specific API exposes
per-queue PP diagnostics and rate table capacity.
Patch breakdown:
01/10 doc/nics/mlx5: fix stale packet pacing documentation
02/10 common/mlx5: query packet pacing rate table capabilities
03/10 common/mlx5: extend SQ modify to support rate limit update
04/10 net/mlx5: add per-queue packet pacing infrastructure
05/10 net/mlx5: support per-queue rate limiting
06/10 net/mlx5: add burst pacing devargs
07/10 net/mlx5: add testpmd command to query per-queue rate limit
08/10 ethdev: add getter for per-queue Tx rate limit
09/10 net/mlx5: implement per-queue Tx rate limit getter
10/10 net/mlx5: add rate table capacity query API
Release notes for the new ethdev API and mlx5 per-queue rate
limiting can be added to a release_26_07.rst once the file is
created at the start of the 26.07 development cycle.
Changes since v3:
Addressed review feedback from Stephen Hemminger and Viacheslav (Slava)
Ovsiienko (NVIDIA/Mellanox).
Patch 02/10 (query caps):
- Added Acked-by: Viacheslav Ovsiienko
Patch 03/10 (SQ modify):
- Define MLX5_MODIFY_SQ_IN_MODIFY_BITMASK_PACKET_PACING_RATE_LIMIT_INDEX
enum in mlx5_prm.h, following the MLX5_MODIFY_RQ_IN_MODIFY_xxx pattern
- Use read-modify-write for modify_bitmask (MLX5_GET64 | OR | MLX5_SET64)
instead of direct overwrite, for forward compatibility
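To illustrate why the read-modify-write form is preferred, here is a
minimal standalone sketch. It does not use the real MLX5_GET64/MLX5_SET64
macros or the real PRM bit values; the DEMO_* names and bit positions are
hypothetical and only model the behaviour on a plain uint64_t.

```c
#include <stdint.h>

/* Hypothetical bit positions, for illustration only.
 * The real values are defined in mlx5_prm.h. */
#define DEMO_SQ_MODIFY_OTHER_FIELD      (UINT64_C(1) << 0)
#define DEMO_SQ_MODIFY_RATE_LIMIT_INDEX (UINT64_C(1) << 1)

/* Direct overwrite: any bit the caller set earlier is lost. */
static uint64_t demo_bitmask_overwrite(uint64_t current, uint64_t flag)
{
	(void)current;
	return flag;
}

/* Read-modify-write: keep what is already set and OR in the new flag. */
static uint64_t demo_bitmask_rmw(uint64_t current, uint64_t flag)
{
	return current | flag;
}
```

With the overwrite variant, a future caller that sets another bit before
requesting a rate limit update would silently lose it; the OR variant
keeps both bits.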
Patch 04/10 (PP infrastructure):
- Rename struct member and parameters from "rl" to "rate_limit"
for consistency with codebase naming style
- Replace MLX5_ASSERT(rate_mbps > 0) with a runtime check that returns
-EINVAL, so the check is also effective in non-debug builds
- Move mlx5_txq_free_pp_rate_limit() to after txq_obj_release() in
mlx5_txq_release(), so that the SQ is destroyed before the PP index
it references is freed
- Clarify commit message: distinct PP handle per queue (for cleanup)
but kernel shares the same pp_id for identical rate parameters
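The handle/pp_id relationship and the runtime rate check can be pictured
with a toy model. This is a sketch under stated assumptions, not the
driver or kernel code: the table size, the refcounting scheme and all
demo_* names are hypothetical. It only shows the observable behaviour
described above: each queue owns its own handle, two handles with the
same rate resolve to the same pp_id, and a zero rate is rejected with
-EINVAL at runtime.

```c
#include <errno.h>
#include <stdint.h>

#define DEMO_TABLE_SIZE 4 /* hypothetical rate table capacity */

struct demo_rate_entry {
	uint32_t rate_mbps; /* rate programmed in this entry */
	uint32_t refcnt;    /* number of handles sharing this entry */
};

struct demo_pp_handle {
	uint16_t pp_id; /* 1-based table index; 0 means "no pacing" */
};

static struct demo_rate_entry demo_table[DEMO_TABLE_SIZE];

/* Allocate a handle for rate_mbps. Returns 0 or a negative errno. */
static int demo_pp_alloc(uint32_t rate_mbps, struct demo_pp_handle *h)
{
	int free_slot = -1;
	int i;

	if (rate_mbps == 0) /* runtime check, also active in release builds */
		return -EINVAL;
	for (i = 0; i < DEMO_TABLE_SIZE; i++) {
		if (demo_table[i].refcnt > 0 &&
		    demo_table[i].rate_mbps == rate_mbps) {
			demo_table[i].refcnt++; /* identical rate: share entry */
			h->pp_id = (uint16_t)(i + 1);
			return 0;
		}
		if (demo_table[i].refcnt == 0 && free_slot < 0)
			free_slot = i;
	}
	if (free_slot < 0)
		return -ENOSPC; /* table exhausted */
	demo_table[free_slot].rate_mbps = rate_mbps;
	demo_table[free_slot].refcnt = 1;
	h->pp_id = (uint16_t)(free_slot + 1);
	return 0;
}

/* Release one handle; the entry becomes free when refcnt reaches 0. */
static void demo_pp_free(struct demo_pp_handle *h)
{
	if (h->pp_id == 0)
		return;
	demo_table[h->pp_id - 1].refcnt--;
	h->pp_id = 0;
}
```

In this model, freeing one queue's handle does not disturb another queue
that uses the same rate, which is why a per-queue handle is kept for
cleanup even though the pp_id is shared.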
Patch 05/10 (set rate):
- Fix obj->sq vs obj->sq_obj.sq: use obj->sq_obj.sq from the start
for non-hairpin queues (was introduced in patch 07 in v3, breaking
git bisect)
- Move all variable declarations to block top (sq_devx,
new_rate_limit)
- Add queue state check: reject set_queue_rate_limit if queue is not
STARTED (SQ not in RDY state)
- Update mlx5 feature matrix: Rate limitation = Y
- Add Per-Queue Tx Rate Limiting documentation section in mlx5.rst
covering DevX requirement, hardware support, rate table sharing,
and testpmd usage
Patch 06/10 (burst devargs):
- Remove burst_upper_bound/typical_packet_size from Clock Queue
path (mlx5_txpp_alloc_pp_index) — Clock Queue uses WQE rate
pacing and does not need these parameters
- Update commit message and documentation accordingly
Patch 07/10 (testpmd + PMD query):
- sq_obj.sq accessor change moved to patch 05 (see above)
- sq_devx declaration moved to block top
Patch 08/10 (ethdev getter) — split from v3 patch 08:
- Split into ethdev API (this patch) and mlx5 driver (patch 09)
- Add rte_eth_trace_get_queue_rate_limit() trace point matching
the existing setter pattern
Patch 09/10 — NEW (was part of v3 patch 08):
- mlx5 driver implementation of get_queue_rate_limit callback,
split out per Slava's request
Patch 10/10 (rate table query):
- Rename struct field "used" to "port_used" to clarify per-port
scope
- Strengthen Doxygen: rate table is a global shared HW resource
(firmware, kernel, other DPDK instances may consume entries);
port_used is a lower bound
- Document PP sharing behavior with flags=0
- Note that applications should aggregate across ports for
device-wide visibility
Changes since v2:
Addressed review feedback from Stephen Hemminger:
Patch 04: cleaned redundant cast parentheses on (struct mlx5dv_pp *)
Patch 04: consolidated dv_alloc_pp call onto one line
Patch 05+08: removed redundant queue_idx bounds checks from driver
callbacks — ethdev layer is the single validation point
Patch 07: added generic testpmd command: show port <id> queue <id> rate
Patch 08+10: removed release notes from release_26_03.rst (targets 26.07)
Patch 10: use MLX5_MEM_SYS | MLX5_MEM_ZERO for heap allocation
Patch 10: consolidated packet_pacing_rate_table_size onto one line
Changes since v1:
Patch 01: Acked-by Viacheslav Ovsiienko
Patch 04: rate bounds validation, uint64_t overflow fix, remove
early PP free
Patch 05: PP leak fix (temp struct pattern), rte_errno in error paths
Patch 07: inverted rte_eth_tx_queue_is_valid() check
Patch 10: stack array replaced with heap, per-port scope documented
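For readers unfamiliar with the "temp struct pattern" mentioned for
patch 05, the following standalone sketch shows one common reading of
it: allocate the new PP context into a temporary, apply it to the SQ,
and only then release the old context and commit. All demo_* names are
hypothetical stand-ins, and the SQ modify step is simulated; the actual
driver code may differ in detail.

```c
#include <errno.h>
#include <stdint.h>

struct demo_pp {
	uint32_t rate_mbps;
	int allocated;
};

static int demo_live_pp; /* number of PP contexts currently allocated */

static int demo_alloc(struct demo_pp *pp, uint32_t rate_mbps)
{
	pp->rate_mbps = rate_mbps;
	pp->allocated = 1;
	demo_live_pp++;
	return 0;
}

static void demo_free(struct demo_pp *pp)
{
	if (!pp->allocated)
		return;
	pp->allocated = 0;
	demo_live_pp--;
}

/* Stand-in for the SQ modify step; fails when fail is non-zero. */
static int demo_modify_sq(const struct demo_pp *pp, int fail)
{
	(void)pp;
	return fail ? -EIO : 0;
}

/* Change the rate of a queue without leaking or losing a PP context. */
static int demo_set_rate(struct demo_pp *cur, uint32_t rate_mbps, int fail)
{
	struct demo_pp tmp = { 0, 0 };
	int ret;

	ret = demo_alloc(&tmp, rate_mbps);
	if (ret != 0)
		return ret;
	ret = demo_modify_sq(&tmp, fail);
	if (ret != 0) {
		demo_free(&tmp); /* roll back: queue keeps its old context */
		return ret;
	}
	demo_free(cur); /* old context is no longer referenced by the SQ */
	*cur = tmp;     /* commit the new context */
	return 0;
}
```

On both the success and the failure path exactly one context stays
allocated per queue, so neither path leaks.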
Testing:
- Build: GCC, no warnings
- Hardware: ConnectX-6 Dx
- DevX path (default): set/get/disable rate limiting verified
- Verbs path (dv_flow_en=0): returns -EINVAL cleanly (SQ DevX
object not available), no crash
Vincent Jardin (10):
doc/nics/mlx5: fix stale packet pacing documentation
common/mlx5: query packet pacing rate table capabilities
common/mlx5: extend SQ modify to support rate limit update
net/mlx5: add per-queue packet pacing infrastructure
net/mlx5: support per-queue rate limiting
net/mlx5: add burst pacing devargs
net/mlx5: add testpmd command to query per-queue rate limit
ethdev: add getter for per-queue Tx rate limit
net/mlx5: implement per-queue Tx rate limit getter
net/mlx5: add rate table capacity query API
app/test-pmd/cmdline.c | 69 ++++++++++
doc/guides/nics/features/mlx5.ini | 1 +
doc/guides/nics/mlx5.rst | 180 ++++++++++++++++++++++-----
drivers/common/mlx5/mlx5_devx_cmds.c | 23 ++++
drivers/common/mlx5/mlx5_devx_cmds.h | 14 ++-
drivers/common/mlx5/mlx5_prm.h | 7 ++
drivers/net/mlx5/mlx5.c | 46 +++++++
drivers/net/mlx5/mlx5.h | 13 ++
drivers/net/mlx5/mlx5_testpmd.c | 93 ++++++++++++++
drivers/net/mlx5/mlx5_tx.c | 104 +++++++++++++++-
drivers/net/mlx5/mlx5_tx.h | 5 +
drivers/net/mlx5/mlx5_txpp.c | 84 +++++++++++++
drivers/net/mlx5/mlx5_txq.c | 149 ++++++++++++++++++++++
drivers/net/mlx5/rte_pmd_mlx5.h | 74 +++++++++++
lib/ethdev/ethdev_driver.h | 7 ++
lib/ethdev/ethdev_trace.h | 9 ++
lib/ethdev/ethdev_trace_points.c | 3 +
lib/ethdev/rte_ethdev.c | 35 ++++++
lib/ethdev/rte_ethdev.h | 24 ++++
19 files changed, 906 insertions(+), 33 deletions(-)
--
2.43.0