In mlx5 PMD, there are multiple Tx burst functions:
  mlx5_tx_burst()
  mlx5_tx_burst_mpw()
  mlx5_tx_burst_mpw_inline()
  mlx5_tx_burst_empw()
  mlx5_tx_burst_raw_vec()
  mlx5_tx_burst_vec()

To provide a better user experience and the best out-of-box performance,
these need to be consolidated. There will be only one non-vector function.
As mlx5_tx_burst_vec() calls mlx5_tx_burst_raw_vec(), the vector functions
will not change.

Multiple Tx burst functions exist because newer devices have enhanced
features that improve throughput by further saving PCIe bandwidth. As new
features (e.g. Tx packet inlining) appeared, new Tx burst functions were
added incrementally to support the new types of Tx descriptors. However,
the problem with selecting a Tx burst function statically is that, although
newer devices support all descriptor types including the legacy ones, a new
function does not fall back to the old modes.

Another issue is that it is very hard to introduce a new feature on the Tx
path. For example, mlx5 supports TSO, but currently only in the basic
mlx5_tx_burst(). TSO support could have been added to the other Tx burst
functions, but adding the same code in multiple locations is painful and is
a bad idea from a maintenance perspective. As a result, a user who requires
TSO cannot also get the best performance from the mlx5 PMD.

The consolidated Tx burst function will be all-inclusive. It will support
all types of Tx descriptors (WQE) and HW offloads. The WQE type for each
transmitted packet will be determined dynamically. The decision to inline a
packet will be made by sensing a PCIe bottleneck. The selection between the
consolidated function and the existing vector function will still be made
at configuration time, but the CPU architecture will also be taken into
account.

Signed-off-by: Yongseok Koh <ys...@mellanox.com>