On Wed, Sep 13, 2017 at 01:50:39PM +0300, Shahaf Shuler wrote:
> Mellanox NICs has a limitation on the number of mbuf segments a multi
> segment mbuf can have. The max number depends on the Tx offloads requested.
>
> The current code not enforce such limitation, which might cause
> malformed work requests to be written to the device.
>
> This commit adds verification for the number of mbuf segments posted
> to the device. In case of overflow the packet will not be sent.
>
> In addition update the nic documentation with the limitation.
> Considering device limitation is 63 data segments in a work request, the
> maximum number of segment in mbuf was calculated taking TSO as the worst
> case:
>
> max_nb_segs = 63 - (control_segment + ethernet segment +
>                     TSO headers inline + inline segment +
>                     extra inline to align to cacheline)
>
> Cc: sta...@dpdk.org
>
> Signed-off-by: Shahaf Shuler <shah...@mellanox.com>
> ---
>  doc/guides/nics/mlx5.rst             |  2 ++
>  drivers/net/mlx5/mlx5_defs.h         |  3 ++-
>  drivers/net/mlx5/mlx5_prm.h          |  3 +++
>  drivers/net/mlx5/mlx5_rxtx.c         |  4 ++++
>  drivers/net/mlx5/mlx5_rxtx_vec_sse.c |  5 +++++
>  drivers/net/mlx5/mlx5_txq.c          | 27 +++++++++++++++++++++++++++
>  6 files changed, 43 insertions(+), 1 deletion(-)
>
> diff --git a/doc/guides/nics/mlx5.rst b/doc/guides/nics/mlx5.rst
> index f4cb18bca..d8244de97 100644
> --- a/doc/guides/nics/mlx5.rst
> +++ b/doc/guides/nics/mlx5.rst
> @@ -124,6 +124,8 @@ Limitations
>
>       Will match any ipv4 packet (VLAN included).
>
> +- A multi segment mbuf must have less than 50 segments. That means
> mbuf->nb_segs < 50.

Isn't it better to use either "multiple segment packet" or "multi-segment
packet"? Also, more information might be needed here. If MPW/eMPW is enabled,
the code restricts the maximum number of segments to MLX5_MPW_DSEG_MAX (5).
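
To illustrate the point, here is a rough sketch (not code from the driver) of
the two limits as I read them from this patch. The helper tx_segs_ok() and the
macro DOC_NB_SEGS_LIMIT are hypothetical names made up for the example, and
treating "MPW/eMPW enabled" as a single flag is a simplifying assumption; only
MLX5_MPW_DSEG_MAX and the values 5 and 50 come from the patch and the comment
above.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Per-packet segment limit checked in txq_scatter_v() by this patch. */
#define MLX5_MPW_DSEG_MAX 5
/* Hypothetical name: the documented limit, mbuf->nb_segs < 50. */
#define DOC_NB_SEGS_LIMIT 50

/*
 * Hypothetical helper, not part of the driver: tell whether a packet
 * with nb_segs segments fits the limit of the given Tx path.
 * Assumption: mpw_enabled stands for "the MPW/eMPW scatter path is used".
 */
static bool
tx_segs_ok(uint16_t nb_segs, bool mpw_enabled)
{
	assert(nb_segs >= 1); /* an mbuf chain has at least one segment */
	if (mpw_enabled)
		return nb_segs <= MLX5_MPW_DSEG_MAX;
	return nb_segs < DOC_NB_SEGS_LIMIT;
}
```

Under this reading, a 6-segment packet passes on the regular path but is
rejected when MPW/eMPW is on, which is why I think the documentation should
mention both cases.
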
> +
>  Configuration
>  -------------
>
> diff --git a/drivers/net/mlx5/mlx5_defs.h b/drivers/net/mlx5/mlx5_defs.h
> index a76bc6f65..3de0e5d81 100644
> --- a/drivers/net/mlx5/mlx5_defs.h
> +++ b/drivers/net/mlx5/mlx5_defs.h
> @@ -100,7 +100,8 @@
>
>  /*
>   * Maximum size of burst for vectorized Tx. This is related to the maximum
> size
> - * of Enhaned MPW (eMPW) WQE as vectorized Tx is supported with eMPW.
> + * of Enhanced MPW (eMPW) WQE as vectorized Tx is supported with eMPW.
> + * Careful when changing, large value can cause wqe DS to overlap.

wqe -> WQE.

>   */
>  #define MLX5_VPMD_TX_MAX_BURST 32U
>
> diff --git a/drivers/net/mlx5/mlx5_prm.h b/drivers/net/mlx5/mlx5_prm.h
> index 608072f7e..bc2b72333 100644
> --- a/drivers/net/mlx5/mlx5_prm.h
> +++ b/drivers/net/mlx5/mlx5_prm.h
> @@ -154,6 +154,9 @@
>  /* Default mark value used when none is provided. */
>  #define MLX5_FLOW_MARK_DEFAULT 0xffffff
>
> +/* Maximum number of DS in WQE. */
> +#define MLX5_MAX_DS 63

How about making it consistent with MLX5_MPW_DSEG_MAX by naming it
MLX5_DSEG_MAX?

> +
>  /* Subset of struct mlx5_wqe_eth_seg.
>   */
>  struct mlx5_wqe_eth_seg_small {
>      uint32_t rsvd0;
> diff --git a/drivers/net/mlx5/mlx5_rxtx.c b/drivers/net/mlx5/mlx5_rxtx.c
> index 7567f2329..fdd7067da 100644
> --- a/drivers/net/mlx5/mlx5_rxtx.c
> +++ b/drivers/net/mlx5/mlx5_rxtx.c
> @@ -661,6 +661,10 @@ mlx5_tx_burst(void *dpdk_txq, struct rte_mbuf **pkts,
> uint16_t pkts_n)
>          else
>              j += sg;
>  next_pkt:
> +        if (ds > MLX5_MAX_DS) {
> +            txq->stats.oerrors++;
> +            break;
> +        }
>          ++elts_head;
>          ++pkts;
>          ++i;
> diff --git a/drivers/net/mlx5/mlx5_rxtx_vec_sse.c
> b/drivers/net/mlx5/mlx5_rxtx_vec_sse.c
> index f89762ff8..3583e6780 100644
> --- a/drivers/net/mlx5/mlx5_rxtx_vec_sse.c
> +++ b/drivers/net/mlx5/mlx5_rxtx_vec_sse.c
> @@ -248,6 +248,10 @@ txq_scatter_v(struct txq *txq, struct rte_mbuf **pkts,
> uint16_t pkts_n)
>          if (segs_n == 1 ||
>              max_elts < segs_n || max_wqe < 2)
>              break;
> +        if (segs_n > MLX5_MPW_DSEG_MAX) {
> +            txq->stats.oerrors++;
> +            break;
> +        }
>          wqe = &((volatile struct mlx5_wqe64 *)
>               txq->wqes)[wqe_ci & wq_mask].hdr;
>          if (buf->ol_flags &
> @@ -365,6 +369,7 @@ txq_burst_v(struct txq *txq, struct rte_mbuf **pkts,
> uint16_t pkts_n,
>      max_elts = (elts_n - (elts_head - txq->elts_tail));
>      max_wqe = (1u << txq->wqe_n) - (txq->wqe_ci - txq->wqe_pi);
>      pkts_n = RTE_MIN((unsigned int)RTE_MIN(pkts_n, max_wqe), max_elts);
> +    assert(pkts_n <= MLX5_MAX_DS - nb_dword_in_hdr);
>      if (unlikely(!pkts_n))
>          return 0;
>      elts = &(*txq->elts)[elts_head & elts_m];
> diff --git a/drivers/net/mlx5/mlx5_txq.c b/drivers/net/mlx5/mlx5_txq.c
> index 4b0b532b1..091b1a93d 100644
> --- a/drivers/net/mlx5/mlx5_txq.c
> +++ b/drivers/net/mlx5/mlx5_txq.c
> @@ -288,6 +288,8 @@ txq_ctrl_setup(struct rte_eth_dev *dev, struct txq_ctrl
> *txq_ctrl,
>          .comp_mask = IBV_EXP_QP_INIT_ATTR_PD,
>      };
>      if (priv->txq_inline && (priv->txqs_n >= priv->txqs_inline)) {
> +        unsigned int ds_cnt;
> +
>          tmpl.txq.max_inline =
>              ((priv->txq_inline + (RTE_CACHE_LINE_SIZE - 1)) /
>               RTE_CACHE_LINE_SIZE);
> @@ -320,6 +322,31 @@
> txq_ctrl_setup(struct rte_eth_dev *dev, struct txq_ctrl
> *txq_ctrl,
>              attr.init.cap.max_inline_data =
>                  tmpl.txq.max_inline * RTE_CACHE_LINE_SIZE;
>          }
> +        /*
> +         * Check if the inline size is too large in a way which
> +         * can make the wqe DS to overflow.

wqe -> WQE.

> +         * Considering in calculation:
> +         *      WQE CTRL (1 DS)
> +         *      WQE ETH  (1 DS)
> +         *      inline part (N DS)

inline -> Inline ?

> +         */
> +        ds_cnt = 2 +
> +            (attr.init.cap.max_inline_data / MLX5_WQE_DWORD_SIZE);
> +        if (ds_cnt > MLX5_MAX_DS) {
> +            unsigned int max_inline = (MLX5_MAX_DS - 2) *
> +                                      MLX5_WQE_DWORD_SIZE;
> +
> +            /* Ceil down*/

Missing space and period. Alternatively, this comment may be unnecessary since
the following code is obvious, or you might want to explain why you align it.

> +            max_inline = max_inline - (max_inline %
> +                         RTE_CACHE_LINE_SIZE);
> +            WARN("txq inline is too large (%d) setting it to "
> +                 "the maximum possible: %d\n",
> +                 priv->txq_inline, max_inline);
> +            tmpl.txq.max_inline = max_inline / RTE_CACHE_LINE_SIZE;
> +            attr.init.cap.max_inline_data = max_inline;
> +            if (priv->mps == MLX5_MPW_ENHANCED)
> +                tmpl.txq.inline_max_packet_sz = max_inline;

No need to set inline_max_packet_sz. inline_max_packet_sz limits the maximum
size of a packet that can be inlined in eMPW mode. As long as txq->max_inline
is correctly set, txq->inline_max_packet_sz doesn't affect the total number of
DSEGs in a WQE.

Thanks,
Yongseok