On Wed, Jul 05, 2017 at 10:21:26AM +0200, Nélio Laranjeiro wrote: > On Tue, Jul 04, 2017 at 05:38:44PM -0700, Yongseok Koh wrote: > > On Tue, Jul 04, 2017 at 10:58:52AM +0200, Nélio Laranjeiro wrote: > > > Yongseok, some comments in this huge and great work, > > > > > > On Fri, Jun 30, 2017 at 12:23:33PM -0700, Yongseok Koh wrote: > > > > To make vectorized burst routines enabled, it is required to run on > > > > x86_64 > > > > architecture which can support at least SSE4.1. If all the conditions > > > > are > > > > met, the vectorized burst functions are enabled automatically. The > > > > decision > > > > is made individually on RX and TX. There's no PMD option to make a > > > > selection. > > > > > > > > Signed-off-by: Yongseok Koh <ys...@mellanox.com> > > > > --- > > > > drivers/net/mlx5/Makefile | 10 + > > > > drivers/net/mlx5/mlx5_defs.h | 18 + > > > > drivers/net/mlx5/mlx5_ethdev.c | 28 +- > > > > drivers/net/mlx5/mlx5_rxq.c | 55 +- > > > > drivers/net/mlx5/mlx5_rxtx.c | 339 ++------ > > > > drivers/net/mlx5/mlx5_rxtx.h | 283 ++++++- > > > > drivers/net/mlx5/mlx5_rxtx_vec_sse.c | 1451 > > > > ++++++++++++++++++++++++++++++++++ > > > > drivers/net/mlx5/mlx5_txq.c | 2 +- > > > > 8 files changed, 1909 insertions(+), 277 deletions(-) > > > > create mode 100644 drivers/net/mlx5/mlx5_rxtx_vec_sse.c > > > > > > > > diff --git a/drivers/net/mlx5/mlx5_rxtx.h b/drivers/net/mlx5/mlx5_rxtx.h > > > > index 51e258a15..2d0894fcd 100644 > > > > --- a/drivers/net/mlx5/mlx5_rxtx.h > > > > +++ b/drivers/net/mlx5/mlx5_rxtx.h > > [...] > > > > + txq_complete(txq); > > > > + /* A CQE slot must always be available. */ > > > > + assert((1u << txq->cqe_n) - (txq->cq_pi - txq->cq_ci)); > > > > > > This assert should be moved to the txq_complete(), or it should not be > > > an assert. > > txq_complete() is a common function, so this can't force to spare at least > > one > > slot in a completion queue. This assert is to force to allocate enough CQE > > slots > > by accurate calculation as completion is suppressed by MLX5_TX_COMP_THRESH. > > If > > the CQ size is well defined (e.g. size of Tx ring / MLX5_TX_COMP_THRESH), it > > doesn't need to check deficiency of CQ slot but checking slots in Tx ring > > (max_elts) is sufficient. If you are okay with this, please let me know, > > then > > I'll send out v3. > > Just using your comment... > /* A CQE slot must always be available. */ > > This is always true in any Tx function where this assumption should be > true to avoid testing it before posting a completion request and thus > avoid cycles waste and this independently of the computation for the CQ > ring size. > > As it is an assert it is only present to help the developer in error > code he developed (or for a user to point an issue in the code), this > can be any-where in the Tx data path. It becomes useful for any Tx > function using this txq_complete(). > > If this assert is only related to your code, it may means it should be > an if which avoids to post a completion request when no slots are > available. I think I was wrong about the assert. If the Tx ring is full (max_elts == 0), then there won't be any empty slot left in CQ. I'll remove the two asserts.
Thanks Yongseok