Hi, Jiawei > -----Original Message----- > From: Jiawei Zhu <17826875...@163.com> > Sent: Monday, March 22, 2021 17:46 > To: dev@dpdk.org > Cc: zhujiawe...@huawei.com; Matan Azrad <ma...@nvidia.com>; Shahaf > Shuler <shah...@nvidia.com>; Slava Ovsiienko <viachesl...@nvidia.com> > Subject: [PATCH] net/mlx5: add Rx checksum offload flag return bad > > From: Jiawei Zhu <zhujiawe...@huawei.com> > > When open the rx checksum offload and receive the wrong checksum, add > the ol_flags return bad. And it's not best to use multiplication and division > here.
I'm sorry, there should be no any multiplications and divisions - the arguments are constants (can be determined at compilation time) and ara power of 2, hence compiler engages simple shifts. For example (I did rxq_cq_to_ol_flags not inline to get the listing for x86-64): 29 rxq_cq_to_ol_flags: 21 /* 22 * An architecture-optimized byte swap for a 16-bit value. 23 * 24 * Do not use this function directly. The preferred function is rte_bswap16(). 25 */ 26 static inline uint16_t rte_arch_bswap16(uint16_t _x) 27 { 28 uint16_t x = _x; 29 0034 86D6 asm volatile ("xchgb %b[x1],%h[x2]" 30 : [x1] "=Q" (x) 37 38 0036 89D0 movl %edx,%eax 39 0038 6681E200 andw $512,%dx 39 02 40 003d 66250004 andw $1024,%ax 41 0041 0FB7D2 movzwl %dx,%edx 42 0044 0FB7C0 movzwl %ax,%eax 43 0047 48C1EA02 shrq $2,%rdx 44 004b 48C1E802 shrq $2,%rax 45 004f 09D0 orl %edx,%eax 46 0051 C3 ret As we can see - there is no any mul/div and no any branches, that improves the performance. > > Signed-off-by: Jiawei Zhu <zhujiawe...@huawei.com> > --- > drivers/net/mlx5/mlx5_rxtx.c | 17 ++++++++++------- > drivers/net/mlx5/mlx5_utils.h | 6 ------ > 2 files changed, 10 insertions(+), 13 deletions(-) > > diff --git a/drivers/net/mlx5/mlx5_rxtx.c b/drivers/net/mlx5/mlx5_rxtx.c > index e3ce9fd..9233af8 100644 > --- a/drivers/net/mlx5/mlx5_rxtx.c > +++ b/drivers/net/mlx5/mlx5_rxtx.c > @@ -1325,13 +1325,16 @@ enum mlx5_txcmp_code { > uint32_t ol_flags = 0; > uint16_t flags = rte_be_to_cpu_16(cqe->hdr_type_etc); > > - ol_flags = > - TRANSPOSE(flags, > - MLX5_CQE_RX_L3_HDR_VALID, > - PKT_RX_IP_CKSUM_GOOD) | > - TRANSPOSE(flags, > - MLX5_CQE_RX_L4_HDR_VALID, > - PKT_RX_L4_CKSUM_GOOD); > + if (flags & MLX5_CQE_RX_L3_HDR_VALID) > + ol_flags |= PKT_RX_IP_CKSUM_GOOD; > + else > + ol_flags |= PKT_RX_IP_CKSUM_BAD; > + If MLX5_CQE_RX_L3_HDR_VALID is not set - it does not always mean the sum is bad. If might happen if HW just did not recognize the packet format (for example, for some tunnels) With best regards, Slava > + if (flags & MLX5_CQE_RX_L4_HDR_VALID) > + ol_flags |= PKT_RX_IP_CKSUM_GOOD; > + else > + ol_flags |= PKT_RX_IP_CKSUM_BAD; > + > return ol_flags; > } > > diff --git a/drivers/net/mlx5/mlx5_utils.h b/drivers/net/mlx5/mlx5_utils.h > index 7a62187..2f71a23 100644 > --- a/drivers/net/mlx5/mlx5_utils.h > +++ b/drivers/net/mlx5/mlx5_utils.h > @@ -44,12 +44,6 @@ > #define NB_SEGS(m) ((m)->nb_segs) > #define PORT(m) ((m)->port) > > -/* Transpose flags. Useful to convert IBV to DPDK flags. */ -#define > TRANSPOSE(val, from, to) \ > - (((from) >= (to)) ? \ > - (((val) & (from)) / ((from) / (to))) : \ > - (((val) & (from)) * ((to) / (from)))) > - > /* > * For the case which data is linked with sequence increased index, the > * array table will be more efficiect than hash table once need to serarch > -- > 1.8.3.1 >