> -----Original Message-----
> From: Ruifeng Wang
> Sent: Wednesday, June 29, 2022 7:41 PM
> To: Slava Ovsiienko <viachesl...@nvidia.com>; Ali Alnubani <alia...@nvidia.com>; Matan Azrad <ma...@nvidia.com>
> Cc: dev@dpdk.org; Honnappa Nagarahalli <honnappa.nagaraha...@arm.com>; sta...@dpdk.org; nd <n...@arm.com>; nd <n...@arm.com>; nd <n...@arm.com>
> Subject: RE: [PATCH] net/mlx5: fix risk in Rx descriptor read in NEON vector path
>
> > -----Original Message-----
> > From: Slava Ovsiienko <viachesl...@nvidia.com>
> > Sent: Wednesday, June 29, 2022 3:55 PM
> > To: Ruifeng Wang <ruifeng.w...@arm.com>; Ali Alnubani <alia...@nvidia.com>; Matan Azrad <ma...@nvidia.com>
> > Cc: dev@dpdk.org; Honnappa Nagarahalli <honnappa.nagaraha...@arm.com>; sta...@dpdk.org; nd <n...@arm.com>; nd <n...@arm.com>
> > Subject: RE: [PATCH] net/mlx5: fix risk in Rx descriptor read in NEON vector path
> >
> > Hi, Ruifeng
> >
> > > -----Original Message-----
> > > From: Ruifeng Wang <ruifeng.w...@arm.com>
> > > Sent: Monday, June 27, 2022 14:08
> > > To: Slava Ovsiienko <viachesl...@nvidia.com>; Ali Alnubani <alia...@nvidia.com>; Matan Azrad <ma...@nvidia.com>
> > > Cc: dev@dpdk.org; Honnappa Nagarahalli <honnappa.nagaraha...@arm.com>; sta...@dpdk.org; nd <n...@arm.com>; nd <n...@arm.com>
> > > Subject: RE: [PATCH] net/mlx5: fix risk in Rx descriptor read in NEON vector path
> > >
> > > > -----Original Message-----
> > > > From: Slava Ovsiienko <viachesl...@nvidia.com>
> > > > Sent: Monday, June 20, 2022 1:38 PM
> > > > To: Ali Alnubani <alia...@nvidia.com>; Ruifeng Wang <ruifeng.w...@arm.com>; Matan Azrad <ma...@nvidia.com>
> > > > Cc: dev@dpdk.org; Honnappa Nagarahalli <honnappa.nagaraha...@arm.com>; sta...@dpdk.org; nd <n...@arm.com>
> > > > Subject: RE: [PATCH] net/mlx5: fix risk in Rx descriptor read in NEON vector path
> > > >
> > > > Hi, Ruifeng
> > >
> > > Hi Slava,
> > >
> > > Thanks for your review.
> > >
> > > > My apologies for the review delay.
> > >
> > > Apologies too. I was on something else.
> > >
> > > > As far as I understand, the hypothetical problem scenario is:
> > > > - the CPU core reorders the reads of the qwords of a 16B vector
> > > > - the core reads the second 8B of the CQE (old CQE values)
> > > > - the CQE is updated
> > > > - the core reads the first 8B of the CQE (new CQE values)
> > >
> > > Yes, this is the problem.
> > >
> > > > How can the re-reading of CQEs resolve the issue?
> > > > This wrong scenario might happen on the second read and we would
> > > > run into the same issue.
> > >
> > > Here we are trying to order the reads of a 16B vector (the 8B with
> > > op_own - high, and the 8B without op_own - low).
> > > The first read will load 16B. The second read will load and update the
> > > low 8B (no op_own).
> >
> > OK, I got the point, thank you for the explanations.
> > Can we avoid the first read of the low 8B (not containing the CQE owning field)?
> >
> > I mean to update this part to read only the upper 8Bs:
> > /* B.0 (CQE 3) load a block having op_own. */
> > c3 = vld1q_u64((uint64_t *)(p3 + 48));
> > /* B.0 (CQE 2) load a block having op_own. */
> > c2 = vld1q_u64((uint64_t *)(p2 + 48));
> > /* B.0 (CQE 1) load a block having op_own. */
> > c1 = vld1q_u64((uint64_t *)(p1 + 48));
> > /* B.0 (CQE 0) load a block having op_own. */
> > c0 = vld1q_u64((uint64_t *)(p0 + 48));
> > /* Synchronize for loading the rest of blocks. */
> > rte_io_rmb();
> >
> > Because the lower 8Bs will be overwritten by the second read (in your
> > patch), and the barrier ensures the correct order.
>
> Hi Slava,
>
> Yes, your suggestion is valid.
> Actually, I tried that approach: load the higher 8B + barrier + load the lower 8B + combine the
> two 8Bs into a vector.
> It also has no observable performance impact, but it generates more instructions than
> the current patch (the 'combine' operation).
> So I followed the current approach.
>
> Thanks.

Hi Slava,
Are there any further comments?

Thanks,
Ruifeng
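
For illustration, below is a minimal, self-contained sketch of the two load orderings compared in the thread above. It is not the mlx5 driver code itself: the function names, the block pointer parameter (assumed to point at offset 48 of a 64-byte CQE, so that op_own sits in the upper 8B), and the exact choice of intrinsics are assumptions made for the example; only the load/barrier ordering is the point.

    /* Sketch only - assumed names and layout, not the driver sources. */
    #include <stdint.h>
    #include <arm_neon.h>
    #include <rte_io.h>

    /*
     * Ordering used by the patch as described in the thread: load the
     * whole 16B block, issue a read barrier, then re-load the lower 8B
     * (the half without op_own) into lane 0.  The final value of the
     * lower half is therefore read after the op_own half.
     */
    static inline uint64x2_t
    cqe_block_load_reload(const uint64_t *block)
    {
        uint64x2_t c;

        c = vld1q_u64(block);           /* first read: both halves   */
        rte_io_rmb();                   /* order the re-read below   */
        c = vld1q_lane_u64(block, c, 0);/* refresh lower 8B (lane 0) */
        return c;
    }

    /*
     * Alternative Ruifeng mentions trying: load only the op_own half,
     * barrier, load the other half, then combine.  Same guarantee, but
     * it costs the extra combine instruction.
     */
    static inline uint64x2_t
    cqe_block_load_combine(const uint64_t *block)
    {
        uint64x1_t hi, lo;

        hi = vld1_u64(block + 1);       /* upper 8B, carries op_own  */
        rte_io_rmb();                   /* lower half read not earlier */
        lo = vld1_u64(block);           /* lower 8B, no op_own       */
        return vcombine_u64(lo, hi);    /* lane 0 = low, lane 1 = high */
    }

The extra vcombine_u64 in the second variant is the "more instructions" trade-off noted in the thread; both variants rely on rte_io_rmb() to keep the non-op_own half from being observed with stale contents.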