Hi Ferruh, > > > -----Original Message----- > > > From: Yigit, Ferruh <ferruh.yi...@intel.com> > > > Sent: Tuesday, October 12, 2021 12:27 AM > > > To: Honnappa Nagarahalli <honnappa.nagaraha...@arm.com>; Ruifeng > > Wang > > > <ruifeng.w...@arm.com>; dev@dpdk.org; Min Hu (Connor) > > > <humi...@huawei.com>; Yisen Zhuang <yisen.zhu...@huawei.com>; > > Lijun Ou > > > <ouli...@huawei.com> > > > Cc: Xing, Beilei <beilei.x...@intel.com>; Zhang, Qi Z > > > <qi.z.zh...@intel.com>; Richardson, Bruce > > > <bruce.richard...@intel.com>; jer...@marvell.com; > > > hemant.agra...@nxp.com; d...@linux.vnet.ibm.com; sta...@dpdk.org; > nd > > > <n...@arm.com>; humi...@huawei.com > > > Subject: Re: [dpdk-stable] [PATCH v2 2/2] net/i40e: fix risk in Rx > > > descriptor read in scalar path > > > > > > On 9/29/2021 4:29 PM, Honnappa Nagarahalli wrote: > > > > <snip> > > > >> > > > >> On 9/15/2021 9:33 AM, Ruifeng Wang wrote: > > > >>> Rx descriptor is 16B/32B in size. If the DD bit is set, it > > > >>> indicates that the rest of the descriptor words have valid values. > > > >>> Hence, the word containing DD bit must be read first before > > > >>> reading the rest of the descriptor words. > > > >>> > > > >>> Since the entire descriptor is not read atomically, on relaxed > > > >>> memory ordered systems like Aarch64, read of the word containing > > > >>> DD field could be reordered after read of other words. > > > >>> > > > >>> Read barrier is inserted between read of the word with DD field > > > >>> and read of other words. The barrier ensures that the fetched > > > >>> data is correct. > > > >>> > > > >>> Testpmd single core test showed no performance drop on x86 or > > N1SDP. > > > >>> On ThunderX2, 22% performance regression was observed. > > > >>> > > > >> > > > >> Is 22% performance drop value correct? That is a big drop, is it > > acceptable? > > > > Agree, it is a big drop. Fixing it will require using the barrier less > frequently. > > > For ex: read 4 descriptors (4 words containing the DD bits) before > > > using the barrier. > > > > > > > >> > > > >> Is this performance drop valid for all Arm scalar datapath, or is > > > >> it specific to ThunderX2? > > > > This is specific to ThunderX2. N1 CPU does not see any impact. A72 > > > > is not > > > tested. Considering that the ThunderXx line of CPUs are not in > > > further development, and it is scalar path, I would not suggest to > > > make further changes to the code. > > > > > > > > It would be good to test this on Kunpeng servers and get some > feedback. > > > > > > Hi Connor, Yisen, Lijun, > > > > > > Can you please check this patch? I don't know if you are using i40e > > > nic on your platform but if you do can you please test it? > > > > > > Overall this patch cause a big performance drop on Arm for i40e, I > > > just want to be sure this is not impacting any user negatively. I cannot speak for vendors. But my test on a Huawei aarch64 server showed no performance drop. NIC in use is XXV710. Just FYI.
Thanks, Ruifeng > > > > Folks: > > This patch has been dropped from dpdk-next-net-intel, as still > > waiting for your confirm. > > Btw Patch 1/2 was still in dpdk-next-net-intel. > > Thanks > > Qi > > > Hi Qi, Ferruh, > > Do you have any suggestion on how to progress this patch? > It is fixing possible violation of hardware access from architecture point of > view. > Negative performance impact may happen because barriers are added. > I don't think we received objections until now. > > Thanks, > Ruifeng > > > > > > > > > > >> > > > >>> Fixes: 7b0cf70135d1 ("net/i40e: support ARM platform") > > > >>> Cc: sta...@dpdk.org > > > >>> > > > >>> Signed-off-by: Ruifeng Wang <ruifeng.w...@arm.com> > > > >>> Reviewed-by: Honnappa Nagarahalli > <honnappa.nagaraha...@arm.com> > > > >