On 9/15/2021 9:33 AM, Ruifeng Wang wrote:
> Rx descriptor is 16B/32B in size. If the DD bit is set, it indicates
> that the rest of the descriptor words have valid values. Hence, the
> word containing DD bit must be read first before reading the rest of
> the descriptor words.
>
> Since the entire descriptor is not read atomically, on relaxed memory
> ordered systems like Aarch64, read of the word containing DD field
> could be reordered after read of other words.
>
> Read barrier is inserted between read of the word with DD field
> and read of other words. The barrier ensures that the fetched data
> is correct.
>
> Testpmd single core test showed no performance drop on x86 or N1SDP.
> On ThunderX2, 22% performance regression was observed.
>
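
For reference, the ordering requirement described in the commit log can be sketched in portable C11 as below. This is only an illustration, not the i40e code from the patch: the struct layout, the DD bit position, and every demo_* name are hypothetical, and a real driver reads a descriptor that the NIC writes by DMA rather than a C11 atomic object.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Assumption: DD is bit 0 of the status word (illustrative only). */
#define DEMO_DD_BIT UINT64_C(1)

/* Hypothetical 16B descriptor: one payload word plus one status word. */
struct demo_rx_desc {
	uint64_t meta;            /* filled in by the producer before DD is set */
	_Atomic uint64_t status;  /* word that carries the DD bit */
};

/*
 * Read the status word first; only if DD is set, read the remaining word.
 * The acquire fence keeps the later load of 'meta' from being reordered
 * ahead of the load of 'status' on weakly ordered CPUs.
 */
static bool
demo_read_desc(struct demo_rx_desc *desc, uint64_t *meta_out,
	       uint64_t *status_out)
{
	assert(desc != NULL && meta_out != NULL && status_out != NULL);

	uint64_t status = atomic_load_explicit(&desc->status,
					       memory_order_relaxed);
	if (!(status & DEMO_DD_BIT))
		return false;     /* descriptor not done, nothing else is read */

	/* Read barrier between the DD word and the other descriptor words. */
	atomic_thread_fence(memory_order_acquire);

	*meta_out = desc->meta;
	*status_out = status;
	return true;
}
```

With common compilers an acquire fence like this typically emits no instruction on x86 and a load barrier on AArch64, so any cost would be expected to show up only on the Arm side; how large that cost is depends on the core.
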

Is the 22% performance drop figure correct? That is a big drop; is it
acceptable? Does this drop apply to the Arm scalar datapath in general,
or is it specific to ThunderX2?

> Fixes: 7b0cf70135d1 ("net/i40e: support ARM platform")
> Cc: sta...@dpdk.org
>
> Signed-off-by: Ruifeng Wang <ruifeng.w...@arm.com>
> Reviewed-by: Honnappa Nagarahalli <honnappa.nagaraha...@arm.com>