Hi Chengwen,

> -----Original Message-----
> From: Ma, WenwuX
> Sent: Friday, March 15, 2024 2:26 PM
> To: fengchengwen <fengcheng...@huawei.com>; dev@dpdk.org
> Cc: Jiale, SongX <songx.ji...@intel.com>; sta...@dpdk.org
> Subject: RE: [PATCH v2] dmadev: fix structure alignment
>
> Hi Chengwen,
>
> > -----Original Message-----
> > From: fengchengwen <fengcheng...@huawei.com>
> > Sent: Friday, March 15, 2024 2:06 PM
> > To: Ma, WenwuX <wenwux...@intel.com>; dev@dpdk.org
> > Cc: Jiale, SongX <songx.ji...@intel.com>; sta...@dpdk.org
> > Subject: Re: [PATCH v2] dmadev: fix structure alignment
> >
> > Hi Wenwu,
> >
> > On 2024/3/15 9:43, Wenwu Ma wrote:
> > > The structure rte_dma_dev needs only 8 byte alignment.
> > > This patch replaces __rte_cache_aligned of rte_dma_dev with
> > > __rte_aligned(8).
> > >
> > > Fixes: b36970f2e13e ("dmadev: introduce DMA device library")
> > > Cc: sta...@dpdk.org
> > >
> > > Signed-off-by: Wenwu Ma <wenwux...@intel.com>
> > > ---
> > > v2:
> > >  - Because of performance drop, adjust the code to
> > >    no longer demand cache line alignment
> >
> > Which two versions observed performance drop? And which benchmark
> > observed drop?
> > Could you provide more information?
> >
>
> V1 patch:
> https://patches.dpdk.org/project/dpdk/patch/20240308053711.1260154-1-wenwux...@intel.com/
>
> To view detailed results, visit:
> https://lab.dpdk.org/results/dashboard/patchsets/29472/
>
> > > ---
> > >  lib/dmadev/rte_dmadev_pmd.h | 2 +-
> > >  1 file changed, 1 insertion(+), 1 deletion(-)
> > >
> > > diff --git a/lib/dmadev/rte_dmadev_pmd.h b/lib/dmadev/rte_dmadev_pmd.h
> > > index 58729088ff..b569bb3502 100644
> > > --- a/lib/dmadev/rte_dmadev_pmd.h
> > > +++ b/lib/dmadev/rte_dmadev_pmd.h
> > > @@ -122,7 +122,7 @@ enum rte_dma_dev_state {
> > >   * @internal
> > >   * The generic data structure associated with each DMA device.
> > >   */
> > > -struct __rte_cache_aligned rte_dma_dev {
> > > +struct __rte_aligned(8) rte_dma_dev {
> >
> > The DMA fast-path was implemented by struct rte_dma_fp_objs, which is
> > not rte_dma_dev? So why is it a problem here?
> >
> > Thanks
> >
> The DMA device object is expected to be cache-line aligned, so clang will
> use the “vmovaps” assembly instruction.
>
> That instruction demands 16-byte alignment; otherwise it causes a
> segmentation fault in some environments.
>

Test case:

1. Compile DPDK

   rm -rf x86_64-native-linuxapp-clang
   CC=clang meson -Denable_kmods=True -Dlibdir=lib --default-library=static x86_64-native-linuxapp-clang
   ninja -C x86_64-native-linuxapp-clang -j 72

2. Start dpdk-test

   /root/dpdk/x86_64-native-linuxapp-clang/app/dpdk-test -l 0-39 --vdev=dma_skeleton -a 31:00.0 -a 31:00.1 -a 31:00.2 -a 31:00.3

   (Note: If it cannot be reproduced, please try using a different core)

3. Exit dpdk-test

   RTE>>quit
   Segmentation fault (core dumped)
>
> > >      /** Device info which supplied during device initialization. */
> > >      struct rte_device *device;
> > >      struct rte_dma_dev_data *data; /**< Pointer to shared device data. */
> > >