Hi Wenwu, On 2024/3/15 15:44, Ma, WenwuX wrote: > Hi Chengwen, > >> -----Original Message----- >> From: Ma, WenwuX >> Sent: Friday, March 15, 2024 2:26 PM >> To: fengchengwen <fengcheng...@huawei.com>; dev@dpdk.org >> Cc: Jiale, SongX <songx.ji...@intel.com>; sta...@dpdk.org >> Subject: RE: [PATCH v2] dmadev: fix structure alignment >> >> Hi Chengwen, >> >>> -----Original Message----- >>> From: fengchengwen <fengcheng...@huawei.com> >>> Sent: Friday, March 15, 2024 2:06 PM >>> To: Ma, WenwuX <wenwux...@intel.com>; dev@dpdk.org >>> Cc: Jiale, SongX <songx.ji...@intel.com>; sta...@dpdk.org >>> Subject: Re: [PATCH v2] dmadev: fix structure alignment >>> >>> Hi Wenwu, >>> >>> On 2024/3/15 9:43, Wenwu Ma wrote: >>>> The structure rte_dma_dev needs only 8 byte alignment. >>>> This patch replaces __rte_cache_aligned of rte_dma_dev with >>>> __rte_aligned(8). >>>> >>>> Fixes: b36970f2e13e ("dmadev: introduce DMA device library") >>>> Cc: sta...@dpdk.org >>>> >>>> Signed-off-by: Wenwu Ma <wenwux...@intel.com> >>>> --- >>>> v2: >>>> - Because of performance drop, adjust the code to >>>> no longer demand cache line alignment >>> >>> Which two versions observed performance drop? And which benchmark >>> observed drop? >>> Could you provide more information? >>> >>>> >> V1 patch: >> https://patches.dpdk.org/project/dpdk/patch/20240308053711.1260154- >> 1-wenwux...@intel.com/ >> >> To view detailed results, visit: >> https://lab.dpdk.org/results/dashboard/patchsets/29472/ >> >>>> --- >>>> lib/dmadev/rte_dmadev_pmd.h | 2 +- >>>> 1 file changed, 1 insertion(+), 1 deletion(-) >>>> >>>> diff --git a/lib/dmadev/rte_dmadev_pmd.h >>> b/lib/dmadev/rte_dmadev_pmd.h >>>> index 58729088ff..b569bb3502 100644 >>>> --- a/lib/dmadev/rte_dmadev_pmd.h >>>> +++ b/lib/dmadev/rte_dmadev_pmd.h >>>> @@ -122,7 +122,7 @@ enum rte_dma_dev_state { >>>> * @internal >>>> * The generic data structure associated with each DMA device. >>>> */ >>>> -struct __rte_cache_aligned rte_dma_dev { >>>> +struct __rte_aligned(8) rte_dma_dev { >>> >>> The DMA fast-path was implemented by struct rte_dma_fp_objs, which is >>> not rte_dma_dev? So why is it a problem here? >>> >>> Thanks >>> >> The DMA device object is expected to align cache line, so clang will use >> “vmovaps” assembly instruction, >> >> And the instruction demands 16 bytes alignment or will cause segment fault in >> some environments. >> > Test case: > 1. compile dpdk > rm -rf x86_64-native-linuxapp-clang > CC=clang meson -Denable_kmods=True -Dlibdir=lib --default-library=static > x86_64-native-linuxapp-clang > ninja -C x86_64-native-linuxapp-clang -j 72 > 2. start dpdk-test > /root/dpdk/x86_64-native-linuxapp-clang/app/dpdk-test -l 0-39 > --vdev=dma_skeleton -a 31:00.0 -a 31:00.1 -a 31:00.2 -a 31:00.3 (Note: If it > cannot be reproduced, please try using a different core) > 3. exit dpdk-test > RTE>>quit > Segmentation fault (core dumped)
I will try to reproduce, but still a question: does above test has already merged your patch [1] or the current main branch code has this problem? [1] https://patches.dpdk.org/project/dpdk/patch/20240308053711.1260154-1-wenwux...@intel.com/ Thanks > >> >>>> /** Device info which supplied during device initialization. */ >>>> struct rte_device *device; >>>> struct rte_dma_dev_data *data; /**< Pointer to shared device data. >>>> */ >>>>