Hi Wenwu, On 2024/3/15 17:27, Ma, WenwuX wrote: > Hi Chengwen > >> -----Original Message----- >> From: fengchengwen <fengcheng...@huawei.com> >> Sent: Friday, March 15, 2024 4:32 PM >> To: Ma, WenwuX <wenwux...@intel.com>; dev@dpdk.org >> Cc: Jiale, SongX <songx.ji...@intel.com>; sta...@dpdk.org >> Subject: Re: [PATCH v2] dmadev: fix structure alignment >> >> Hi Wenwu, >> >> On 2024/3/15 15:44, Ma, WenwuX wrote: >>> Hi Chengwen, >>> >>>> -----Original Message----- >>>> From: Ma, WenwuX >>>> Sent: Friday, March 15, 2024 2:26 PM >>>> To: fengchengwen <fengcheng...@huawei.com>; dev@dpdk.org >>>> Cc: Jiale, SongX <songx.ji...@intel.com>; sta...@dpdk.org >>>> Subject: RE: [PATCH v2] dmadev: fix structure alignment >>>> >>>> Hi Chengwen, >>>> >>>>> -----Original Message----- >>>>> From: fengchengwen <fengcheng...@huawei.com> >>>>> Sent: Friday, March 15, 2024 2:06 PM >>>>> To: Ma, WenwuX <wenwux...@intel.com>; dev@dpdk.org >>>>> Cc: Jiale, SongX <songx.ji...@intel.com>; sta...@dpdk.org >>>>> Subject: Re: [PATCH v2] dmadev: fix structure alignment >>>>> >>>>> Hi Wenwu, >>>>> >>>>> On 2024/3/15 9:43, Wenwu Ma wrote: >>>>>> The structure rte_dma_dev needs only 8 byte alignment. >>>>>> This patch replaces __rte_cache_aligned of rte_dma_dev with >>>>>> __rte_aligned(8). >>>>>> >>>>>> Fixes: b36970f2e13e ("dmadev: introduce DMA device library") >>>>>> Cc: sta...@dpdk.org >>>>>> >>>>>> Signed-off-by: Wenwu Ma <wenwux...@intel.com> >>>>>> --- >>>>>> v2: >>>>>> - Because of performance drop, adjust the code to >>>>>> no longer demand cache line alignment >>>>> >>>>> Which two versions observed performance drop? And which benchmark >>>>> observed drop? >>>>> Could you provide more information? >>>>> >>>>>> >>>> V1 patch: >>>> >> https://patches.dpdk.org/project/dpdk/patch/20240308053711.1260154- >>>> 1-wenwux...@intel.com/ >>>> >>>> To view detailed results, visit: >>>> https://lab.dpdk.org/results/dashboard/patchsets/29472/ >>>> >>>>>> --- >>>>>> lib/dmadev/rte_dmadev_pmd.h | 2 +- >>>>>> 1 file changed, 1 insertion(+), 1 deletion(-) >>>>>> >>>>>> diff --git a/lib/dmadev/rte_dmadev_pmd.h >>>>> b/lib/dmadev/rte_dmadev_pmd.h >>>>>> index 58729088ff..b569bb3502 100644 >>>>>> --- a/lib/dmadev/rte_dmadev_pmd.h >>>>>> +++ b/lib/dmadev/rte_dmadev_pmd.h >>>>>> @@ -122,7 +122,7 @@ enum rte_dma_dev_state { >>>>>> * @internal >>>>>> * The generic data structure associated with each DMA device. >>>>>> */ >>>>>> -struct __rte_cache_aligned rte_dma_dev { >>>>>> +struct __rte_aligned(8) rte_dma_dev { >>>>> >>>>> The DMA fast-path was implemented by struct rte_dma_fp_objs, which >>>>> is not rte_dma_dev? So why is it a problem here? >>>>> >>>>> Thanks >>>>> >>>> The DMA device object is expected to align cache line, so clang will >>>> use “vmovaps” assembly instruction, >>>> >>>> And the instruction demands 16 bytes alignment or will cause segment >>>> fault in some environments. >>>> >>> Test case: >>> 1. compile dpdk >>> rm -rf x86_64-native-linuxapp-clang >>> CC=clang meson -Denable_kmods=True -Dlibdir=lib >>> --default-library=static x86_64-native-linuxapp-clang ninja -C >>> x86_64-native-linuxapp-clang -j 72 2. start dpdk-test >>> /root/dpdk/x86_64-native-linuxapp-clang/app/dpdk-test -l 0-39 >>> --vdev=dma_skeleton -a 31:00.0 -a 31:00.1 -a 31:00.2 -a 31:00.3 (Note: >>> If it cannot be reproduced, please try using a different core) >>> 3. exit dpdk-test >>> RTE>>quit >>> Segmentation fault (core dumped)
I reproduce it just with --vdev=dma_skeleton. When execute quit command, it will invoke rte_dma_close->dma_release, pls see my annotations (//) below: void dma_release(struct rte_dma_dev *dev) { if (rte_eal_process_type() == RTE_PROC_PRIMARY) { rte_free(dev->data->dev_private); memset(dev->data, 0, sizeof(struct rte_dma_dev_data)); } dma_fp_object_dummy(dev->fp_obj); memset(dev, 0, sizeof(struct rte_dma_dev)); // this memset was compiles using vmovaps, its // 8c24da: c5 f8 57 c0 vxorps %xmm0,%xmm0,%xmm0 // 8c24de: c5 fc 29 43 20 vmovaps %ymm0,0x20(%rbx) // 8c24e3: c5 fc 29 03 vmovaps %ymm0,(%rbx) // but the dev is not align 16B (in my env the rte_dma_devices addr is 0x15d39950) } >> >> I will try to reproduce, but still a question: does above test has already >> merged >> your patch [1] or the current main branch code has this problem? >> >> [1] >> https://patches.dpdk.org/project/dpdk/patch/20240308053711.1260154- >> 1-wenwux...@intel.com/ >> >> Thanks >> > the current main branch code has this problem. > > Both patch v1 and v2 are able to solve this problem, but v1 has a performance > issue. The performance issue is ethdev benchmark, it will not invoke any dmadev API, I don't think these two has any relations. So I prefer v1, Plus Pavan also submit a commit [1] to align the struct, but it was not a fix for clang-x86-platform. [1] https://lore.kernel.org/all/20240210062758.1510-1-pbhagavat...@marvell.com/T/ > >>> >>>> >>>>>> /** Device info which supplied during device initialization. */ >>>>>> struct rte_device *device; >>>>>> struct rte_dma_dev_data *data; /**< Pointer to shared device >>>>>> data. >>>>>> */ >>>>>> What more, could you please send v3? I hope it will contain the root cause and optional solutions of the segment fault problem. BTW: dmadev is the first one which dynamic alloc dmadev struct, later maybe more xxxdev will use this type, I think that's typical. Maybe we should add a such mem_align() function in eal library, but this could done later. Thanks