Re: [dpdk-dev] [dpdk-stable] [PATCH v4] lib/table: fix cache alignment issue

Kevin Traynor Thu, 30 Jul 2020 03:36:20 -0700

On 29/07/2020 14:28, David Marchand wrote:
> On Wed, Jul 29, 2020 at 3:14 PM Dumitrescu, Cristian
> <[email protected]> wrote:
>>> Please correct me if I am wrong, but it simply means this part of the
>>> table library never worked for 32-bit.
>>> It seems more adding 32-bit support rather than a fix and then I
>>> wonder if it has its place in rc3.
>>>
>>
>> Functionally. the code works, but performance is affected.
>>
>> The only thing that prevents the code from working is the check in the table 
>> create function that checks the size of the above structure is 64 bytes, 
>> which caught this issue.
> 
> Yes, and that's my point.
> It was not working.
> It was not tested.
> 
> 
> This patch asks for backport in stable branches, I will let Kevin and
> Luca comment.
> 

It should be in master first, but then it's a fix so I think it can go
to stable if requested and supported by the table maintainer in the
event of any (future) regressions.

> 
>>
>>>
>>>
>>> Now, looking at the details:
>>>
>>> For 64-bit on my x86, we have:
>>>
>>> struct rte_bucket_4_8 {
>>>     uint64_t                   signature;            /*     0     8 */
>>>     uint64_t                   lru_list;             /*     8     8 */
>>>     struct rte_bucket_4_8 *    next;                 /*    16     8 */
>>>     uint64_t                   next_valid;           /*    24     8 */
>>>     uint64_t                   key[4];               /*    32    32 */
>>>     /* --- cacheline 1 boundary (64 bytes) --- */
>>>     uint8_t                    data[];               /*    64     0 */
>>>
>>>     /* size: 64, cachelines: 1, members: 6 */
>>> };
>>>
>>>
>>> For 32-bit, we have:
>>>
>>> struct rte_bucket_4_8 {
>>>     uint64_t                   signature;            /*     0     8 */
>>>     uint64_t                   lru_list;             /*     8     8 */
>>>     struct rte_bucket_4_8 *    next;                 /*    16     4 */
>>>     uint64_t                   next_valid;           /*    20     8 */
>>>     uint64_t                   key[4];               /*    28    32 */
>>>     uint8_t                    data[];               /*    60     0 */
>>>
>>>     /* size: 60, cachelines: 1, members: 6 */
>>>     /* last cacheline: 60 bytes */
>>> } __attribute__((__packed__));
>>>
>>> ^^ it is interesting that a packed attribute ends up here.
>>> I saw no such attribute in the library code.
>>> Compiler black magic at work I guess...
>>>
>>
>> Where do you see the packet attribute? I don't see it in the code.
> 
> That's pahole reporting this.
> Maybe the tool extrapolates this attribute based on the next_valid
> field placement... I don't know.
> 
>> A packet attribute would explain this issue, i.e. why did the compiler 
>> decide not to insert an expected padfing of 4 bytes right after the "next" 
>> field, that would allow the field "next_valid" to be aligned to its natural 
>> boundary of 8 bytes.
> 
> Or a 64-bit field on 32-bit has a special alignment that I am not aware of.
> 
> 
>>
>>>
>>>>
>>>> Fixes: 8aa327214c ("table: hash")
>>>> Cc: [email protected]
>>>>
>>>> Signed-off-by: Ting Xu <[email protected]>
>>>>
>>>> ---
>>>> v3->v4: Change design based on comment
>>>> v2->v3: Rebase
>>>> v1->v2: Correct patch time
>>>> ---
>>>>  lib/librte_table/rte_table_hash_key16.c | 17 +++++++++++++++++
>>>>  lib/librte_table/rte_table_hash_key32.c | 17 +++++++++++++++++
>>>>  lib/librte_table/rte_table_hash_key8.c  | 16 ++++++++++++++++
>>>>  3 files changed, 50 insertions(+)
>>>>
>>>> diff --git a/lib/librte_table/rte_table_hash_key16.c
>>> b/lib/librte_table/rte_table_hash_key16.c
>>>> index 2cca1c924..c4384b114 100644
>>>> --- a/lib/librte_table/rte_table_hash_key16.c
>>>> +++ b/lib/librte_table/rte_table_hash_key16.c
>>>> @@ -33,6 +33,7 @@
>>>>
>>>>  #endif
>>>>
>>>> +#ifdef RTE_ARCH_64
>>>>  struct rte_bucket_4_16 {
>>>>         /* Cache line 0 */
>>>>         uint64_t signature[4 + 1];
>>>> @@ -46,6 +47,22 @@ struct rte_bucket_4_16 {
>>>>         /* Cache line 2 */
>>>>         uint8_t data[0];
>>>>  };
>>>> +#else
>>>> +struct rte_bucket_4_16 {
>>>> +       /* Cache line 0 */
>>>> +       uint64_t signature[4 + 1];
>>>> +       uint64_t lru_list;
>>>> +       struct rte_bucket_4_16 *next;
>>>> +       uint32_t pad;
>>>> +       uint64_t next_valid;
>>>> +
>>>> +       /* Cache line 1 */
>>>> +       uint64_t key[4][2];
>>>> +
>>>> +       /* Cache line 2 */
>>>> +       uint8_t data[0];
>>>> +};
>>>> +#endif
>>>
>>> The change could simply be:
>>>
>>> @@ -38,6 +38,9 @@ struct rte_bucket_4_16 {
>>>         uint64_t signature[4 + 1];
>>>         uint64_t lru_list;
>>>         struct rte_bucket_4_16 *next;
>>> +#ifndef RTE_ARCH_64
>>> +       uint32_t pad;
>>> +#endif
>>>         uint64_t next_valid;
>>>
>>>         /* Cache line 1 */
>>>
>>> It avoids duplicating the whole structure definition (we could miss
>>> updating one side of the #ifdef later).
>>> Idem for the other "8" and "32" structures.
> 
> 
> What about this comment?
> 
>
Re: [dpdk-dev] [dpdk-stable] [PATCH v4] lib/table: fix cache alignment issue

Reply via email to