On 29/07/2020 14:28, David Marchand wrote:
> On Wed, Jul 29, 2020 at 3:14 PM Dumitrescu, Cristian
> <cristian.dumitre...@intel.com> wrote:
>>> Please correct me if I am wrong, but it simply means this part of the
>>> table library never worked for 32-bit.
>>> It seems more adding 32-bit support rather than a fix and then I
>>> wonder if it has its place in rc3.
>>>
>>
>> Functionally. the code works, but performance is affected.
>>
>> The only thing that prevents the code from working is the check in the table
>> create function that checks the size of the above structure is 64 bytes,
>> which caught this issue.
>
> Yes, and that's my point.
> It was not working.
> It was not tested.
>
>
> This patch asks for backport in stable branches, I will let Kevin and
> Luca comment.
>
It should be in master first, but then it's a fix so I think it can go
to stable if requested and supported by the table maintainer in the
event of any (future) regressions.
>
>>
>>>
>>>
>>> Now, looking at the details:
>>>
>>> For 64-bit on my x86, we have:
>>>
>>> struct rte_bucket_4_8 {
>>> uint64_t signature; /* 0 8 */
>>> uint64_t lru_list; /* 8 8 */
>>> struct rte_bucket_4_8 * next; /* 16 8 */
>>> uint64_t next_valid; /* 24 8 */
>>> uint64_t key[4]; /* 32 32 */
>>> /* --- cacheline 1 boundary (64 bytes) --- */
>>> uint8_t data[]; /* 64 0 */
>>>
>>> /* size: 64, cachelines: 1, members: 6 */
>>> };
>>>
>>>
>>> For 32-bit, we have:
>>>
>>> struct rte_bucket_4_8 {
>>> uint64_t signature; /* 0 8 */
>>> uint64_t lru_list; /* 8 8 */
>>> struct rte_bucket_4_8 * next; /* 16 4 */
>>> uint64_t next_valid; /* 20 8 */
>>> uint64_t key[4]; /* 28 32 */
>>> uint8_t data[]; /* 60 0 */
>>>
>>> /* size: 60, cachelines: 1, members: 6 */
>>> /* last cacheline: 60 bytes */
>>> } __attribute__((__packed__));
>>>
>>> ^^ it is interesting that a packed attribute ends up here.
>>> I saw no such attribute in the library code.
>>> Compiler black magic at work I guess...
>>>
>>
>> Where do you see the packet attribute? I don't see it in the code.
>
> That's pahole reporting this.
> Maybe the tool extrapolates this attribute based on the next_valid
> field placement... I don't know.
>
>> A packet attribute would explain this issue, i.e. why did the compiler
>> decide not to insert an expected padfing of 4 bytes right after the "next"
>> field, that would allow the field "next_valid" to be aligned to its natural
>> boundary of 8 bytes.
>
> Or a 64-bit field on 32-bit has a special alignment that I am not aware of.
>
>
>>
>>>
>>>>
>>>> Fixes: 8aa327214c ("table: hash")
>>>> Cc: sta...@dpdk.org
>>>>
>>>> Signed-off-by: Ting Xu <ting...@intel.com>
>>>>
>>>> ---
>>>> v3->v4: Change design based on comment
>>>> v2->v3: Rebase
>>>> v1->v2: Correct patch time
>>>> ---
>>>> lib/librte_table/rte_table_hash_key16.c | 17 +++++++++++++++++
>>>> lib/librte_table/rte_table_hash_key32.c | 17 +++++++++++++++++
>>>> lib/librte_table/rte_table_hash_key8.c | 16 ++++++++++++++++
>>>> 3 files changed, 50 insertions(+)
>>>>
>>>> diff --git a/lib/librte_table/rte_table_hash_key16.c
>>> b/lib/librte_table/rte_table_hash_key16.c
>>>> index 2cca1c924..c4384b114 100644
>>>> --- a/lib/librte_table/rte_table_hash_key16.c
>>>> +++ b/lib/librte_table/rte_table_hash_key16.c
>>>> @@ -33,6 +33,7 @@
>>>>
>>>> #endif
>>>>
>>>> +#ifdef RTE_ARCH_64
>>>> struct rte_bucket_4_16 {
>>>> /* Cache line 0 */
>>>> uint64_t signature[4 + 1];
>>>> @@ -46,6 +47,22 @@ struct rte_bucket_4_16 {
>>>> /* Cache line 2 */
>>>> uint8_t data[0];
>>>> };
>>>> +#else
>>>> +struct rte_bucket_4_16 {
>>>> + /* Cache line 0 */
>>>> + uint64_t signature[4 + 1];
>>>> + uint64_t lru_list;
>>>> + struct rte_bucket_4_16 *next;
>>>> + uint32_t pad;
>>>> + uint64_t next_valid;
>>>> +
>>>> + /* Cache line 1 */
>>>> + uint64_t key[4][2];
>>>> +
>>>> + /* Cache line 2 */
>>>> + uint8_t data[0];
>>>> +};
>>>> +#endif
>>>
>>> The change could simply be:
>>>
>>> @@ -38,6 +38,9 @@ struct rte_bucket_4_16 {
>>> uint64_t signature[4 + 1];
>>> uint64_t lru_list;
>>> struct rte_bucket_4_16 *next;
>>> +#ifndef RTE_ARCH_64
>>> + uint32_t pad;
>>> +#endif
>>> uint64_t next_valid;
>>>
>>> /* Cache line 1 */
>>>
>>> It avoids duplicating the whole structure definition (we could miss
>>> updating one side of the #ifdef later).
>>> Idem for the other "8" and "32" structures.
>
>
> What about this comment?
>
>