On 9/26/2024 3:03 PM, Meade, Niall wrote:
>> From: Ferruh Yigit <ferruh.yi...@amd.com>
>> Sent: Thursday, September 26, 2024 12:16 AM
>> To: Meade, Niall <niall.me...@intel.com>; Thomas Monjalon 
>> <tho...@monjalon.net>; Andrew Rybchenko <andrew.rybche...@oktetlabs.ru>; 
>> Roman Zhukov <roman.zhu...@arknetworks.am>
>> Cc: dev@dpdk.org <dev@dpdk.org>
>> Subject: Re: [PATCH v1] ethdev: fix int overflow in descriptor count logic
> <snip>
>>> The resolution involves upcasting nb_desc to a uint32_t before the
>>> RTE_ALIGN_CEIL macro is applied. This change ensures that the subsequent
>>> call to RTE_ALIGN_FLOOR(nb_desc + (nb_align - 1), nb_align) does not
>>> result in an overflow, as it would when nb_desc is a uint16_t. By using
>>> a uint32_t for these operations, the correct behavior is maintained
>>> without the risk of overflow.
>>>
>>
>> Hi Niall,
> 
> Hi Ferruh,
> 
>> Thanks for the patch.
>>
>> For the 'RTE_ALIGN_CEIL(val, align)' macro, 'align' should be power of
>> two, as 'desc_lim->nb_align' is uint16_t, max value it can get is 2^15.
>> 'val' should be smaller than or equal to 'align', so '*nb_desc' can be
>> maximum 2^15.
>>
>> So RTE_ALIGN_CEIL(2^15-1, 2^15) = 2^15, I think this should work fine
>> (although I didn't test).
>>
>> And even with your uint32_t cast, I think following will fail:
>> RTE_ALIGN_CEIL(2^16-1, 2^15)
>> (again, not tested).
>>
> 
> I tested my code with these values and the behaviour is as expected from
> what I can see.
> At a high level I ran into this issue when passing UINT16_MAX into
> rte_eth_dev_adjust_nb_rx_tx_desc() with the intent of selecting the maximum
> ring descriptor size but the minimum was selected.
> 
>> Or maybe I am missing a case, can you please give some actual numbers to
>> show the problem and the fix?
> 
> Yes sure! If we take an example of val= (2^16)-1 and align= 32.
> RTE_ALIGN_CEIL(val, align) calls RTE_ALIGN_FLOOR(val + align - 1, align). With
> val as a uint16_t this subsequent macro call results in a wrap around for val
> (originally was the max uint16_t and now we are attempting to add align to
> it). The returned value of RTE_ALIGN_CEIL() in this case is 0. This results in
> nb_desc being set to 0, and later set to the minimum ring descriptor size for
> that NIC with *nb_desc = RTE_MAX(*nb_desc, desc_lim->nb_min).
> 
> While this example is an unreasonably large request for a descriptor ring size,
> the expected behaviour would be that the descriptor ring size defaults back to
> the maximum possible for that particular NIC, not to the minimum which it
> currently does.
> By introducing a uint32_t, the wrap around in RTE_ALIGN_FLOOR() is avoided,
> keeping the large value of nb_desc_32 which is later set to an appropriate size
> in RTE_MIN(nb_desc_32, desc_lim->nb_max)
> 

I see the problem now, thanks.

When the value is greater than (2^16 - align), the next aligned value is
2^16, which is UINT16_MAX + 1 and hence wraps to 0, so this is kind of
expected.
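
To make the wrap concrete, here is a minimal standalone sketch. It does not
use the DPDK macros; 'align_ceil_u16' and 'align_ceil_u32' are hypothetical
stand-ins that only mimic the rounding, with the result kept in 16 and 32
bits respectively, and it assumes 'int' is wider than 16 bits:

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for RTE_ALIGN_CEIL() whose result is stored in
 * 16 bits. 'align' must be a non-zero power of two. */
static uint16_t align_ceil_u16(uint16_t val, uint16_t align)
{
	assert(align != 0 && (align & (align - 1)) == 0);
	/* The sum is computed as int, but the cast back truncates it to
	 * 16 bits, so a result of 2^16 becomes 0. */
	return (uint16_t)((val + (align - 1)) & ~(align - 1));
}

/* Same rounding done in 32 bits, where 2^16 is representable. */
static uint32_t align_ceil_u32(uint32_t val, uint32_t align)
{
	assert(align != 0 && (align & (align - 1)) == 0);
	return (val + (align - 1)) & ~(align - 1);
}
```

With val = 65535 and align = 32, the 16-bit helper gives 0 while the 32-bit
helper gives 65536. 65504 (2^16 - 32) is the largest input that survives the
16-bit version.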

For the relevant code, assuming 'desc_lim->nb_max' & 'desc_lim->nb_min'
are already aligned to 'desc_lim->nb_align', the following should fix the
issue. That seems simpler to me, what do you think?

```
if (desc_lim->nb_max != 0)
        *nb_desc = RTE_MIN(*nb_desc, desc_lim->nb_max);

*nb_desc = RTE_MAX(*nb_desc, desc_lim->nb_min);

if (desc_lim->nb_align != 0)
        *nb_desc = RTE_ALIGN_CEIL(*nb_desc, desc_lim->nb_align);
```

Basically just changing the order of the operations...
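
As a rough illustration of why the order matters, below is a standalone
sketch comparing the two orders. Everything in it is hypothetical: the
'desc_lim_sketch' struct is a reduced stand-in and not the DPDK struct, the
helper names are made up, the "align first" order is only assumed from the
description earlier in this thread, and the limits (nb_max = 4096,
nb_min = 64, nb_align = 32) are invented sample values:

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical, reduced descriptor limits; not the DPDK definition. */
struct desc_lim_sketch {
	uint16_t nb_max;
	uint16_t nb_min;
	uint16_t nb_align;
};

/* Round up to a power-of-two 'align', result truncated to 16 bits. */
static uint16_t align_ceil_u16(uint16_t val, uint16_t align)
{
	assert(align != 0 && (align & (align - 1)) == 0);
	return (uint16_t)((val + (align - 1)) & ~(align - 1));
}

/* Align first, then clamp: a large request can wrap to 0 before clamping. */
static uint16_t adjust_align_first(uint16_t nb_desc,
		const struct desc_lim_sketch *lim)
{
	if (lim->nb_align != 0)
		nb_desc = align_ceil_u16(nb_desc, lim->nb_align);
	if (lim->nb_max != 0 && nb_desc > lim->nb_max)
		nb_desc = lim->nb_max;
	if (nb_desc < lim->nb_min)
		nb_desc = lim->nb_min;
	return nb_desc;
}

/* Clamp first, then align: the value is already <= nb_max when rounded. */
static uint16_t adjust_clamp_first(uint16_t nb_desc,
		const struct desc_lim_sketch *lim)
{
	if (lim->nb_max != 0 && nb_desc > lim->nb_max)
		nb_desc = lim->nb_max;
	if (nb_desc < lim->nb_min)
		nb_desc = lim->nb_min;
	if (lim->nb_align != 0)
		nb_desc = align_ceil_u16(nb_desc, lim->nb_align);
	return nb_desc;
}
```

For a request of 65535 with these sample limits, the align-first order ends
at 64 (nb_min) and the clamp-first order ends at 4096 (nb_max). Note that in
this sketch the clamp is skipped when nb_max is 0, so a request above
2^16 - align would still wrap in that case.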

It is not easy to see the problem. Can you please give sample values in
the commit log (for '*nb_desc', 'nb_align', 'nb_max' & 'nb_min')? That
makes it much easier to see why the above works.
