On 04/02/2020 09:51, David Marchand wrote:
> On Mon, Feb 3, 2020 at 7:56 PM Ray Kinsella <m...@ashroe.eu> wrote:
>> On 03/02/2020 17:34, Thomas Monjalon wrote:
>>> 03/02/2020 18:09, Thomas Monjalon:
>>>> 03/02/2020 10:30, Ferruh Yigit:
>>>>> On 2/2/2020 2:41 PM, Ananyev, Konstantin wrote:
>>>>>> 02/02/2020 14:05, Thomas Monjalon:
>>>>>>> 31/01/2020 15:16, Trahe, Fiona:
>>>>>>>> On 1/30/2020 8:18 PM, Thomas Monjalon wrote:
>>>>>>>>> If the library gives a higher value than expected by the application,
>>>>>>>>> and the application uses this value as an array index,
>>>>>>>>> there can be an out-of-bounds access.
>>>>>>>>
>>>>>>>> [Fiona] All asymmetric APIs are experimental so above shouldn't be a
>>>>>>>> problem.
>>>>>>>> But for the same issue with sym crypto below, I believe Ferruh's
>>>>>>>> explanation makes sense and I don't see how there can be an API
>>>>>>>> breakage.
>>>>>>>> So if an application hasn't compiled against the new lib it will be
>>>>>>>> still using the old value which will be within bounds. If it's
>>>>>>>> picking up the higher new value from the lib it must have been
>>>>>>>> compiled against the lib so shouldn't have problems.
>>>>>>>
>>>>>>> You say there is no ABI issue because the application will be
>>>>>>> re-compiled for the updated library. Indeed, compilation fixes
>>>>>>> compatibility issues.
>>>>>>> But this is not relevant for ABI compatibility.
>>>>>>> ABI compatibility means we can upgrade the library without recompiling
>>>>>>> the application and it must work.
>>>>>>> You think it is a false positive because you assume the application
>>>>>>> "picks" the new value. I think you miss the case where the new value
>>>>>>> is returned by a function in the upgraded library.
>>>>>>>
>>>>>>>> There are also no structs on the API which contain arrays using this
>>>>>>>> for sizing, so I don't see an opportunity for an appl to have a
>>>>>>>> mismatch in memory addresses.
>>>>>>>
>>>>>>> Let me demonstrate where the API may "use" the new value
>>>>>>> RTE_CRYPTO_AEAD_CHACHA20_POLY1305 and how it impacts the application.
>>>>>>>
>>>>>>> Once upon a time, a DPDK application was counting the number of devices
>>>>>>> supporting each AEAD algo (in order to find the best supported algo).
>>>>>>> It is done in an array indexed by algo id:
>>>>>>> int aead_dev_count[RTE_CRYPTO_AEAD_LIST_END];
>>>>>>> The application is compiled with DPDK 19.11,
>>>>>>> where RTE_CRYPTO_AEAD_LIST_END = 3.
>>>>>>> So the size of the application array aead_dev_count is 3.
>>>>>>> This binary is run with DPDK 20.02,
>>>>>>> where RTE_CRYPTO_AEAD_CHACHA20_POLY1305 = 3.
>>>>>>> When calling rte_cryptodev_info_get() on a device QAT_GEN3,
>>>>>>> rte_cryptodev_info.capabilities.sym.aead.algo is set to
>>>>>>> RTE_CRYPTO_AEAD_CHACHA20_POLY1305 (= 3).
>>>>>>> The application uses this value:
>>>>>>> ++ aead_dev_count[info.capabilities.sym.aead.algo];
>>>>>>> The application crashes because of an out-of-bounds access.
>>>>>>
>>>>>> I'd say this is an example of a badly written app.
>>>>>> It probably should check that the value returned by the library
>>>>>> doesn't exceed its internal array size.
>>>>>
>>>>> +1
>>>>>
>>>>> Application should ignore values >= MAX.
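
To make the scenario above concrete, here is a minimal sketch of the
check being suggested. It uses stand-in constants rather than the real
rte_crypto_sym.h enum: APP_AEAD_LIST_END, NEW_AEAD_ID, count_aead_dev()
and replay_scenario() are hypothetical names for illustration only, and
the values (3 and 3) are the ones quoted in the example.

```c
#include <assert.h>

/* Stand-in for RTE_CRYPTO_AEAD_LIST_END as seen at build time (19.11). */
#define APP_AEAD_LIST_END 3
/* Stand-in for the id a 20.02 library may return (Chacha-Poly = 3). */
#define NEW_AEAD_ID 3

/* Per-algo device counters, sized at compile time as in the example. */
static int aead_dev_count[APP_AEAD_LIST_END];

/*
 * Count one device for the given algo id.
 * Returns 0 if counted, -1 if the id is outside the range this binary
 * was built with (the id is then ignored instead of indexing the array).
 */
static int count_aead_dev(int algo)
{
	if (algo < 0 || algo >= APP_AEAD_LIST_END)
		return -1;
	aead_dev_count[algo]++;
	return 0;
}

/* Replays the scenario: one known id (2) and one unknown id (3). */
static void replay_scenario(void)
{
	int rc_known = count_aead_dev(2);
	int rc_new = count_aead_dev(NEW_AEAD_ID);

	assert(rc_known == 0);  /* id 2 fits in the array and is counted */
	assert(rc_new == -1);   /* id 3 is rejected, no out-of-bounds write */
}
```

With the check, the old binary skips the new id instead of writing past
the array; without it, the increment lands on aead_dev_count[3], which
does not exist in that binary.
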
>>>>
>>>> Of course, blaming the API user is a lot easier than looking at the API.
>>>> Here the API has RTE_CRYPTO_AEAD_LIST_END which can be understood
>>>> as the max value for the application.
>>>> Value ranges are part of the ABI compatibility contract.
>>>> It seems you expect the application developer to be aware that
>>>> DPDK could return a higher value, so the application should
>>>> check every enum value after calling an API. CRAZY.
>>>>
>>>> When we decide to announce an ABI compatibility and do some marketing,
>>>> everyone is OK. But when we need to really make our ABI compatible,
>>>> I see little or no effort. DISAPPOINTING.
>>>>
>>>>> Do you suggest we don't extend any enum or define between ABI breakage
>>>>> releases, to be sure badly written applications are not affected?
>>>>
>>>> I suggest we must consider not breaking any assumption made on the API.
>>>> Here we are breaking the enum range because nothing mentions _LIST_END
>>>> is not really the absolute end of the enum.
>>>> The solution is to make the change below in 20.02 + backport in 19.11.1:
>>>
>>> Thinking twice, merging such a change before 20.11 is breaking the
>>> ABI assumption based on the API 19.11.0.
>>> I ask the release maintainers (Luca, Kevin, David and me) and
>>> the ABI maintainers (Neil and Ray) to vote for a or b solution:
>>>       a) add comment and LIST_MAX as below in 20.02 + 19.11.1
>>
>> That would still be an ABI breakage though, right?
> 
> Yes.
> 
> 
>>
>>>       b) wait for 20.11 and revert Chacha-Poly from 20.02
>>
>> Thanks for the analysis above, Fiona, Ferruh and all.
>>
>> That is a nasty one alright - there is no "good" answer here.
>> I agree with Ferruh's sentiments overall, we should rethink this API for
>> 20.11.
>> Could we do without an enumeration?
>>
>> There is a c) though, right?
>> We could work around the issue by API versioning rte_cryptodev_info_get()
>> and friends.
> 
> It has a lot of friends, but it sounds like the right approach.

+1

> Is someone looking into this?

Looks to be in hand now.

> 
> 
>> So they only support/acknowledge the existence of Chacha-Poly for
>> applications built against >= 20.02.
>>
>> It would be painful I know.
>> It would also mean that Chacha-Poly would only be available to those
>> building against >= 20.02.
> 
> Yes.
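
A rough sketch of what option c) could look like, in plain C with no
DPDK headers. The two functions stand for an old and a new version of
the same symbol. All names and the capability list are made up for
illustration, and the flat int array is a simplification of the real
capability structures. In DPDK the binding of each version to one symbol
name would presumably go through the macros in rte_function_versioning.h
rather than through two separately named functions.

```c
#include <assert.h> /* only needed by callers that check the results */
#include <stddef.h>

/* Stand-ins for the AEAD ids; values follow the thread (LIST_END was 3). */
enum { AEAD_AES_CCM = 1, AEAD_AES_GCM = 2, AEAD_CHACHA20_POLY1305 = 3 };
#define AEAD_LIST_END_OLD 3 /* range promised to binaries built on 19.11 */

/* Hypothetical capability list of a device that supports Chacha-Poly. */
static const int dev_aead_caps[] = { AEAD_AES_GCM, AEAD_CHACHA20_POLY1305 };
#define DEV_NB_CAPS (sizeof(dev_aead_caps) / sizeof(dev_aead_caps[0]))

/* New version: report every capability, for apps built against >= 20.02. */
static size_t info_get_new(int *out, size_t max)
{
	size_t i, n = 0;

	for (i = 0; i < DEV_NB_CAPS && n < max; i++)
		out[n++] = dev_aead_caps[i];
	return n;
}

/* Old version: hide ids outside the range the old ABI advertised. */
static size_t info_get_old(int *out, size_t max)
{
	size_t i, n = 0;

	for (i = 0; i < DEV_NB_CAPS && n < max; i++) {
		if (dev_aead_caps[i] >= AEAD_LIST_END_OLD)
			continue; /* unknown to old binaries, do not expose */
		out[n++] = dev_aead_caps[i];
	}
	return n;
}
```

An old binary bound to the old version would only ever see AES-GCM from
this device, while a binary built against the new version would see both
entries.
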
> 
> 
> --
> David Marchand
> 
