On 04/02/2020 10:24, Thomas Monjalon wrote:
> RED FLAG
> 
> I don't see a lot of reactions, so I summarize the issue.
> We need action TODAY!
> 
> API makes think that rte_cryptodev_info_get() cannot return
> a value >= 3 (RTE_CRYPTO_AEAD_LIST_END in 19.11).
> Current 20.02 returns 3 (RTE_CRYPTO_AEAD_CHACHA20_POLY1305).
> The ABI compatibility contract is broken currently.
> 
> There are 3 possible outcomes:
> 
> a) Change the API comments and backport to 19.11.1
> The details are discussed between Ferruh and me.
> Either put responsibility on API user (with explicit comment),
> or expose ABI extension allowance with a new API max value.
> In both cases, this is breaking the implicit contract of 19.11.0.
> This option can be chosen only if release and ABI maintainers
> vote for it.
> 
> b) Revert Chacha-Poly from 20.02-rc2.
> 
> c) Add versioned function rte_cryptodev_info_get_v1911()
> which calls rte_cryptodev_info_get() and filters out
> RTE_CRYPTO_AEAD_CHACHA20_POLY1305 capability.
> So Chacha-Poly capability would be seen and usable only
> if compiling with DPDK 20.02.
> 

Maybe a separate version of rte_cryptodev_get_aead_algo_enum() also
needed to handle chacha string differently.

> I hope it is clear what are the actions for everybody:
> - ABI and release maintainers must say yes or no to the proposal (a)

My 2c for a) is No.

> - In the meantime, crypto team must send a patch for the proposal (c)
> - If (a) and (c) are not possible at the end of today, I will take (b)
> 
> Note: do not say it is too short for (c), as it was possible to work
> on such solution since the issue was exposed on last Wednesday.
> 

Could it be reverted today if necessary and re-added later in the
release cycle? It seems like something modular that should not
invalidate earlier testing.

> 
> 03/02/2020 22:07, Thomas Monjalon:
>> 03/02/2020 19:55, Ray Kinsella:
>>> On 03/02/2020 17:34, Thomas Monjalon wrote:
>>>> 03/02/2020 18:09, Thomas Monjalon:
>>>>> 03/02/2020 10:30, Ferruh Yigit:
>>>>>> On 2/2/2020 2:41 PM, Ananyev, Konstantin wrote:
>>>>>>> 02/02/2020 14:05, Thomas Monjalon:
>>>>>>>> 31/01/2020 15:16, Trahe, Fiona:
>>>>>>>>> On 1/30/2020 8:18 PM, Thomas Monjalon wrote:
>>>>>>>>>> If library give higher value than expected by the application,
>>>>>>>>>> if the application uses this value as array index,
>>>>>>>>>> there can be an access out of bounds.
>>>>>>>>>
>>>>>>>>> [Fiona] All asymmetric APIs are experimental so above shouldn't be a 
>>>>>>>>> problem.
>>>>>>>>> But for the same issue with sym crypto below, I believe Ferruh's 
>>>>>>>>> explanation makes
>>>>>>>>> sense and I don't see how there can be an API breakage.
>>>>>>>>> So if an application hasn't compiled against the new lib it will be 
>>>>>>>>> still using the old value
>>>>>>>>> which will be within bounds. If it's picking up the higher new value 
>>>>>>>>> from the lib it must
>>>>>>>>> have been compiled against the lib so shouldn't have problems.
>>>>>>>>
>>>>>>>> You say there is no ABI issue because the application will be 
>>>>>>>> re-compiled
>>>>>>>> for the updated library. Indeed, compilation fixes compatibility 
>>>>>>>> issues.
>>>>>>>> But this is not relevant for ABI compatibility.
>>>>>>>> ABI compatibility means we can upgrade the library without recompiling
>>>>>>>> the application and it must work.
>>>>>>>> You think it is a false positive because you assume the application
>>>>>>>> "picks" the new value. I think you miss the case where the new value
>>>>>>>> is returned by a function in the upgraded library.
>>>>>>>>
>>>>>>>>> There are also no structs on the API which contain arrays using this
>>>>>>>>> for sizing, so I don't see an opportunity for an appl to have a
>>>>>>>>> mismatch in memory addresses.
>>>>>>>>
>>>>>>>> Let me demonstrate where the API may "use" the new value
>>>>>>>> RTE_CRYPTO_AEAD_CHACHA20_POLY1305 and how it impacts the application.
>>>>>>>>
>>>>>>>> Once upon a time a DPDK application counting the number of devices
>>>>>>>> supporting each AEAD algo (in order to find the best supported algo).
>>>>>>>> It is done in an array indexed by algo id:
>>>>>>>> int aead_dev_count[RTE_CRYPTO_AEAD_LIST_END];
>>>>>>>> The application is compiled with DPDK 19.11,
>>>>>>>> where RTE_CRYPTO_AEAD_LIST_END = 3.
>>>>>>>> So the size of the application array aead_dev_count is 3.
>>>>>>>> This binary is run with DPDK 20.02,
>>>>>>>> where RTE_CRYPTO_AEAD_CHACHA20_POLY1305 = 3.
>>>>>>>> When calling rte_cryptodev_info_get() on a device QAT_GEN3,
>>>>>>>> rte_cryptodev_info.capabilities.sym.aead.algo is set to
>>>>>>>> RTE_CRYPTO_AEAD_CHACHA20_POLY1305 (= 3).
>>>>>>>> The application uses this value:
>>>>>>>> ++ aead_dev_count[info.capabilities.sym.aead.algo];
>>>>>>>> The application is crashing because of out of bound access.
>>>>>>>
>>>>>>> I'd say this is an example of bad written app.
>>>>>>> It probably should check that returned by library value doesn't
>>>>>>> exceed its internal array size.
>>>>>>
>>>>>> +1
>>>>>>
>>>>>> Application should ignore values >= MAX.
>>>>>
>>>>> Of course, blaming the API user is a lot easier than looking at the API.
>>>>> Here the API has RTE_CRYPTO_AEAD_LIST_END which can be understood
>>>>> as the max value for the application.
>>>>> Value ranges are part of the ABI compatibility contract.
>>>>> It seems you expect the application developer to be aware that
>>>>> DPDK could return a higher value, so the application should
>>>>> check every enum values after calling an API. CRAZY.
>>>>>
>>>>> When we decide to announce an ABI compatibility and do some marketing,
>>>>> everyone is OK. But when we need to really make our ABI compatible,
>>>>> I see little or no effort. DISAPPOINTING.
>>>>>
>>>>>> Do you suggest we don't extend any enum or define between ABI breakage 
>>>>>> releases
>>>>>> to be sure bad written applications not affected?
>>>>>
>>>>> I suggest we must consider not breaking any assumption made on the API.
>>>>> Here we are breaking the enum range because nothing mentions _LIST_END
>>>>> is not really the absolute end of the enum.
>>>>> The solution is to make the change below in 20.02 + backport in 19.11.1:
>>>>
>>>> Thinking twice, merging such change before 20.11 is breaking the
>>>> ABI assumption based on the API 19.11.0.
>>>> I ask the release maintainers (Luca, Kevin, David and me) and
>>>> the ABI maintainers (Neil and Ray) to vote for a or b solution:
>>>>    a) add comment and LIST_MAX as below in 20.02 + 19.11.1
>>>
>>> That would still be an ABI breakage though right.
>>>
>>>>    b) wait 20.11 and revert Chacha-Poly from 20.02
>>>
>>> Thanks for analysis above Fiona, Ferruh and all. 
>>>
>>> That is a nasty one alright - there is no "good" answer here.
>>> I agree with Ferruh's sentiments overall, we should rethink this API for 
>>> 20.11. 
>>> Could do without an enumeration?
>>>
>>> There a c) though right.
>>> We could work around the issue by api versioning rte_cryptodev_info_get() 
>>> and friends.
>>> So they only support/acknowledge the existence of Chacha-Poly for 
>>> applications build against > 20.02.
>>
>> I agree there is a c) as I proposed in another email:
>> http://mails.dpdk.org/archives/dev/2020-February/156919.html
>> "
>> In this case, the proper solution is to implement
>> rte_cryptodev_info_get_v1911() so it filters out
>> RTE_CRYPTO_AEAD_CHACHA20_POLY1305 capability.
>> With this solution, an application compiled with DPDK 19.11 will keep
>> seeing the same range as before, while a 20.02 application could
>> see and use ChachaPoly.
>> "
>>
>>> It would be painful I know.
>>
>> Not so painful in my opinion.
>> Just need to call rte_cryptodev_info_get() from
>> rte_cryptodev_info_get_v1911() and filter the value
>> in the 19.11 range: [0..AES_GCM].
>>
>>> It would also mean that Chacha-Poly would only be available to
>>> those building against >= 20.02.
>>
>> Yes exactly.
>>
>> The addition of comments and LIST_MAX like below are still valid
>> to avoid versioning after 20.11.
>>
>>>>> - _LIST_END
>>>>> + _LIST_END, /* an ABI-compatible version may increase this value */
>>>>> + _LIST_MAX = _LIST_END + 42 /* room for ABI-compatible additions */
>>>>> };
>>>>>
>>>>> Then *_LIST_END values could be ignored by libabigail with such a change.
>>
>> In order to avoid ABI check complaining, the best is to completely
>> remove LIST_END in DPDK 20.11.
>>
>>
>>>>> If such a patch is not done by tomorrow, I will have to revert
>>>>> Chacha-Poly commits before 20.02-rc2, because
>>>>>
>>>>> 1/ LIST_END, without any comment, means "size of range"
>>>>> 2/ we do not blame users for undocumented ABI changes
>>>>> 3/ we respect the ABI compatibility contract
> 
> 
> 

Reply via email to