[dpdk-dev] [PATCH v2 4/4] add ABI checks

Aaron Conole aconole at redhat.com
Tue Feb 4 15:44:53 CET 2020


Thomas Monjalon <thomas at monjalon.net> writes:

> RED FLAG
>
> I don't see a lot of reactions, so I summarize the issue.
> We need action TODAY!
>
> The API suggests that rte_cryptodev_info_get() cannot return
> a value >= 3 (RTE_CRYPTO_AEAD_LIST_END in 19.11).
> Current 20.02 returns 3 (RTE_CRYPTO_AEAD_CHACHA20_POLY1305).
> The ABI compatibility contract is broken currently.
>
> There are 3 possible outcomes:
>
> a) Change the API comments and backport to 19.11.1
> The details are discussed between Ferruh and me.
> Either put responsibility on API user (with explicit comment),
> or expose ABI extension allowance with a new API max value.
> In both cases, this is breaking the implicit contract of 19.11.0.
> This option can be chosen only if release and ABI maintainers
> vote for it.
>
> b) Revert Chacha-Poly from 20.02-rc2.
>
> c) Add versioned function rte_cryptodev_info_get_v1911()
> which calls rte_cryptodev_info_get() and filters out
> RTE_CRYPTO_AEAD_CHACHA20_POLY1305 capability.
> So Chacha-Poly capability would be seen and usable only
> if compiling with DPDK 20.02.
>
> I hope the actions for everybody are clear:
> - ABI and release maintainers must say yes or no to the proposal (a)
> - In the meantime, crypto team must send a patch for the proposal (c)
> - If (a) and (c) are not possible at the end of today, I will take (b)
>
> Note: do not say the time is too short for (c), as it has been possible
> to work on such a solution since the issue was exposed last Wednesday.

While I'm not a maintainer, if my opinion counts for anything, I'd
choose option (c) or (b).  Absolutely NACK to (a).
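
For reference, a rough sketch of what the filtering in option (c) could
look like. This is not the cryptodev code: the enum, struct and function
names below are stand-ins invented for illustration, and only the numeric
values (19.11 LIST_END = 3, Chacha-Poly = 3) come from this thread. The
symbol versioning glue is left out entirely.

```c
#include <assert.h>
#include <stddef.h>

/* Stand-in for the AEAD algo enum; not the real rte_crypto header. */
enum demo_aead_algo {
	DEMO_AEAD_AES_CCM = 1,
	DEMO_AEAD_AES_GCM = 2,
	DEMO_AEAD_CHACHA20_POLY1305 = 3, /* new in 20.02 */
};

/* Value of LIST_END as seen by binaries built against 19.11. */
#define DEMO_AEAD_LIST_END_V1911 3

/* Hypothetical, flattened capability entry. */
struct demo_aead_cap {
	enum demo_aead_algo algo;
};

/*
 * Copy only the capabilities whose algo fits the 19.11 range into out[].
 * Returns the number of entries written (at most out_len).
 */
static size_t
demo_filter_caps_v1911(const struct demo_aead_cap *in, size_t in_len,
		struct demo_aead_cap *out, size_t out_len)
{
	size_t i, n = 0;

	assert(in != NULL && out != NULL);
	for (i = 0; i < in_len && n < out_len; i++) {
		if ((int)in[i].algo >= DEMO_AEAD_LIST_END_V1911)
			continue; /* hide algos unknown to a 19.11 binary */
		out[n++] = in[i];
	}
	return n;
}
```

A 19.11 binary going through such a wrapper would keep seeing algo ids in
[0..AES_GCM] only, while a 20.02 build calling the unfiltered function
would also see Chacha-Poly.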

>
> 03/02/2020 22:07, Thomas Monjalon:
>> 03/02/2020 19:55, Ray Kinsella:
>> > On 03/02/2020 17:34, Thomas Monjalon wrote:
>> > > 03/02/2020 18:09, Thomas Monjalon:
>> > >> 03/02/2020 10:30, Ferruh Yigit:
>> > >>> On 2/2/2020 2:41 PM, Ananyev, Konstantin wrote:
>> > >>>> 02/02/2020 14:05, Thomas Monjalon:
>> > >>>>> 31/01/2020 15:16, Trahe, Fiona:
>> > >>>>>> On 1/30/2020 8:18 PM, Thomas Monjalon wrote:
>> > >>>>>>> If the library gives a higher value than expected by the application,
>> > >>>>>>> and the application uses this value as an array index,
>> > >>>>>>> there can be an out-of-bounds access.
>> > >>>>>>
>> > >>>>>> [Fiona] All asymmetric APIs are experimental so above shouldn't be a problem.
>> > >>>>>> But for the same issue with sym crypto below, I believe Ferruh's explanation makes
>> > >>>>>> sense and I don't see how there can be an API breakage.
>> > >>>>>> So if an application hasn't compiled against the new lib it
>> > >>>>>> will be still using the old value
>> > >>>>>> which will be within bounds. If it's picking up the higher
>> > >>>>>> new value from the lib it must
>> > >>>>>> have been compiled against the lib so shouldn't have problems.
>> > >>>>>
>> > >>>>> You say there is no ABI issue because the application will be re-compiled
>> > >>>>> for the updated library. Indeed, compilation fixes compatibility issues.
>> > >>>>> But this is not relevant for ABI compatibility.
>> > >>>>> ABI compatibility means we can upgrade the library without recompiling
>> > >>>>> the application and it must work.
>> > >>>>> You think it is a false positive because you assume the application
>> > >>>>> "picks" the new value. I think you miss the case where the new value
>> > >>>>> is returned by a function in the upgraded library.
>> > >>>>>
>> > >>>>>> There are also no structs on the API which contain arrays using this
>> > >>>>>> for sizing, so I don't see an opportunity for an appl to have a
>> > >>>>>> mismatch in memory addresses.
>> > >>>>>
>> > >>>>> Let me demonstrate where the API may "use" the new value
>> > >>>>> RTE_CRYPTO_AEAD_CHACHA20_POLY1305 and how it impacts the application.
>> > >>>>>
>> > >>>>> Once upon a time, a DPDK application was counting the number of devices
>> > >>>>> supporting each AEAD algo (in order to find the best supported algo).
>> > >>>>> It is done in an array indexed by algo id:
>> > >>>>> int aead_dev_count[RTE_CRYPTO_AEAD_LIST_END];
>> > >>>>> The application is compiled with DPDK 19.11,
>> > >>>>> where RTE_CRYPTO_AEAD_LIST_END = 3.
>> > >>>>> So the size of the application array aead_dev_count is 3.
>> > >>>>> This binary is run with DPDK 20.02,
>> > >>>>> where RTE_CRYPTO_AEAD_CHACHA20_POLY1305 = 3.
>> > >>>>> When calling rte_cryptodev_info_get() on a device QAT_GEN3,
>> > >>>>> rte_cryptodev_info.capabilities.sym.aead.algo is set to
>> > >>>>> RTE_CRYPTO_AEAD_CHACHA20_POLY1305 (= 3).
>> > >>>>> The application uses this value:
>> > >>>>> ++ aead_dev_count[info.capabilities.sym.aead.algo];
>> > >>>>> The application crashes because of an out-of-bounds access.
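
To make the failure mode concrete, here is a small self-contained sketch
of the scenario described above, together with the bounds check that the
19.11 API gives an application no obvious reason to write. All names are
made up for the example; only the value 3 (old LIST_END, new Chacha-Poly
id) is taken from the thread.

```c
#include <assert.h>
#include <stddef.h>

/* LIST_END baked into the application when it was built against 19.11. */
#define APP_AEAD_LIST_END 3

/* Per-algo device counters, sized at compile time: valid indexes are 0..2. */
static int aead_dev_count[APP_AEAD_LIST_END];

/*
 * Count one device for the given algo id, as reported by the library.
 * Returns 0 on success, -1 if the id falls outside the array the
 * application was compiled with (e.g. 3 coming from a 20.02 library).
 */
static int
app_count_aead_dev(unsigned int algo)
{
	if (algo >= APP_AEAD_LIST_END)
		return -1; /* without this check: out-of-bounds write */
	aead_dev_count[algo]++;
	return 0;
}
```

Without the check, the increment for algo id 3 writes one element past
the end of the array, which is undefined behaviour in C.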
>> > >>>>
>> > >>>> I'd say this is an example of a badly written app.
>> > >>>> It probably should check that the value returned by the library doesn't
>> > >>>> exceed its internal array size.
>> > >>>
>> > >>> +1
>> > >>>
>> > >>> Application should ignore values >= MAX.
>> > >>
>> > >> Of course, blaming the API user is a lot easier than looking at the API.
>> > >> Here the API has RTE_CRYPTO_AEAD_LIST_END which can be understood
>> > >> as the max value for the application.
>> > >> Value ranges are part of the ABI compatibility contract.
>> > >> It seems you expect the application developer to be aware that
>> > >> DPDK could return a higher value, so the application should
>> > >> check every enum values after calling an API. CRAZY.
>> > >>
>> > >> When we decide to announce an ABI compatibility and do some marketing,
>> > >> everyone is OK. But when we need to really make our ABI compatible,
>> > >> I see little or no effort. DISAPPOINTING.
>> > >>
>> > >>> Do you suggest we don't extend any enum or define between ABI breakage releases
>> > >>> to be sure badly written applications are not affected?
>> > >>
>> > >> I suggest we must consider not breaking any assumption made on the API.
>> > >> Here we are breaking the enum range because nothing mentions _LIST_END
>> > >> is not really the absolute end of the enum.
>> > >> The solution is to make the change below in 20.02 + backport in 19.11.1:
>> > > 
>> > > Thinking twice, merging such change before 20.11 is breaking the
>> > > ABI assumption based on the API 19.11.0.
>> > > I ask the release maintainers (Luca, Kevin, David and me) and
>> > > the ABI maintainers (Neil and Ray) to vote for a or b solution:
>> > > 	a) add comment and LIST_MAX as below in 20.02 + 19.11.1
>> > 
>> > That would still be an ABI breakage though right.
>> > 
>> > > 	b) wait 20.11 and revert Chacha-Poly from 20.02
>> > 
>> > Thanks for analysis above Fiona, Ferruh and all. 
>> > 
>> > That is a nasty one alright - there is no "good" answer here.
>> > I agree with Ferruh's sentiments overall, we should rethink this API for 20.11. 
>> > Could we do without an enumeration?
>> > 
>> > There is a c) though, right?
>> > We could work around the issue by api versioning rte_cryptodev_info_get() and friends.
>> > So they only support/acknowledge the existence of Chacha-Poly for
>> > applications built against >= 20.02.
>> 
>> I agree there is a c) as I proposed in another email:
>> http://mails.dpdk.org/archives/dev/2020-February/156919.html
>> "
>> In this case, the proper solution is to implement
>> rte_cryptodev_info_get_v1911() so it filters out
>> RTE_CRYPTO_AEAD_CHACHA20_POLY1305 capability.
>> With this solution, an application compiled with DPDK 19.11 will keep
>> seeing the same range as before, while a 20.02 application could
>> see and use ChachaPoly.
>> "
>> 
>> > It would be painful I know.
>> 
>> Not so painful in my opinion.
>> Just need to call rte_cryptodev_info_get() from
>> rte_cryptodev_info_get_v1911() and filter the value
>> in the 19.11 range: [0..AES_GCM].
>> 
>> > It would also mean that Chacha-Poly would only be available to
>> > those building against >= 20.02.
>> 
>> Yes exactly.
>> 
>> The addition of comments and LIST_MAX like below is still valid
>> to avoid versioning after 20.11.
>> 
>> > >> - _LIST_END
>> > >> + _LIST_END, /* an ABI-compatible version may increase this value */
>> > >> + _LIST_MAX = _LIST_END + 42 /* room for ABI-compatible additions */
>> > >> };
>> > >>
>> > >> Then *_LIST_END values could be ignored by libabigail with such a change.
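
As a rough illustration of the LIST_MAX idea quoted above (the names are
placeholders, and 42 is simply the headroom from the proposed diff): an
application that sizes its tables by LIST_MAX rather than LIST_END has
room for ids that a later ABI-compatible release might add.

```c
#include <assert.h>

enum demo_algo {
	DEMO_ALGO_A = 1,
	DEMO_ALGO_B,
	DEMO_ALGO_LIST_END, /* an ABI-compatible version may increase this value */
	DEMO_ALGO_LIST_MAX = DEMO_ALGO_LIST_END + 42 /* room for additions */
};

/* Table sized by LIST_MAX, so ids up to LIST_MAX - 1 are valid indexes. */
static int demo_count[DEMO_ALGO_LIST_MAX];

/* Returns 0 if the id was counted, -1 if it exceeds even LIST_MAX. */
static int
demo_count_algo(unsigned int algo)
{
	if (algo >= DEMO_ALGO_LIST_MAX)
		return -1;
	demo_count[algo]++;
	return 0;
}
```

Here an id equal to the old LIST_END (3) is still a valid index, which
is the case that breaks when the table is sized by LIST_END.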
>> 
>> In order to avoid ABI check complaining, the best is to completely
>> remove LIST_END in DPDK 20.11.
>> 
>> 
>> > >> If such a patch is not done by tomorrow, I will have to revert
>> > >> Chacha-Poly commits before 20.02-rc2, because
>> > >>
>> > >> 1/ LIST_END, without any comment, means "size of range"
>> > >> 2/ we do not blame users for undocumented ABI changes
>> > >> 3/ we respect the ABI compatibility contract


