[dpdk-dev] [PATCH v2 4/4] add ABI checks

Neil Horman nhorman at tuxdriver.com
Tue Feb 4 20:49:19 CET 2020


On Tue, Feb 04, 2020 at 09:44:53AM -0500, Aaron Conole wrote:
> Thomas Monjalon <thomas at monjalon.net> writes:
> 
> > RED FLAG
> >
> > I don't see a lot of reactions, so I summarize the issue.
> > We need action TODAY!
> >
> > The API suggests that rte_cryptodev_info_get() cannot return
> > a value >= 3 (RTE_CRYPTO_AEAD_LIST_END in 19.11).
> > Current 20.02 returns 3 (RTE_CRYPTO_AEAD_CHACHA20_POLY1305).
> > The ABI compatibility contract is broken currently.
> >
> > There are 3 possible outcomes:
> >
> > a) Change the API comments and backport to 19.11.1
> > The details are discussed between Ferruh and me.
> > Either put responsibility on API user (with explicit comment),
> > or expose ABI extension allowance with a new API max value.
> > In both cases, this is breaking the implicit contract of 19.11.0.
> > This option can be chosen only if release and ABI maintainers
> > vote for it.
> >
> > b) Revert Chacha-Poly from 20.02-rc2.
> >
> > c) Add versioned function rte_cryptodev_info_get_v1911()
> > which calls rte_cryptodev_info_get() and filters out
> > RTE_CRYPTO_AEAD_CHACHA20_POLY1305 capability.
> > So Chacha-Poly capability would be seen and usable only
> > if compiling with DPDK 20.02.
> >
> > I hope the actions are clear for everybody:
> > - ABI and release maintainers must say yes or no to the proposal (a)
> > - In the meantime, crypto team must send a patch for the proposal (c)
> > - If (a) and (c) are not possible at the end of today, I will take (b)
> >
> > Note: do not say the time is too short for (c), as it has been possible
> > to work on such a solution since the issue was exposed last Wednesday.
> 
> While I'm not a maintainer, if my opinion counts for anything, I'd
> choose option c or b.  Absolutely NACK to a.
> 
Agreed, options c and b are reasonable, a isn't.  ABI commitments are ours, not
users'.

Neil
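
For readers of the archive, option (c) can be sketched roughly as follows.
This is a stand-alone model under stated assumptions, not the DPDK
implementation: the demo_ names are hypothetical, capabilities are reduced
to plain AEAD algo ids, and the symbol versioning that would bind the
wrapper to 19.11-built binaries is left out. The only facts taken from the
thread are the values (LIST_END is 3 in 19.11, CHACHA20_POLY1305 is 3 in
20.02) and the idea of filtering to the 19.11 range [0..AES_GCM].

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in: first AEAD id a 19.11-built binary does not know. */
#define DEMO_AEAD_LIST_END_1911 3

/*
 * Hypothetical model of the filtering a v1911 wrapper would perform:
 * copy the algo ids reported by the current library into dst, dropping
 * every id outside the 19.11 range. Returns the number of ids kept.
 * dst must have room for n entries.
 */
size_t demo_filter_caps_v1911(const int *src, size_t n, int *dst)
{
    size_t kept = 0;

    assert(src != NULL && dst != NULL);
    for (size_t i = 0; i < n; i++) {
        if (src[i] >= 0 && src[i] < DEMO_AEAD_LIST_END_1911)
            dst[kept++] = src[i];
    }
    return kept;
}
```

With this shape, a binary built against 19.11 never sees id 3, while a
binary built against 20.02 would call the unfiltered function.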

> >
> > 03/02/2020 22:07, Thomas Monjalon:
> >> 03/02/2020 19:55, Ray Kinsella:
> >> > On 03/02/2020 17:34, Thomas Monjalon wrote:
> >> > > 03/02/2020 18:09, Thomas Monjalon:
> >> > >> 03/02/2020 10:30, Ferruh Yigit:
> >> > >>> On 2/2/2020 2:41 PM, Ananyev, Konstantin wrote:
> >> > >>>> 02/02/2020 14:05, Thomas Monjalon:
> >> > >>>>> 31/01/2020 15:16, Trahe, Fiona:
> >> > >>>>>> On 1/30/2020 8:18 PM, Thomas Monjalon wrote:
> >> > >>>>>>> If the library gives a higher value than expected by the application,
> >> > >>>>>>> and the application uses this value as an array index,
> >> > >>>>>>> there can be an out-of-bounds access.
> >> > >>>>>>
> >> > >>>>>> [Fiona] All asymmetric APIs are experimental so above shouldn't be a problem.
> >> > >>>>>> But for the same issue with sym crypto below, I believe Ferruh's explanation makes
> >> > >>>>>> sense and I don't see how there can be an API breakage.
> >> > >>>>>> So if an application hasn't compiled against the new lib it
> >> > >>>>>> will be still using the old value
> >> > >>>>>> which will be within bounds. If it's picking up the higher
> >> > >>>>>> new value from the lib it must
> >> > >>>>>> have been compiled against the lib so shouldn't have problems.
> >> > >>>>>
> >> > >>>>> You say there is no ABI issue because the application will be re-compiled
> >> > >>>>> for the updated library. Indeed, compilation fixes compatibility issues.
> >> > >>>>> But this is not relevant for ABI compatibility.
> >> > >>>>> ABI compatibility means we can upgrade the library without recompiling
> >> > >>>>> the application and it must work.
> >> > >>>>> You think it is a false positive because you assume the application
> >> > >>>>> "picks" the new value. I think you miss the case where the new value
> >> > >>>>> is returned by a function in the upgraded library.
> >> > >>>>>
> >> > >>>>>> There are also no structs on the API which contain arrays using this
> >> > >>>>>> for sizing, so I don't see an opportunity for an appl to have a
> >> > >>>>>> mismatch in memory addresses.
> >> > >>>>>
> >> > >>>>> Let me demonstrate where the API may "use" the new value
> >> > >>>>> RTE_CRYPTO_AEAD_CHACHA20_POLY1305 and how it impacts the application.
> >> > >>>>>
> >> > >>>>> Once upon a time, a DPDK application was counting the number of devices
> >> > >>>>> supporting each AEAD algo (in order to find the best supported algo).
> >> > >>>>> It is done in an array indexed by algo id:
> >> > >>>>> int aead_dev_count[RTE_CRYPTO_AEAD_LIST_END];
> >> > >>>>> The application is compiled with DPDK 19.11,
> >> > >>>>> where RTE_CRYPTO_AEAD_LIST_END = 3.
> >> > >>>>> So the size of the application array aead_dev_count is 3.
> >> > >>>>> This binary is run with DPDK 20.02,
> >> > >>>>> where RTE_CRYPTO_AEAD_CHACHA20_POLY1305 = 3.
> >> > >>>>> When calling rte_cryptodev_info_get() on a device QAT_GEN3,
> >> > >>>>> rte_cryptodev_info.capabilities.sym.aead.algo is set to
> >> > >>>>> RTE_CRYPTO_AEAD_CHACHA20_POLY1305 (= 3).
> >> > >>>>> The application uses this value:
> >> > >>>>> ++ aead_dev_count[info.capabilities.sym.aead.algo];
> >> > >>>>> The application crashes because of an out-of-bounds access.
> >> > >>>>
> >> > >>>> I'd say this is an example of a badly written app.
> >> > >>>> It probably should check that the value returned by the library
> >> > >>>> doesn't exceed its internal array size.
> >> > >>>
> >> > >>> +1
> >> > >>>
> >> > >>> Application should ignore values >= MAX.
> >> > >>
> >> > >> Of course, blaming the API user is a lot easier than looking at the API.
> >> > >> Here the API has RTE_CRYPTO_AEAD_LIST_END which can be understood
> >> > >> as the max value for the application.
> >> > >> Value ranges are part of the ABI compatibility contract.
> >> > >> It seems you expect the application developer to be aware that
> >> > >> DPDK could return a higher value, so the application should
> >> > >> check every enum value after calling an API. CRAZY.
> >> > >>
> >> > >> When we decide to announce an ABI compatibility and do some marketing,
> >> > >> everyone is OK. But when we need to really make our ABI compatible,
> >> > >> I see little or no effort. DISAPPOINTING.
> >> > >>
> >> > >>> Do you suggest we don't extend any enum or define between ABI breakage releases
> >> > >>> to be sure badly written applications are not affected?
> >> > >>
> >> > >> I suggest we must consider not breaking any assumption made on the API.
> >> > >> Here we are breaking the enum range because nothing mentions _LIST_END
> >> > >> is not really the absolute end of the enum.
> >> > >> The solution is to make the change below in 20.02 + backport in 19.11.1:
> >> > > 
> >> > > Thinking twice, merging such a change before 20.11 is breaking the
> >> > > ABI assumption based on the API 19.11.0.
> >> > > I ask the release maintainers (Luca, Kevin, David and me) and
> >> > > the ABI maintainers (Neil and Ray) to vote for a or b solution:
> >> > > 	a) add comment and LIST_MAX as below in 20.02 + 19.11.1
> >> > 
> >> > That would still be an ABI breakage though, right?
> >> > 
> >> > > 	b) wait 20.11 and revert Chacha-Poly from 20.02
> >> > 
> >> > Thanks for the analysis above Fiona, Ferruh and all.
> >> > 
> >> > That is a nasty one alright - there is no "good" answer here.
> >> > I agree with Ferruh's sentiments overall, we should rethink this API for 20.11. 
> >> > Could we do without an enumeration?
> >> > 
> >> > There is a c) though, right?
> >> > We could work around the issue by API versioning rte_cryptodev_info_get() and friends.
> >> > So they only support/acknowledge the existence of Chacha-Poly for
> >> > applications built against >= 20.02.
> >> 
> >> I agree there is a c) as I proposed in another email:
> >> http://mails.dpdk.org/archives/dev/2020-February/156919.html
> >> "
> >> In this case, the proper solution is to implement
> >> rte_cryptodev_info_get_v1911() so it filters out
> >> RTE_CRYPTO_AEAD_CHACHA20_POLY1305 capability.
> >> With this solution, an application compiled with DPDK 19.11 will keep
> >> seeing the same range as before, while a 20.02 application could
> >> see and use ChachaPoly.
> >> "
> >> 
> >> > It would be painful I know.
> >> 
> >> Not so painful in my opinion.
> >> Just need to call rte_cryptodev_info_get() from
> >> rte_cryptodev_info_get_v1911() and filter the value
> >> in the 19.11 range: [0..AES_GCM].
> >> 
> >> > It would also mean that Chacha-Poly would only be available to
> >> > those building against >= 20.02.
> >> 
> >> Yes exactly.
> >> 
> >> The addition of comments and LIST_MAX like below is still valid
> >> to avoid versioning after 20.11.
> >> 
> >> > >> - _LIST_END
> >> > >> + _LIST_END, /* an ABI-compatible version may increase this value */
> >> > >> + _LIST_MAX = _LIST_END + 42 /* room for ABI-compatible additions */
> >> > >> };
> >> > >>
> >> > >> Then *_LIST_END values could be ignored by libabigail with such a change.
> >> 
> >> In order to avoid ABI check complaining, the best is to completely
> >> remove LIST_END in DPDK 20.11.
> >> 
> >> 
> >> > >> If such a patch is not done by tomorrow, I will have to revert
> >> > >> Chacha-Poly commits before 20.02-rc2, because
> >> > >>
> >> > >> 1/ LIST_END, without any comment, means "size of range"
> >> > >> 2/ we do not blame users for undocumented ABI changes
> >> > >> 3/ we respect the ABI compatibility contract
> 
> 

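The out-of-bounds scenario quoted above, and the two mitigations raised in
the thread, can be modelled in a few lines. This is only a sketch: the app_
names are hypothetical and no DPDK header is used. APP_AEAD_LIST_END is the
value 3 that a 19.11-built application has compiled in, and
APP_AEAD_LIST_MAX mirrors the proposed "_LIST_END + 42" headroom.

```c
#include <assert.h>
#include <stddef.h>

#define APP_AEAD_LIST_END 3                        /* baked in at 19.11 build time */
#define APP_AEAD_LIST_MAX (APP_AEAD_LIST_END + 42) /* headroom proposed in the thread */

/*
 * Count one device for the given algo id.
 * The range check is the application-side guard suggested in the thread:
 * an id outside the table is rejected (-1) instead of being used as an index.
 */
int app_count_aead(int *counts, size_t table_size, int algo)
{
    assert(counts != NULL);
    if (algo < 0 || (size_t)algo >= table_size)
        return -1;
    counts[algo]++;
    return 0;
}
```

A table sized with APP_AEAD_LIST_END rejects id 3, which is the value a
20.02 library may return; without the check, the same call would write
past the end of the array. A table sized with APP_AEAD_LIST_MAX accepts
it, which is the point of reserving room for ABI-compatible additions.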
