[dpdk-dev] [PATCH v2 4/4] add ABI checks

Thomas Monjalon thomas at monjalon.net
Tue Feb 4 10:45:31 CET 2020


04/02/2020 10:19, Ferruh Yigit:
> On 2/3/2020 6:40 PM, Thomas Monjalon wrote:
> > 03/02/2020 18:40, Ferruh Yigit:
> >> On 2/3/2020 5:09 PM, Thomas Monjalon wrote:
> >>> 03/02/2020 10:30, Ferruh Yigit:
> >>>> On 2/2/2020 2:41 PM, Ananyev, Konstantin wrote:
> >>>>> 02/02/2020 14:05, Thomas Monjalon:
> >>>>>> 31/01/2020 15:16, Trahe, Fiona:
> >>>>>>> On 1/30/2020 8:18 PM, Thomas Monjalon wrote:
> >>>>>>>> 30/01/2020 17:09, Ferruh Yigit:
> >>>>>>>>> On 1/29/2020 8:13 PM, Akhil Goyal wrote:
> >>>>>>>>>>
> >>>>>>>>>> I believe these enums will be used only in case of ASYM case which is experimental.
> >>>>>>>>>
> >>>>>>>>> Independent from being experiment and not, this shouldn't be a problem, I think
> >>>>>>>>> this is a false positive.
> >>>>>>>>>
> >>>>>>>>> The ABI break can happen when a struct has been shared between the application
> >>>>>>>>> and the library (DPDK) and the layout of that memory know differently by
> >>>>>>>>> application and the library.
> >>>>>>>>>
> >>>>>>>>> Here in all cases, there is no layout/size change.
> >>>>>>>>>
> >>>>>>>>> As to the value changes of the enums, since application compiled with old DPDK,
> >>>>>>>>> it will know only up to '6', 7 and more means invalid to the application. So it
> >>>>>>>>> won't send these values also it should ignore these values from library. Only
> >>>>>>>>> consequence is old application won't able to use new features those new enums
> >>>>>>>>> provide but that is expected/normal.
> >>>>>>>>
> >>>>>>>> If library give higher value than expected by the application,
> >>>>>>>> if the application uses this value as array index,
> >>>>>>>> there can be an access out of bounds.
> >>>>>>>
> >>>>>>> [Fiona] All asymmetric APIs are experimental so above shouldn't be a problem.
> >>>>>>> But for the same issue with sym crypto below, I believe Ferruh's explanation makes
> >>>>>>> sense and I don't see how there can be an API breakage.
> >>>>>>> So if an application hasn't compiled against the new lib it will be still using the old value
> >>>>>>> which will be within bounds. If it's picking up the higher new value from the lib it must
> >>>>>>> have been compiled against the lib so shouldn't have problems.
> >>>>>>
> >>>>>> You say there is no ABI issue because the application will be re-compiled
> >>>>>> for the updated library. Indeed, compilation fixes compatibility issues.
> >>>>>> But this is not relevant for ABI compatibility.
> >>>>>> ABI compatibility means we can upgrade the library without recompiling
> >>>>>> the application and it must work.
> >>>>>> You think it is a false positive because you assume the application
> >>>>>> "picks" the new value. I think you miss the case where the new value
> >>>>>> is returned by a function in the upgraded library.
> >>>>>>
> >>>>>>> There are also no structs on the API which contain arrays using this
> >>>>>>> for sizing, so I don't see an opportunity for an appl to have a
> >>>>>>> mismatch in memory addresses.
> >>>>>>
> >>>>>> Let me demonstrate where the API may "use" the new value
> >>>>>> RTE_CRYPTO_AEAD_CHACHA20_POLY1305 and how it impacts the application.
> >>>>>>
> >>>>>> Once upon a time a DPDK application counting the number of devices
> >>>>>> supporting each AEAD algo (in order to find the best supported algo).
> >>>>>> It is done in an array indexed by algo id:
> >>>>>> int aead_dev_count[RTE_CRYPTO_AEAD_LIST_END];
> >>>>>> The application is compiled with DPDK 19.11,
> >>>>>> where RTE_CRYPTO_AEAD_LIST_END = 3.
> >>>>>> So the size of the application array aead_dev_count is 3.
> >>>>>> This binary is run with DPDK 20.02,
> >>>>>> where RTE_CRYPTO_AEAD_CHACHA20_POLY1305 = 3.
> >>>>>> When calling rte_cryptodev_info_get() on a device QAT_GEN3,
> >>>>>> rte_cryptodev_info.capabilities.sym.aead.algo is set to
> >>>>>> RTE_CRYPTO_AEAD_CHACHA20_POLY1305 (= 3).
> >>>>>> The application uses this value:
> >>>>>> ++ aead_dev_count[info.capabilities.sym.aead.algo];
> >>>>>> The application is crashing because of out of bound access.
> >>>>>
> >>>>> I'd say this is an example of bad written app.
> >>>>> It probably should check that returned by library value doesn't
> >>>>> exceed its internal array size.
> >>>>
> >>>> +1
> >>>>
> >>>> Application should ignore values >= MAX.
> >>>
> >>> Of course, blaming the API user is a lot easier than looking at the API.
> >>> Here the API has RTE_CRYPTO_AEAD_LIST_END which can be understood
> >>> as the max value for the application.
> >>> Value ranges are part of the ABI compatibility contract.
> >>> It seems you expect the application developer to be aware that
> >>> DPDK could return a higher value, so the application should
> >>> check every enum values after calling an API. CRAZY.
> >>>
> >>> When we decide to announce an ABI compatibility and do some marketing,
> >>> everyone is OK. But when we need to really make our ABI compatible,
> >>> I see little or no effort. DISAPPOINTING.
> >>
> >> This is not to blame the user or to do less work, this is more sane approach
> >> that library provides the _END/_MAX value and application uses it as valid range
> >> check.
> >>
> >>>> Do you suggest we don't extend any enum or define between ABI breakage releases
> >>>> to be sure bad written applications not affected?
> >>>
> >>> I suggest we must consider not breaking any assumption made on the API.
> >>> Here we are breaking the enum range because nothing mentions _LIST_END
> >>> is not really the absolute end of the enum.
> >>> The solution is to make the change below in 20.02 + backport in 19.11.1:
> >>>
> >>> - _LIST_END
> >>> + _LIST_END, /* an ABI-compatible version may increase this value */
> >>> + _LIST_MAX = _LIST_END + 42 /* room for ABI-compatible additions */
> >>> };
> >>>
> >>
> >> What is the point of "_LIST_MAX" here?
> > 
> > _LIST_MAX is range of value that DPDK can return in the ABI contract.
> > So the appplication can rely on the range 0.._LIST_MAX.
> > 
> >> Application should know the "_LIST_END" of when it has been compiled for the
> >> valid range check. Next time it is compiled "_LIST_END" may be different value
> >> but same logic applies.
> > 
> > No, ABI compatibility contract means you can compile your application
> > with DPDK 19.11.0 and run it with DPDK 20.02.
> > So _LIST_END comes from 19.11 and does not include ChachaPoly.
> 
> That is what I mean, let me try to give a sample.
> 
> DPDK19.11 returns, A=1, B=2, END=3
> 
> Application compiled with DPDK19.11, it will process A, B and ignore anything ">= 3"

No, the application will not ignore anything ">=3" as I explained above,
and you blamed the application for it.
Nothing in the API says the application must filter value higher than 3,
because as of now, values higher than 3 are PMD bug.


> DPDK20.02 returns A=1, B=2, C=3, D=4, END=5
> 
> Old application will still only will know/use A, B and can ignore when library
> sends C=3, D=4 etc...
> 
> 
> In above, if you add another limit as you suggested, like MAX=10 and ask
> application to use it,
> 
> Application compiled with DPDK19.11 will be OK since library only sends A,B and
> application uses them.
> 
> But with DPDK20.02 application may have problem, since library will be sending
> C=3, which is valid according to the check " <= MAX (10)", how application will
> know to ignore it.

Why application should ignore value C=3 with DPDK 20.02?


> So application should use _END to know the valid ones according it, if so what
> is the point of having _MAX.
> 
> 
> >> When "_LIST_END" is missing, application can't protect itself, in that case
> >> library should send only the values application knows when it is compiled, this
> >> means either we can't extend our enum/defines until next ABI breakage, or we
> >> need to do ABI versioning to the functions that returns an enum each time enum
> >> value extended.
> > 
> > If we define _LIST_MAX as a bigger value than current _LIST_END,
> > we have some room to add values in between.
> > 
> > If (as of now) we don't have _LIST_MAX room, then yes we must version
> > the functions returning the enum.
> > In this case, the proper solution is to implement
> > rte_cryptodev_info_get_v1911() so it filters out
> > RTE_CRYPTO_AEAD_CHACHA20_POLY1305 capability.
> > With this solution, an application compiled with DPDK 19.11 will keep
> > seeing the same range as before, while a 20.02 application could
> > see and use ChachaPoly.
> > This is another proposal that I was expecting from the crypto team,
> > instead of claiming there is no issue (and wasting precious time).
> > 
> > 
> >> I believe it is saner to provide _END/_MAX values to the application to use. And
> >> if required comment them to clarify the expected usage.
> >>
> >> But in above suggestion application can't use or rely on "_LIST_MAX", it doesn't
> >> mean anything to application.
> > 
> > I don't understand what you mean. I think you misunderstood what is ABI compat.
> > 
> > 
> >>> Then *_LIST_END values could be ignored by libabigail with such a change.
> >>>
> >>> If such a patch is not done by tomorrow, I will have to revert
> >>> Chacha-Poly commits before 20.02-rc2, because
> >>>
> >>> 1/ LIST_END, without any comment, means "size of range"
> >>> 2/ we do not blame users for undocumented ABI changes
> >>> 3/ we respect the ABI compatibility contract





More information about the dev mailing list