[dpdk-dev] Questions about API with no parameter check
Tyler Retzlaff
roretzla at linux.microsoft.com
Fri Apr 30 02:15:31 CEST 2021
On Thu, Apr 29, 2021 at 09:49:24PM +0300, Dmitry Kozlyuk wrote:
> 2021-04-29 09:16 (UTC-0700), Tyler Retzlaff:
> > On Wed, Apr 07, 2021 at 05:10:00PM +0100, Ferruh Yigit wrote:
> > > On 4/7/2021 4:25 PM, Hemant Agrawal wrote:
> > > >>+1
> > > >>But are we going to check all parameters?
> > > >
> > > >+1
> > > >
> > > >It may be better to limit the number of checks.
> > > >
> > >
> > > +1 to verify input for APIs.
> > >
> > > Why not do all, what is the downside of checking all input for control path APIs?
> >
> > why not assert them then, what is the purpose of returning an error to a
> > caller for a api contract violation like a `parameter shall not be NULL`
> >
> > * assert.h/cassert can be compiled away for those pundits who don't want
> > to see extra branches in their code
> >
> > * when not compiled away it gives you an immediate stack trace or dump to operate
> > on immediately identifying the problem instead of having to troll
> > through hokey, inconsistently formatted logging.
> >
> > * it catches callers who don't bother to check for error from return of
> > the function (debug builds) instead of some arbitrary failure at some
> > unrelated part of the code where the corrupted program state is relied
> > upon.
> >
> > we aren't running in kernel, we can crash.
>
> As library developers we can't assume stability requirements at call site.
> There may be temporary files to clean up, for example,
> or other threads in the middle of their work.
if a caller's state is so incoherent that it is passing NULL to functions
that contractually expect non-NULL it is already way past the point of
no return. continuing to run only accomplishes destroying the state that
might be used to diagnose the originating flaw in program logic.
if you return an error instead of failing fast, at best you'll crash soon, but
more often than not you'll keep running and produce incorrect results or, worse,
keep running security compromised.
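to make the "keep running and produce incorrect results" case concrete, here is a minimal sketch. the names `demo_stats`, `demo_stats_get` and `demo_report` are hypothetical and are not DPDK APIs; the point is only that a caller who drops the error return carries on with data that was never filled in.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical types and functions, for illustration only. */
struct demo_stats {
	unsigned long packets;
};

/* Error-return style: a contract violation is reported as -EINVAL. */
static int
demo_stats_get(const struct demo_stats *src, struct demo_stats *out)
{
	if (src == NULL || out == NULL)
		return -EINVAL;
	*out = *src;
	return 0;
}

/* A caller that does not check the return value. */
static unsigned long
demo_report(const struct demo_stats *src)
{
	struct demo_stats snap = { 0 };

	(void)demo_stats_get(src, &snap); /* error silently dropped */
	return snap.packets;              /* zero reported as if it were real data */
}
```

with a NULL `src` nothing crashes and nothing is logged by the caller; `demo_report` just returns 0 as though it were a valid count. an assert in `demo_stats_get` would instead stop at the call that violated the contract (in builds where asserts are enabled).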
about the only valid argument for having this silly error pattern is
when many-party code is running inside the same process and you don't
want someone else's bad code taking your process down. a problem that i
am acutely aware of from allowing 3rd party code to run in kernel space.
(but this is mostly? mitigated by multi-process mode).
> As an application developer I'd hate to get a crash inside a library and
> have to debug it. Usually installed are release versions with assertions
> compiled away.
>
so in a release build it wouldn't crash at all, at least not at the point
of failure. the only difference, i guess, is that you wouldn't get the log
message you get with what is being done now.
could we turn this around and have it tunable by policy instead of
opting everyone in to this behavior maybe? i'm just making some ideas up on
the fly but couldn't we just have something that is compile time policy?
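#include <assert.h>

#ifdef EAL_FAILURE_POLICY_RETURN
/* policy: report the contract violation to the caller */
#define EAL_FAILURE(condition, error) \
	do { \
		if ((condition)) { \
			return (error); \
		} \
	} while (0)
#else
/* policy: fail fast; assert() takes a single expression, so error is unused */
#define EAL_FAILURE(condition, error) \
	assert(!(condition))
#endif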
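a sketch of how such a macro might be used at the top of a control path function. everything here is an assumption for illustration: `EAL_FAILURE_POLICY_RETURN` is not an existing DPDK build option, and `demo_port` / `demo_port_set_mtu` are made-up names. the macro is repeated in corrected form (single-argument assert, do/while wrapper) so the block compiles on its own.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical compile-time policy switch; normally set by the build. */
#define EAL_FAILURE_POLICY_RETURN

#ifdef EAL_FAILURE_POLICY_RETURN
#define EAL_FAILURE(condition, error) \
	do { \
		if (condition) \
			return (error); \
	} while (0)
#else
#define EAL_FAILURE(condition, error) \
	assert(!(condition))
#endif

/* Hypothetical control-path object and API, for illustration only. */
struct demo_port {
	int mtu;
};

static int
demo_port_set_mtu(struct demo_port *port, int mtu)
{
	/* Contract checks: same source line under either policy. */
	EAL_FAILURE(port == NULL, -EINVAL);
	EAL_FAILURE(mtu <= 0, -EINVAL);

	port->mtu = mtu;
	return 0;
}
```

with the policy macro defined, `demo_port_set_mtu(NULL, 1500)` returns `-EINVAL`; with it undefined (and NDEBUG not set) the same call aborts at the assert, and with NDEBUG the check disappears entirely.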
also, i'll point out that lately there have been a lot of patches
accepted that call functions and don't evaluate their return value and
the reason is those functions really should never have been "failable".
so we'll just see more of that as we stack on failure returns for what are
often compile time or immediate runtime failures. of course the compatibility of the
code calling these functions is only as good as the implicit dependency
on the implementation... until it changes and the application
misbehaves.
i'll also throw another gripe in here that there are a lot of
"deallocation" functions in dpdk that according to their api can fail
again because of this kind of "oh i'll fail because i got a bad
parameter" design.
deallocation should never fail, ever, and i shouldn't need to write logic
around a deallocation to handle failures. imagine if free failed?

    p = malloc(...);
    if (p == NULL)
        return -1;
    ... do work with p ...
    rv = free(p);
    if (rv != 0) ... what the hell?

yet this pattern exists in a bunch of places. it's insane. (i'll quietly
ignore the design error that free accepting NULL as a noop is
standardized *facepalm*).
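for contrast, a minimal sketch of a create/free pair where only creation can fail and the free function returns void. `demo_ring`, `demo_ring_create` and `demo_ring_free` are hypothetical names, not DPDK APIs, and treating a NULL argument as a contract violation (asserted) rather than a noop is an assumption that follows the position argued above.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical resource type, for illustration only. */
struct demo_ring {
	size_t size;
	int *slots;
};

/* Creation can legitimately fail, so it reports failure with NULL. */
static struct demo_ring *
demo_ring_create(size_t size)
{
	struct demo_ring *r;

	if (size == 0)
		return NULL;
	r = malloc(sizeof(*r));
	if (r == NULL)
		return NULL;
	r->slots = calloc(size, sizeof(*r->slots));
	if (r->slots == NULL) {
		free(r);
		return NULL;
	}
	r->size = size;
	return r;
}

/* Returns void: there is no failure for the caller to handle. */
static void
demo_ring_free(struct demo_ring *r)
{
	assert(r != NULL); /* contract: caller passes a ring it created */
	free(r->slots);
	free(r);
}
```

the caller checks the one call that can fail and then just calls `demo_ring_free(r)` with nothing to test afterwards.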
anyway, i guess i've ranted enough. there are some users who would
prefer not to have this but i admit there are an overwhelming number of
people who seem to want it.