[dpdk-dev] [PATCHv2] librte_acl make it build/work for 'default' target
nhorman at tuxdriver.com
Fri Aug 8 16:30:07 CEST 2014
On Fri, Aug 08, 2014 at 01:09:34PM +0000, Ananyev, Konstantin wrote:
> > From: Neil Horman [mailto:nhorman at tuxdriver.com]
> > Sent: Friday, August 08, 2014 1:25 PM
> > To: Ananyev, Konstantin
> > Cc: dev at dpdk.org
> > Subject: Re: [dpdk-dev] [PATCHv2] librte_acl make it build/work for 'default' target
> > On Fri, Aug 08, 2014 at 11:49:58AM +0000, Ananyev, Konstantin wrote:
> > > > From: Neil Horman [mailto:nhorman at tuxdriver.com]
> > > > Sent: Thursday, August 07, 2014 9:12 PM
> > > > To: Ananyev, Konstantin
> > > > Cc: dev at dpdk.org
> > > > Subject: Re: [dpdk-dev] [PATCHv2] librte_acl make it build/work for 'default' target
> > > >
> > > > On Thu, Aug 07, 2014 at 07:31:03PM +0100, Konstantin Ananyev wrote:
> > > > > Make ACL library to build/work on 'default' architecture:
> > > > > - make rte_acl_classify_scalar really scalar
> > > > > (make sure it wouldn't use sse4 instrincts through resolve_priority()).
> > > > > - Provide two versions of rte_acl_classify code path:
> > > > > rte_acl_classify_sse() - could be build and used only on systems with sse4.2
> > > > > and upper, return -ENOTSUP on lower arch.
> > > > > rte_acl_classify_scalar() - a slower version, but could be build and used
> > > > > on all systems.
> > > > > - keep common code shared between these two codepaths.
> > > > >
> > > > > v2 chages:
> > > > > run-time selection of most appropriate code-path for given ISA.
> > > > > By default the highest supprted one is selected.
> > > > > User can still override that selection by manually assigning new value to
> > > > > the global function pointer rte_acl_default_classify.
> > > > > rte_acl_classify() becomes a macro calling whatever rte_acl_default_classify
> > > > > points to.
> > > > >
> > > > >
> > > > > Signed-off-by: Konstantin Ananyev <konstantin.ananyev at intel.com>
> > > >
> > > > This is alot better thank you. A few remaining issues.
> > >
> > > My comments inline too.
> > > Thanks
> > > Konstantin
> > >
> > > >
> > > > > ---
> > > > > app/test-acl/main.c | 13 +-
> > > > > lib/librte_acl/Makefile | 5 +-
> > > > > lib/librte_acl/acl_bld.c | 5 +-
> > > > > lib/librte_acl/acl_match_check.def | 92 ++++
> > > > > lib/librte_acl/acl_run.c | 944 -------------------------------------
> > > > > lib/librte_acl/acl_run.h | 220 +++++++++
> > > > > lib/librte_acl/acl_run_scalar.c | 197 ++++++++
> > > > > lib/librte_acl/acl_run_sse.c | 630 +++++++++++++++++++++++++
> > > > > lib/librte_acl/rte_acl.c | 15 +
> > > > > lib/librte_acl/rte_acl.h | 24 +-
> > > > > 10 files changed, 1189 insertions(+), 956 deletions(-)
> > > > > create mode 100644 lib/librte_acl/acl_match_check.def
> > > > > delete mode 100644 lib/librte_acl/acl_run.c
> > > > > create mode 100644 lib/librte_acl/acl_run.h
> > > > > create mode 100644 lib/librte_acl/acl_run_scalar.c
> > > > > create mode 100644 lib/librte_acl/acl_run_sse.c
> > > > >
> > > > > diff --git a/app/test-acl/main.c b/app/test-acl/main.c
> > > > > index d654409..45c6fa6 100644
> > > > > --- a/app/test-acl/main.c
> > > > > +++ b/app/test-acl/main.c
> > > > > @@ -787,6 +787,10 @@ acx_init(void)
> > > > > /* perform build. */
> > > > > ret = rte_acl_build(config.acx, &cfg);
> > > > >
> > > > > + /* setup default rte_acl_classify */
> > > > > + if (config.scalar)
> > > > > + rte_acl_default_classify = rte_acl_classify_scalar;
> > > > > +
> > > > Exporting this variable as part of the ABI is a bad idea. If the prototype of
> > > > the function changes you have to update all your applications.
> > >
> > > If the prototype of rte_acl_classify will change, most likely you'll have to update code that uses it anyway.
> > >
> > Why? If you hide this from the application, changes to the internal
> > implementation will also be invisible. When building as a DSO, an application
> > will be able to transition between libraries without the need for a rebuild.
> Because rte_acl_classify() is part of the ACL API that users use.
> If we'll add/modify its parameters and/or return value - users would have to change their apps anyway.
Thats not at all true. With API versioning scripts you can make several
versions of the same function with different prototypes as future needs dictate.
Hiding the internal implementation just makes that easier.
> > > > Make the pointer
> > > > an internal symbol and set it using a get/set routine with an enum to represent
> > > > the path to choose. That will help isolate the ABI from the internal
> > > > implementation.
> > >
> > > That's was my first intention too.
> > > But then I realised that if we'll make it internal, then we'll need to make rte_acl_classify() a proper function
> > > and it will cost us extra call (or jump).
> > Thats true, but I don't see that as a problem. We're not talking about a hot
> > code path here, its a setup function.
> I am talking not about rte_acl_select_classify() but about rte_acl_classify() itself (not code path).
> If I'll make rte_acl_default_classify statitc, the rte_acl_classiy() would need to become a real function and it'll be something like that:
> ->call rte_acl_acl_classify
> ---> load rte_acl_calssify_default value into the reg
> ---> jmp (*reg)
Ah, yes, the actual classification path, you will need an extra call
instruction there. I would say if thats the case, then you should either make
rte_acl_classify a macro or real function based on weather your building as a
shared library or a static library.
> > Or do you think that an application will
> > be switching between classification functions on every classify operation?
> God no.
> > > Also I think user should have an ability to change default classify code path without modifying/rebuilding acl library.
> > I agree, but both the methods we are advocating for allow that. Its really just
> > a question of exposing the mechanism as data or text in the binary. Exposing it
> > as data comes with implicit ABI constraints that are less prevalanet when done
> > as code entry points.
> > > For example: a bug in an optimised code path is discovered, or user may want to implement and use his own version of classify().
> > In the case of a bug in the optimized path, you just fix the bug.
> It is not about me. It is about a user who get librte_acl as part of binary distribution.
Yes, those are my users :)
> Of course, he probably will report about it and we probably fix it sooner or later.
> But with such ability he can switch to the safe implementation immediately
> without touching the library and then wait for the fix.
Thats not how users of a binary pacakge from a distribution operate. If their
using a binary package they either:
1) Don't want to rebuild anything themselves, in which case they file the bug,
and wait for the developers to fix the issue.
2) Have a staff to help them work around the issue, which will be done by
rebuilding/fixing the library, not the application.
With (2), what I am saying is that, if a 3rd party finds a bug in the classifier
code within dpdk which is built as a shared library within a distribution, and
they need it fixed immediately, they have a choice of what to do, they can
either (a), write a custom classifier function and point the dpdk library to it,
or (b), just fix the bug in the library directly. Given that, if they can
accomplish (a), they by all rights can also accompilsh (b), the only decision
they need to make is one which makes the most sense for them. The answer is
(b), because thats where the functionality lives. i.e. when the fix occurs
upstream and a new release gets issued, you can go back to using the library
maintained version, and you don't have to clean up what has become vestigial
> > If you want
> > to provide your own classification function, thats fine I suppose, but that
> > seems completely outside the scope of what we're trying to do here. Its not
> > adventageous to just throw that in there. If you want to be able to provide
> > your own classifier function, lets at least take some time to make sure that the
> > function prototype is sufficiently capable to accept all the data you might want
> > to pass it in the future, before we go exposing it. Otherwise you'll have to
> > break the ABI in future versions, whcih is something we've been discussing
> > trying to avoid.
> rte_acl_classify() it is already exposed (PART of API), same as rte_acl_classify_scalar().
> If in future, we'll change these functions prototypes will break ABI anyway.
Well, at the moment, thats fine because you don't make any ABI promises anyway,
I've been working to change that, so distributions can have greater dpdk
> > > > It will also let you prevent things like selecting a run time
> > > > path that is incompatible with the running system
> > >
> > > If the user going to update rte_acl_default_classify he is probably smart enough to know what he is doing.
> > That really seems like poor design to me. I don't see why you wouldn't at least
> > want to warn the developer of an application if they were at run time to assign
> > a default classifier method that was incompatible with a running system. Yes,
> > they're likely smart enough to know what their doing, but smart people make
> > mistakes, and appreciate being told when they're doing so, especially if the
> > method of telling is something a bit more civil than a machine check that
> > might occur well after the application has been initilized.
> I have no problem providing rte_acl_check_classify(flags_required, classify_ptr) that would do checking and emit the warning.
> Though as I said above, I'll prefer not to hide rte_acl_default_classify it will cause extra overhead for rte_acl_classify().
> > > From other hand - user can hit same problem by simply calling rte_acl_classify_sse() directly.
> > Not if the function is statically declared and not exposed to the application
> > they cant :)
> I don't really want to hide rte_acl_classify_sse/rte_acl_classify_scalar().
> Should be available directly I think.
> In future we might introduce new versions for more sophisticated ISAs (rte_acl_classify_avx() or something).
> Users should have an ability to downgrade their classify() function if they like.
What in your mind is the reasoning behind being able to do so? What is
adventageous about that? Asside possibly from debugging that is (for which I
can see a use). But in normal production operation, why would you choose to not
use the sse classifier over the scalar classifier?
More information about the dev