[dpdk-dev] [PATCH v3 00/18] ACL: New AVX2 classify method and several other enhancements.
Thomas Monjalon
thomas.monjalon at 6wind.com
Wed Jan 28 17:14:33 CET 2015
> > v3 changes:
> > Applied review comments from Thomas:
> > - fix spelling errors reported by codespell.
> > - split last patch into two:
> > first to remove unused macros,
> > second to add some comments about ACL internal layout.
> >
> > v2 changes:
> > - When built with compilers that don't support AVX2 instructions,
> > make rte_acl_classify_avx2() do nothing and return an error.
> > - Remove unneeded 'ifdef __AVX2__' in acl_run_avx2.*.
> > - Reorder the patches in the set, to keep RTE_LIBRTE_ACL_STANDALONE=y
> > always buildable.
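
A minimal sketch of the "do nothing and return an error" behaviour described in the v2 notes, for readers who want to see the shape of it. This is not code from the series: every function name below is hypothetical, the `#ifdef __AVX2__` guard is only one possible way to provide the stub (the series may well use a different mechanism), and `-ENOTSUP` is an assumed error value.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical scalar path: the "result" is just the first input byte. */
static int
classify_scalar_sketch(const uint8_t **data, uint32_t *results, uint32_t num)
{
	assert(data != NULL && results != NULL);
	for (uint32_t i = 0; i < num; i++)
		results[i] = data[i][0];
	return 0;
}

/* Hypothetical AVX2 entry point (not the real rte_acl_classify_avx2). */
static int
classify_avx2_sketch(const uint8_t **data, uint32_t *results, uint32_t num)
{
#ifdef __AVX2__
	/* Placeholder: a real version would run 256-bit vector code here. */
	return classify_scalar_sketch(data, results, num);
#else
	/* Compiler has no AVX2 support: do nothing and report an error. */
	(void)data;
	(void)results;
	(void)num;
	return -ENOTSUP; /* assumed error code */
#endif
}

/* Caller tries the AVX2 path first and falls back to scalar on error. */
static int
classify_sketch(const uint8_t **data, uint32_t *results, uint32_t num)
{
	int rc = classify_avx2_sketch(data, results, num);

	if (rc == -ENOTSUP)
		rc = classify_scalar_sketch(data, results, num);
	return rc;
}
```

With a stub like this, callers keep linking against the same symbol on every toolchain and only see the difference at run time, through the return value.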
> >
> > This patch series contains several fixes and enhancements for the ACL library.
> > See the complete list below.
> > Two main changes that are externally visible:
> > - Introduce new classify method: RTE_ACL_CLASSIFY_AVX2.
> > It uses AVX2 instructions and 256 bit wide data types
> > to perform internal trie traversal.
> > That helps to increase classify() throughput.
> > This method is selected as the default on CPUs that support AVX2.
> > - Introduce new field in the build config structure: max_size.
> > It specifies the maximum size that the internal RT structure for a given
> > context can reach.
> > The purpose is to let the user decide on the space/performance trade-off
> > (faster classify() vs less space for RT internal structures)
> > for each given set of rules.
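
To make the max_size trade-off concrete, here is a small illustrative sketch. Only the field name `max_size` comes from the cover letter. The struct, the function names, the cost model (size grows quadratically with rules per trie), the idea of splitting rules across more tries, and the "0 means no limit" convention are all assumptions made up for this example, not the library's actual build algorithm.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified build config; only max_size is from the text. */
struct build_cfg_sketch {
	size_t max_size; /* upper bound on RT structure size; 0 = no limit (assumed) */
};

/*
 * Made-up cost model: splitting the rules across more tries shrinks the
 * total size, at the price of one extra traversal per additional trie.
 */
static size_t
estimate_rt_size(uint32_t num_rules, uint32_t num_tries)
{
	uint32_t per_trie = (num_rules + num_tries - 1) / num_tries;

	return (size_t)num_tries * per_trie * per_trie * 64;
}

/*
 * Pick the smallest number of tries (fastest classify in this model)
 * whose estimated size fits under max_size. Returns 0 if nothing fits.
 */
static uint32_t
pick_num_tries(const struct build_cfg_sketch *cfg, uint32_t num_rules,
	uint32_t max_tries)
{
	for (uint32_t n = 1; n <= max_tries; n *= 2) {
		if (cfg->max_size == 0 ||
				estimate_rt_size(num_rules, n) <= cfg->max_size)
			return n;
	}
	return 0;
}
```

In this toy model, 1000 rules with no limit use a single trie, while a 20,000,000-byte cap pushes the choice to four tries: less memory, more traversals per lookup.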
> >
> > Konstantin Ananyev (18):
> > fix compilation issues with RTE_LIBRTE_ACL_STANDALONE=y
> > app/test: few small fixes for test_acl.c
> > librte_acl: make data_indexes long enough to survive idle transitions.
> > librte_acl: remove build phase heuristic with negative performance
> > effect.
> > librte_acl: fix a bug at build phase that can cause matches being
> > overwritten.
> > librte_acl: introduce DFA nodes compression (group64) for identical
> > entries.
> > librte_acl: build/gen phase - simplify the way match nodes are
> > allocated.
> > librte_acl: make scalar RT code to be more similar to vector one.
> > librte_acl: a bit of RT code deduplication.
> > EAL: introduce rte_ymm and relatives in rte_common_vect.h.
> > librte_acl: add AVX2 as new rte_acl_classify() method
> > test-acl: add ability to manually select RT method.
> > librte_acl: Remove search_sse_2 and relatives.
> > librte_acl: move lo/hi dwords shuffle out from calc_addr
> > librte_acl: make calc_addr a define to deduplicate the code.
> > librte_acl: introduce max_size into rte_acl_config.
> > librte_acl: remove unused macros.
> > librte_acl: add some comments about ACL internal layout.
> >
> For the series
> Acked-by: Neil Horman <nhorman at tuxdriver.com>
Applied
Thanks for the big work
--
Thomas