[dpdk-dev] [PATCH v3 17/18] net: add checks for max SIMD bitwidth

Power, Ciara ciara.power at intel.com
Thu Oct 1 16:19:37 CEST 2020


Hi David,

Thanks for reviewing, 

>-----Original Message-----
>From: Coyle, David <david.coyle at intel.com>
>Sent: Thursday 1 October 2020 15:17
>To: Singh, Jasvinder <jasvinder.singh at intel.com>; Power, Ciara
><ciara.power at intel.com>; dev at dpdk.org
>Cc: Power, Ciara <ciara.power at intel.com>; Olivier Matz
><olivier.matz at 6wind.com>; O'loingsigh, Mairtin
><mairtin.oloingsigh at intel.com>; Ryan, Brendan <brendan.ryan at intel.com>;
>Richardson, Bruce <bruce.richardson at intel.com>
>Subject: RE: [dpdk-dev] [PATCH v3 17/18] net: add checks for max SIMD
>bitwidth
>
>Hi Jasvinder/Ciara
>
>> -----Original Message-----
>> From: Singh, Jasvinder <jasvinder.singh at intel.com>
>> >
>> > > From: dev <dev-bounces at dpdk.org> On Behalf Of Ciara Power When
>> > > choosing a vector path to take, an extra condition must be
>> > > satisfied to ensure the max SIMD bitwidth allows for the CPU enabled
>path.
>> > >
>> > > The vector path was initially chosen in RTE_INIT, however this is
>> > > no longer suitable as we cannot check the max SIMD bitwidth at that
>time.
>> > > The default chosen in RTE_INIT is now scalar. For best performance
>> > > and to use vector paths, apps must explicitly call the set
>> > > algorithm function before using other functions from this library,
>> > > as this is where vector handlers are now chosen.
>> >
>> > [DC] Has it been decided that it is ok to now require applications
>> > to pick the CRC algorithm they want to use?
>> >
>> > An application which previously automatically got SSE4.2 CRC, for
>> > example, will now automatically only get scalar.
>> >
>> > If this is ok, this should probably be called out explicitly in
>> > release notes as it may not be Immediately noticeable to users that
>> > they now need to select the CRC algo.
>> >
>> > Actually, in general, the release notes need to be updated for this
>> patchset.
>>
>> The decision to move rte_set_alg() out of RTE_INIT was taken to avoid
>> check on max_simd_bitwidth in data path for every single time when
>> crc_calc() api is invoked. Based on my understanding,
>> max_simd_bitwidth is set after eal init, and when used in crc_calc(),
>> it might override the default crc algo set during RTE_INIT. Therefore,
>> to avoid extra check on max_simd_bitwidth in data path,  better option
>> will be to use this static configuration one time after eal init in the set_algo
>API.
>
>[DC] Yes that is a good change to have made to avoid extra datapath checks.
>
>Based on off-list discussion, I now also know the reason behind now
>defaulting to scalar CRC in RTE_INIT. If a higher bitwidth CRC was chosen by
>RTE_INIT (e.g.
>SSE4.2 CRC) but the max_simd_bitwidth was then set to RTE_NO_SIMD (64)
>through the EAL parameter or call to rte_set_max_simd_bitwidth(), then
>there is a mismatch if rte_net_crc_set_alg() is not then called to reconfigure
>the CRC. Defaulting to scalar avoids this mismatch and works on all archs
>
>As I mentioned before, I think this needs to be called out in release notes, as
>it's an under-the-hood change which could cause app performance to drop if
>app developers aren't aware of it - the API itself hasn't changed, so they may
>not read the doxygen :)
>

Yes that is a good point, I can add to the release notes for this to call it out. 

>>
>>
>> > >
>> > > Suggested-by: Jasvinder Singh <jasvinder.singh at intel.com>
>> > >
>> > > Signed-off-by: Ciara Power <ciara.power at intel.com>
>> > >
>> > > ---
>> > > v3:
>> > >   - Moved choosing vector paths out of RTE_INIT.
>> > >   - Moved checking max_simd_bitwidth into the set_alg function.
>> > > ---
>> > >  lib/librte_net/rte_net_crc.c | 26 +++++++++++++++++---------
>> > > lib/librte_net/rte_net_crc.h |  3 ++-
>> > >  2 files changed, 19 insertions(+), 10 deletions(-)
>> > >
>> > > diff --git a/lib/librte_net/rte_net_crc.c
>> > > b/lib/librte_net/rte_net_crc.c index
>> > > 9fd4794a9d..241eb16399 100644
>> > > --- a/lib/librte_net/rte_net_crc.c
>> > > +++ b/lib/librte_net/rte_net_crc.c
>> >
>> > <snip>
>> >
>> > > @@ -145,18 +149,26 @@ rte_crc32_eth_handler(const uint8_t *data,
>> > > uint32_t data_len)  void  rte_net_crc_set_alg(enum rte_net_crc_alg
>> > > alg)  {
>> > > +	if (max_simd_bitwidth == 0)
>> > > +		max_simd_bitwidth = rte_get_max_simd_bitwidth();
>> > > +
>> > >  	switch (alg) {
>> > >  #ifdef X86_64_SSE42_PCLMULQDQ
>> > >  	case RTE_NET_CRC_SSE42:
>> > > -		handlers = handlers_sse42;
>> > > -		break;
>> > > +		if (max_simd_bitwidth >= RTE_MAX_128_SIMD) {
>> > > +			handlers = handlers_sse42;
>> > > +			return;
>> > > +		}
>> > > +		RTE_LOG(INFO, NET, "Max SIMD Bitwidth too low, using
>> > > scalar\n");
>> >
>> > [DC] Not sure if you're aware but there is another patchset which
>> > adds an
>> > AVX512 CRC implementation and run-time checking of cpuflags to
>> > select the CRC path to use:
>> > https://patchwork.dpdk.org/project/dpdk/list/?series=12596
>> >
>> > There will be a task to merge these 2 patchsets if both are merged.
>> > It looks fairly straightforward to me to merge these, but it would
>> > be good if you take a look too

I have looked at that patchset, I agree, I think they will be straightforward to merge together.

Thanks,
Ciara


More information about the dev mailing list