[dpdk-dev] [PATCH v2] net/mlx5: add support for 32bit systems

Ferruh Yigit ferruh.yigit at intel.com
Thu Jul 5 19:49:52 CEST 2018


On 7/5/2018 6:07 PM, Mordechay Haimovsky wrote:
> Hello Ferruh,
>   Here are my findings:
> 
> 1.  The error you've seen is definitely a bug in mlx5dv.h from rdma-core
>       (I'm emphasizing rdma-core since I cannot just send a fix for this file),
>       as it didn’t take into account that an address may be a 32bit one when performing the 32bit shift:
>       __m128i val  = _mm_set_epi32((uint32_t)address, (uint32_t)(address >> 32), lkey, length);
> 2. The reason we didn’t see it in our setups is the values assigned to the GCC predefined macros
>     we are using (from RH and Ubuntu).
>     When I run the following commands in our setups:
> 	alias gccmacros='gcc -dM -E -x c /dev/null'
> 	gccmacros -m32 | grep -E "(MMX|SSE|AVX|XOP)"
>     I get the following results:
>         On RH setup using gcc version 4.8.5 20150623 (Red Hat 4.8.5-11) (GCC)
> 	#define __MMX__ 1
> 	#define __SSE2__ 1
> 	#define __SSE__ 1
>       On Ubuntu setup using gcc version 5.4.0 20160609 (Ubuntu 5.4.0-6ubuntu1~16.04.10)
> 	No flags are defined.
>    Since the "offending" routine is wrapped with #ifdef __SSE3__ the compiler just ignores it.
> 
> ARs:
>   1. Open a bug for fixing mlx5dv.h in rdma-core. - Moti H.
>   2. Provide a workaround for the problem. - Moti H.
>   3. Verify that this is actually the issue by running the above commands
>        in Ferruh's setup and verifying the SSE3 flag is set. - Ferruh Yigit

I confirm SSE3 is set in my environment, but I think this will be true for all
x86 builds, because the minimum SIMD level DPDK requires is SSE4.2. According to
the wiki, SSE3 was introduced in 2004.

We use -march=native in the DPDK build, so:
$ gcc -march=native -m32 -dM -E - </dev/null | grep SSE3
#define __SSSE3__ 1
#define __SSE3__ 1
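
For illustration only, here is a minimal sketch of how the address split could
avoid the out-of-range shift. It assumes `address` is a pointer-sized integer
(32 bits wide on i686), which is what the compiler error suggests. `split_addr`
is a hypothetical helper name, not something that exists in rdma-core, and this
is not the actual fix:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical helper (not part of rdma-core): split a pointer-sized
 * address into its upper and lower 32-bit halves.
 */
static inline void split_addr(uintptr_t address, uint32_t *hi, uint32_t *lo)
{
	assert(hi != NULL && lo != NULL);

	/*
	 * Widen to 64 bits before shifting. Shifting a 32-bit value by 32
	 * is undefined in C and is what -Wshift-count-overflow reports.
	 */
	uint64_t wide = (uint64_t)address;

	*hi = (uint32_t)(wide >> 32); /* always 0 when the address is 32 bits wide */
	*lo = (uint32_t)wide;
}
```

With a helper like this, the two halves could be passed to _mm_set_epi32()
in place of the direct `address >> 32` expression.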


> 
> Moti H. 
> 
>> -----Original Message-----
>> From: dev [mailto:dev-bounces at dpdk.org] On Behalf Of Mordechay
>> Haimovsky
>> Sent: Thursday, July 5, 2018 1:10 PM
>> To: Ferruh Yigit <ferruh.yigit at intel.com>; Shahaf Shuler
>> <shahafs at mellanox.com>
>> Cc: Adrien Mazarguil <adrien.mazarguil at 6wind.com>; dev at dpdk.org
>> Subject: Re: [dpdk-dev] [PATCH v2] net/mlx5: add support for 32bit systems
>>
>> Hi,
>>  Didn’t see it in our setups (not an excuse). Investigating...
>>
>> Moti
>>
>>> -----Original Message-----
>>> From: Ferruh Yigit [mailto:ferruh.yigit at intel.com]
>>> Sent: Wednesday, July 4, 2018 4:49 PM
>>> To: Mordechay Haimovsky <motih at mellanox.com>; Shahaf Shuler
>>> <shahafs at mellanox.com>
>>> Cc: Adrien Mazarguil <adrien.mazarguil at 6wind.com>; dev at dpdk.org
>>> Subject: Re: [dpdk-dev] [PATCH v2] net/mlx5: add support for 32bit
>>> systems
>>>
>>> On 7/2/2018 12:11 PM, Moti Haimovsky wrote:
>>>> This patch adds support for building and running mlx5 PMD on 32bit
>>>> systems such as i686.
>>>>
>>>> The main issue to tackle was handling the 32bit access to the UAR as
>>>> quoted from the mlx5 PRM:
>>>> QP and CQ DoorBells require 64-bit writes. For best performance, it
>>>> is recommended to execute the QP/CQ DoorBell as a single 64-bit
>>>> write operation. For platforms that do not support 64 bit writes, it
>>>> is possible to issue the 64 bits DoorBells through two consecutive
>>>> writes, each write 32 bits, as described below:
>>>> * The order of writing each of the Dwords is from lower to upper
>>>>   addresses.
>>>> * No other DoorBell can be rung (or even start ringing) in the midst of
>>>>   an on-going write of a DoorBell over a given UAR page.
>>>> The last rule implies that in a multi-threaded environment, the
>>>> access to a UAR page (which can be accessible by all threads in the
>>>> process) must be synchronized (for example, using a semaphore)
>>>> unless an atomic write of 64 bits in a single bus operation is
>>>> guaranteed. Such a synchronization is not required when ringing
>>>> DoorBells on different UAR pages.
>>>>
>>>> Signed-off-by: Moti Haimovsky <motih at mellanox.com>
>>>> ---
>>>> v2:
>>>> * Fixed coding style issues.
>>>> * Modified documentation according to review inputs.
>>>> * Fixed merge conflicts.
>>>> ---
>>>>  doc/guides/nics/features/mlx5.ini |  1 +
>>>>  doc/guides/nics/mlx5.rst          |  6 +++-
>>>>  drivers/net/mlx5/mlx5.c           |  8 ++++-
>>>>  drivers/net/mlx5/mlx5.h           |  5 +++
>>>>  drivers/net/mlx5/mlx5_defs.h      | 18 ++++++++--
>>>>  drivers/net/mlx5/mlx5_rxq.c       |  6 +++-
>>>>  drivers/net/mlx5/mlx5_rxtx.c      | 22 +++++++------
>>>>  drivers/net/mlx5/mlx5_rxtx.h      | 69 ++++++++++++++++++++++++++++++++++++++-
>>>>  drivers/net/mlx5/mlx5_txq.c       | 13 +++++++-
>>>>  9 files changed, 131 insertions(+), 17 deletions(-)
>>>>
>>>> diff --git a/doc/guides/nics/features/mlx5.ini b/doc/guides/nics/features/mlx5.ini
>>>> index e75b14b..b28b43e 100644
>>>> --- a/doc/guides/nics/features/mlx5.ini
>>>> +++ b/doc/guides/nics/features/mlx5.ini
>>>> @@ -43,5 +43,6 @@ Multiprocess aware   = Y
>>>>  Other kdrv           = Y
>>>>  ARMv8                = Y
>>>>  Power8               = Y
>>>> +x86-32               = Y
>>>>  x86-64               = Y
>>>>  Usage doc            = Y
>>>> diff --git a/doc/guides/nics/mlx5.rst b/doc/guides/nics/mlx5.rst
>>>> index 7dd9c1c..5fbad60 100644
>>>> --- a/doc/guides/nics/mlx5.rst
>>>> +++ b/doc/guides/nics/mlx5.rst
>>>> @@ -49,7 +49,7 @@ libibverbs.
>>>>  Features
>>>>  --------
>>>>
>>>> -- Multi arch support: x86_64, POWER8, ARMv8.
>>>> +- Multi arch support: x86_64, POWER8, ARMv8, i686.
>>>>  - Multiple TX and RX queues.
>>>>  - Support for scattered TX and RX frames.
>>>>  - IPv4, IPv6, TCPv4, TCPv6, UDPv4 and UDPv6 RSS on any number of queues.
>>>> @@ -477,6 +477,10 @@ RMDA Core with Linux Kernel
>>>>  - Minimal kernel version : v4.14 or the most recent 4.14-rc (see `Linux installation documentation`_)
>>>>  - Minimal rdma-core version: v15+ commit 0c5f5765213a ("Merge pull request #227 from yishaih/tm")
>>>>    (see `RDMA Core installation documentation`_)
>>>> +- When building for i686 use:
>>>> +
>>>> +  - rdma-core version 18.0 or above built with 32bit support.
>>>
>>> Regarding the "or above" part: v19 gives build errors with mlx5, FYI.
>>>
>>> And with v18 I am getting build errors originating from the rdma headers [1]; am
>>> I doing something wrong?
>>>
>>> [1]
>>> In file included from .../dpdk/drivers/net/mlx5/mlx5_glue.c:20:
>>> .../rdma-core/build32/include/infiniband/mlx5dv.h: In function ‘mlx5dv_x86_set_data_seg’:
>>> .../rdma-core/build32/include/infiniband/mlx5dv.h:787:69: error: right shift count >= width of type [-Werror=shift-count-overflow]
>>>   __m128i val  = _mm_set_epi32((uint32_t)address, (uint32_t)(address >> 32), lkey, length);
>>>
>>> ^~
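
The two PRM rules quoted in the patch description (write the Dwords from lower
to upper address, and do not let another DoorBell start on the same UAR page
in between) could be sketched roughly as below. This is only an illustration
under assumptions, not the code from the patch: `uar_page_sketch` and
`ring_doorbell_2x32` are hypothetical names, the spinlock is a plain C11
`atomic_flag`, and the C11 fence is a stand-in for whatever I/O write barrier
a real driver would use.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical per-UAR-page state: one lock shared by all users of the page. */
struct uar_page_sketch {
	atomic_flag lock;            /* serializes DoorBells on this UAR page */
	volatile uint32_t *doorbell; /* 8-byte DoorBell register seen as two Dwords */
};

/* Issue one 64-bit DoorBell as two consecutive 32-bit writes. */
static void ring_doorbell_2x32(struct uar_page_sketch *uar, uint64_t value)
{
	uint32_t dwords[2];

	assert(uar != NULL && uar->doorbell != NULL);

	/* Keep the in-memory byte layout of the 64-bit value as is. */
	memcpy(dwords, &value, sizeof(dwords));

	/* No other DoorBell may start on this page until both writes are done. */
	while (atomic_flag_test_and_set_explicit(&uar->lock, memory_order_acquire))
		; /* spin */

	uar->doorbell[0] = dwords[0];              /* lower address first */
	atomic_thread_fence(memory_order_seq_cst); /* stand-in for an I/O write barrier */
	uar->doorbell[1] = dwords[1];              /* then the upper address */

	atomic_flag_clear_explicit(&uar->lock, memory_order_release);
}
```

DoorBells on different UAR pages would use different locks, so they would not
contend with each other, which matches the last sentence of the PRM quote.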


