Hello Ferruh,
  Here are my findings:

1.  The error you've seen is definitely a bug in mlx5dv.h from rdma-core
      (I'm emphasizing rdma-core since I cannot just send a fix for this file).
      The code doesn't take into account that an address may be only 32 bits
      wide when it performs the 32-bit shift:
      __m128i val  = _mm_set_epi32((uint32_t)address, (uint32_t)(address >> 32), lkey, length);
2. The reason we didn't see it in our setups is the set of GCC predefined
    macros in the compilers we are using (from RH and Ubuntu).
    When I run the following commands in our setups:
        alias gccmacros='gcc -dM -E -x c /dev/null'
        gccmacros -m32 | grep -E "(MMX|SSE|AVX|XOP)"
    I get the following results:
        On RH setup using gcc version 4.8.5 20150623 (Red Hat 4.8.5-11) (GCC)
        #define __MMX__ 1
        #define __SSE2__ 1
        #define __SSE__ 1
        On Ubuntu setup using gcc version 5.4.0 20160609 (Ubuntu 5.4.0-6ubuntu1~16.04.10)
        No flags are defined.
   Since the "offending" routine is wrapped with #ifdef __SSE3__ and neither
   setup defines __SSE3__, the routine is never compiled there.
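
To illustrate item 1, here is a minimal sketch of one way to split an address without tripping -Werror=shift-count-overflow on 32-bit targets. This is not the rdma-core fix; split_address() is a hypothetical helper, and the only idea it shows is widening the value to 64 bits before shifting.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical helper, not part of rdma-core: split a pointer-sized
 * address into its upper and lower 32-bit halves.
 */
static inline void
split_address(uintptr_t address, uint32_t *hi, uint32_t *lo)
{
	uint64_t wide;

	assert(hi != NULL && lo != NULL);
	/*
	 * Widen first, so the shift count (32) is always smaller than the
	 * width of the shifted type, also when uintptr_t is 32 bits wide.
	 */
	wide = (uint64_t)address;
	*hi = (uint32_t)(wide >> 32);
	*lo = (uint32_t)wide;
}
```

On a 32-bit build the upper half simply comes out as 0, which is what the original expression presumably intended.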

ARs:
  1. Open a bug for fixing mlx5dv.h in rdma-core. - Moti H.
  2. Provide a workaround for the problem. - Moti H.
  3. Verify that this is actually the issue by running the above commands
       on Ferruh's setup and checking that the __SSE3__ macro is defined. - Ferruh Yigit
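
For AR 3, the same check can also be done from C instead of grepping the macro dump. The sketch below is only an illustration; report_simd_macros() is a hypothetical helper, and the result is meaningful only if it is compiled with the same compiler and flags as the DPDK build.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/*
 * Hypothetical helper: print which of the SSE macros the compiler
 * predefines for the current flags. Returns 1 if __SSE3__ is defined,
 * 0 otherwise.
 */
static inline int
report_simd_macros(FILE *out)
{
	int sse3 = 0;

	assert(out != NULL);
#ifdef __SSE2__
	fprintf(out, "__SSE2__ is defined\n");
#endif
#ifdef __SSE3__
	fprintf(out, "__SSE3__ is defined\n");
	sse3 = 1;
#endif
	if (!sse3)
		fprintf(out, "__SSE3__ is not defined\n");
	return sse3;
}
```

If this returns 1 in a -m32 build, the routine guarded by #ifdef __SSE3__ in mlx5dv.h is being compiled, which would match the error reported.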

Moti H. 

> -----Original Message-----
> From: dev [mailto:dev-boun...@dpdk.org] On Behalf Of Mordechay
> Haimovsky
> Sent: Thursday, July 5, 2018 1:10 PM
> To: Ferruh Yigit <ferruh.yi...@intel.com>; Shahaf Shuler
> <shah...@mellanox.com>
> Cc: Adrien Mazarguil <adrien.mazarg...@6wind.com>; dev@dpdk.org
> Subject: Re: [dpdk-dev] [PATCH v2] net/mlx5: add support for 32bit systems
> 
> Hi,
>  Didn’t see it in our setups (not an excuse),  Investigating ....
> 
> Moti
> 
> > -----Original Message-----
> > From: Ferruh Yigit [mailto:ferruh.yi...@intel.com]
> > Sent: Wednesday, July 4, 2018 4:49 PM
> > To: Mordechay Haimovsky <mo...@mellanox.com>; Shahaf Shuler
> > <shah...@mellanox.com>
> > Cc: Adrien Mazarguil <adrien.mazarg...@6wind.com>; dev@dpdk.org
> > Subject: Re: [dpdk-dev] [PATCH v2] net/mlx5: add support for 32bit
> > systems
> >
> > On 7/2/2018 12:11 PM, Moti Haimovsky wrote:
> > > This patch adds support for building and running mlx5 PMD on 32bit
> > > systems such as i686.
> > >
> > > The main issue to tackle was handling the 32bit access to the UAR as
> > > quoted from the mlx5 PRM:
> > > QP and CQ DoorBells require 64-bit writes. For best performance, it
> > > is recommended to execute the QP/CQ DoorBell as a single 64-bit
> > > write operation. For platforms that do not support 64 bit writes, it
> > > is possible to issue the 64 bits DoorBells through two consecutive
> > > writes, each write 32 bits, as described below:
> > > * The order of writing each of the Dwords is from lower to upper
> > >   addresses.
> > > * No other DoorBell can be rung (or even start ringing) in the midst of
> > >   an on-going write of a DoorBell over a given UAR page.
> > > The last rule implies that in a multi-threaded environment, the
> > > access to a UAR page (which can be accessible by all threads in the
> > > process) must be synchronized (for example, using a semaphore)
> > > unless an atomic write of 64 bits in a single bus operation is
> > > guaranteed. Such a synchronization is not required for when ringing
> > > DoorBells on different UAR pages.
> > >
> > > Signed-off-by: Moti Haimovsky <mo...@mellanox.com>
> > > ---
> > > v2:
> > > * Fixed coding style issues.
> > > * Modified documentation according to review inputs.
> > > * Fixed merge conflicts.
> > > ---
> > >  doc/guides/nics/features/mlx5.ini |  1 +
> > >  doc/guides/nics/mlx5.rst          |  6 +++-
> > >  drivers/net/mlx5/mlx5.c           |  8 ++++-
> > >  drivers/net/mlx5/mlx5.h           |  5 +++
> > >  drivers/net/mlx5/mlx5_defs.h      | 18 ++++++++--
> > >  drivers/net/mlx5/mlx5_rxq.c       |  6 +++-
> > >  drivers/net/mlx5/mlx5_rxtx.c      | 22 +++++++------
> > >  drivers/net/mlx5/mlx5_rxtx.h      | 69
> > ++++++++++++++++++++++++++++++++++++++-
> > >  drivers/net/mlx5/mlx5_txq.c       | 13 +++++++-
> > >  9 files changed, 131 insertions(+), 17 deletions(-)
> > >
> > > diff --git a/doc/guides/nics/features/mlx5.ini
> > > b/doc/guides/nics/features/mlx5.ini
> > > index e75b14b..b28b43e 100644
> > > --- a/doc/guides/nics/features/mlx5.ini
> > > +++ b/doc/guides/nics/features/mlx5.ini
> > > @@ -43,5 +43,6 @@ Multiprocess aware   = Y
> > >  Other kdrv           = Y
> > >  ARMv8                = Y
> > >  Power8               = Y
> > > +x86-32               = Y
> > >  x86-64               = Y
> > >  Usage doc            = Y
> > > diff --git a/doc/guides/nics/mlx5.rst b/doc/guides/nics/mlx5.rst
> > > index
> > > 7dd9c1c..5fbad60 100644
> > > --- a/doc/guides/nics/mlx5.rst
> > > +++ b/doc/guides/nics/mlx5.rst
> > > @@ -49,7 +49,7 @@ libibverbs.
> > >  Features
> > >  --------
> > >
> > > -- Multi arch support: x86_64, POWER8, ARMv8.
> > > +- Multi arch support: x86_64, POWER8, ARMv8, i686.
> > >  - Multiple TX and RX queues.
> > >  - Support for scattered TX and RX frames.
> > >  - IPv4, IPv6, TCPv4, TCPv6, UDPv4 and UDPv6 RSS on any number of
> > queues.
> > > @@ -477,6 +477,10 @@ RMDA Core with Linux Kernel
> > >  - Minimal kernel version : v4.14 or the most recent 4.14-rc (see
> > > `Linux installation documentation`_)
> > >  - Minimal rdma-core version: v15+ commit 0c5f5765213a ("Merge pull
> > request #227 from yishaih/tm")
> > >    (see `RDMA Core installation documentation`_)
> > > +- When building for i686 use:
> > > +
> > > +  - rdma-core version 18.0 or above built with 32bit support.
> >
> > related "or above" part, v19 giving build errors with mlx5, FYI.
> >
> > And with v18 getting build errors originated from rdma headers [1], am
> > I doing something wrong?
> >
> > [1]
> > In file included from .../dpdk/drivers/net/mlx5/mlx5_glue.c:20:
> > .../rdma-core/build32/include/infiniband/mlx5dv.h: In function
> > ‘mlx5dv_x86_set_data_seg’:
> > .../rdma-core/build32/include/infiniband/mlx5dv.h:787:69: error: right
> > shift count >= width of type [-Werror=shift-count-overflow]
> >   __m128i val  = _mm_set_epi32((uint32_t)address, (uint32_t)(address
> > >> 32), lkey, length);
> >
> > ^~
