Inline
> -----Original Message-----
> From: Shahaf Shuler
> Sent: Monday, July 2, 2018 10:05 AM
> To: Mordechay Haimovsky <mo...@mellanox.com>; Yongseok Koh
> <ys...@mellanox.com>; Adrien Mazarguil <adrien.mazarg...@6wind.com>
> Cc: dev@dpdk.org; Mordechay Haimovsky <mo...@mellanox.com>
> Subject: RE: [dpdk-dev] [PATCH] net/mlx5: add support for 32bit systems
>
> Hi Moty,
>
> A few nits.
>
> Also please fix the checkpatch warning:
>
> ### net/mlx5: add support for 32bit systems
>
> CHECK:OPEN_ENDED_LINE: Lines should not end with a '('
> #235: FILE: drivers/net/mlx5/mlx5_rxtx.c:1591:
> +			addr_64 = rte_cpu_to_be_64(
>
> total: 0 errors, 0 warnings, 1 checks, 311 lines checked

Will do. Is there a way to know which kernel's checkpatch the DPDK tools
run against? Checkpatch does not show this error when testing with the
4.14.0-0.rc4.git4.1.el7.x86_64 kernel, for example.

> Thursday, June 28, 2018 10:13 AM, Moti Haimovsky:
> > Subject: [dpdk-dev] [PATCH] net/mlx5: add support for 32bit systems
> >
> > This patch adds support for building and running the mlx5 PMD on 32bit
> > systems such as i686.
> >
> > The main issue to tackle was handling the 32bit access to the UAR, as
> > quoted from the mlx5 PRM:
> >
> > QP and CQ DoorBells require 64-bit writes. For best performance, it is
> > recommended to execute the QP/CQ DoorBell as a single 64-bit write
> > operation. For platforms that do not support 64-bit writes, it is
> > possible to issue the 64-bit DoorBell through two consecutive writes,
> > each write 32 bits, as described below:
> > * The order of writing each of the Dwords is from lower to upper
> >   addresses.
> > * No other DoorBell can be rung (or even start ringing) in the midst of
> >   an on-going write of a DoorBell over a given UAR page.
> >
> > The last rule implies that in a multi-threaded environment, the access
> > to a UAR page (which can be accessible by all threads in the process)
> > must be synchronized (for example, using a semaphore) unless an atomic
> > write of 64 bits in a single bus operation is guaranteed. Such
> > synchronization is not required when ringing DoorBells on different
> > UAR pages.
> >
> > Signed-off-by: Moti Haimovsky <mo...@mellanox.com>
> > ---
> >  doc/guides/nics/features/mlx5.ini |  1 +
> >  doc/guides/nics/mlx5.rst          | 11 +++++++
> >  drivers/net/mlx5/mlx5.c           |  8 ++++-
> >  drivers/net/mlx5/mlx5.h           |  5 +++
> >  drivers/net/mlx5/mlx5_defs.h      | 18 ++++++++--
> >  drivers/net/mlx5/mlx5_rxq.c       |  6 +++-
> >  drivers/net/mlx5/mlx5_rxtx.c      | 22 +++++++------
> >  drivers/net/mlx5/mlx5_rxtx.h      | 69 ++++++++++++++++++++++++++++++++++++++-
> >  drivers/net/mlx5/mlx5_txq.c       | 13 +++++++-
> >  9 files changed, 137 insertions(+), 16 deletions(-)
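For anyone reading along without the PRM at hand, the two-Dword fallback
described above boils down to the pattern below. This is only a simplified
sketch built from DPDK's generic I/O helpers; the real helper added by this
patch is __mlx5_uar_write64_relaxed() in mlx5_rxtx.h further down, and
'lock' stands for the per-UAR-page spinlock:

    #include <stdint.h>
    #include <rte_io.h>
    #include <rte_spinlock.h>

    /* 'val' is already big-endian here, as in the driver code. */
    static inline void
    uar_write64_sketch(uint64_t val, volatile void *addr,
                       rte_spinlock_t *lock)
    {
        rte_spinlock_lock(lock);
        /* Lower Dword to the lower address first, per the PRM. */
        rte_write32_relaxed((uint32_t)val, addr);
        rte_io_wmb(); /* keep the two Dwords ordered on the bus */
        rte_write32_relaxed((uint32_t)(val >> 32),
                            (volatile void *)((volatile char *)addr + 4));
        rte_spinlock_unlock(lock);
    }

The lock is what enforces the "no other DoorBell in the midst of an
on-going write" rule when an atomic 64-bit bus write cannot be guaranteed.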
> > diff --git a/doc/guides/nics/features/mlx5.ini
> > b/doc/guides/nics/features/mlx5.ini
> > index e75b14b..b28b43e 100644
> > --- a/doc/guides/nics/features/mlx5.ini
> > +++ b/doc/guides/nics/features/mlx5.ini
> > @@ -43,5 +43,6 @@ Multiprocess aware   = Y
> >  Other kdrv           = Y
> >  ARMv8                = Y
> >  Power8               = Y
> > +x86-32               = Y
> >  x86-64               = Y
> >  Usage doc            = Y
> > diff --git a/doc/guides/nics/mlx5.rst b/doc/guides/nics/mlx5.rst
> > index 7dd9c1c..cb9d5d8 100644
> > --- a/doc/guides/nics/mlx5.rst
> > +++ b/doc/guides/nics/mlx5.rst
> > @@ -50,6 +50,8 @@ Features
> >  --------
> >
> >  - Multi arch support: x86_64, POWER8, ARMv8.
> > +- Support for i686 is available only when working with
> > +  rdma-core version 18.0 or above, built with 32bit support.
>
> I think we can just add i686 to the supported arch. The limitation on
> the rdma-core version is well documented below.

Will change this.

> >  - Multiple TX and RX queues.
> >  - Support for scattered TX and RX frames.
> >  - IPv4, IPv6, TCPv4, TCPv6, UDPv4 and UDPv6 RSS on any number of queues.
> > @@ -136,6 +138,11 @@ Limitations
> >    enabled (``rxq_cqe_comp_en``) at the same time, RSS hash result is
> >    not fully supported. Some Rx packets may not have PKT_RX_RSS_HASH.
> >
> > +- Building for i686 is only supported with:
> > +
> > +  - rdma-core version 18.0 or above built with 32bit support.
> > +  - Kernel version 4.14.41 or above.
>
> Why is the kernel related? The rdma-core part I understand.

There was a patch added to the kernel that fixed broken 32bit support:
f2e9bfac13c9 ("RDMA/rxe: Fix uABI structure layouts for 32/64 compat")
(the SHA may have changed). The fix went into kernel 4.17 and was
backported to 4.14.41.

> > +
> >  Statistics
> >  ----------
> >
> > @@ -477,6 +484,10 @@ RMDA Core with Linux Kernel
> >  - Minimal kernel version : v4.14 or the most recent 4.14-rc (see
> >    `Linux installation documentation`_)
> >  - Minimal rdma-core version: v15+ commit 0c5f5765213a ("Merge pull
> >    request #227 from yishaih/tm")
> >    (see `RDMA Core installation documentation`_)
> > +- When building for i686 use:
> > +
> > +  - rdma-core version 18.0 or above built with 32bit support.
> > +  - Kernel version 4.14.41 or above.
> >
> >  .. _`Linux installation documentation`:
> >     https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable.git/plain/Documentation/admin-guide/README.rst
> >  .. _`RDMA Core installation documentation`:
> >     https://raw.githubusercontent.com/linux-rdma/rdma-core/master/README.md
> > diff --git a/drivers/net/mlx5/mlx5.c b/drivers/net/mlx5/mlx5.c
> > index f0e6ed7..5d0f706 100644
> > --- a/drivers/net/mlx5/mlx5.c
> > +++ b/drivers/net/mlx5/mlx5.c
> > @@ -567,7 +567,7 @@
> >  	rte_memseg_walk(find_lower_va_bound, &addr);
> >
> >  	/* keep distance to hugepages to minimize potential conflicts. */
> > -	addr = RTE_PTR_SUB(addr, MLX5_UAR_OFFSET + MLX5_UAR_SIZE);
> > +	addr = RTE_PTR_SUB(addr, (uintptr_t)(MLX5_UAR_OFFSET + MLX5_UAR_SIZE));
> >  	/* anonymous mmap, no real memory consumption. */
> >  	addr = mmap(addr, MLX5_UAR_SIZE,
> >  		    PROT_NONE, MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
> > @@ -953,6 +953,12 @@
> >  	priv->port = port;
> >  	priv->pd = pd;
> >  	priv->mtu = ETHER_MTU;
> > +#ifndef RTE_ARCH_64
> > +	/* Initialize UAR access locks for 32bit implementations. */
> > +	rte_spinlock_init(&priv->uar_lock_cq);
> > +	for (i = 0; i < MLX5_UAR_PAGE_NUM_MAX; i++)
> > +		rte_spinlock_init(&priv->uar_lock[i]);
> > +#endif
> >  	err = mlx5_args(&config, pci_dev->device.devargs);
> >  	if (err) {
> >  		err = rte_errno;
> > diff --git a/drivers/net/mlx5/mlx5.h b/drivers/net/mlx5/mlx5.h
> > index 997b04a..2da32cd 100644
> > --- a/drivers/net/mlx5/mlx5.h
> > +++ b/drivers/net/mlx5/mlx5.h
> > @@ -198,6 +198,11 @@ struct priv {
> >  	/* Context for Verbs allocator. */
> >  	int nl_socket; /* Netlink socket. */
> >  	uint32_t nl_sn; /* Netlink message sequence number. */
> > +#ifndef RTE_ARCH_64
> > +	rte_spinlock_t uar_lock_cq; /* CQs share a common distinct UAR. */
> > +	rte_spinlock_t uar_lock[MLX5_UAR_PAGE_NUM_MAX];
> > +	/* UAR same-page access control required in 32bit implementations. */
> > +#endif
> >  };
> >
> >  #define PORT_ID(priv) ((priv)->dev_data->port_id)
> > diff --git a/drivers/net/mlx5/mlx5_defs.h b/drivers/net/mlx5/mlx5_defs.h
> > index 5bbbec2..f6ec415 100644
> > --- a/drivers/net/mlx5/mlx5_defs.h
> > +++ b/drivers/net/mlx5/mlx5_defs.h
> > @@ -87,14 +87,28 @@
> >  #define MLX5_LINK_STATUS_TIMEOUT 10
> >
> >  /* Reserved address space for UAR mapping. */
> > -#define MLX5_UAR_SIZE (1ULL << 32)
> > +#define MLX5_UAR_SIZE (1ULL << (sizeof(uintptr_t) * 4))
> >
> >  /* Offset of reserved UAR address space to hugepage memory. Offset is used
> >   * here to minimize possibility of address next to hugepage being used by
> >   * other code in either primary or secondary process, failing to map TX UAR
> >   * would make TX packets invisible to HW.
> >   */
> > -#define MLX5_UAR_OFFSET (1ULL << 32)
> > +#define MLX5_UAR_OFFSET (1ULL << (sizeof(uintptr_t) * 4))
> > +
> > +/* Maximum number of UAR pages used by a port.
> > + * These are the size and mask for an array of mutexes used to
> > + * synchronize the access to the port's UARs on platforms that do not
> > + * support 64 bit writes. On such systems it is possible to issue the
> > + * 64-bit DoorBells through two consecutive writes, each write 32 bits.
> > + * The access to a UAR page (which can be accessible by all threads in
> > + * the process) must be synchronized (for example, using a semaphore).
> > + * Such synchronization is not required when ringing DoorBells on
> > + * different UAR pages.
> > + * A port with 512 Tx queues uses 8 UAR pages of 4kBytes each, which
> > + * are shared among the ports.
> > + */
> > +#define MLX5_UAR_PAGE_NUM_MAX 64
> > +#define MLX5_UAR_PAGE_NUM_MASK ((MLX5_UAR_PAGE_NUM_MAX) - 1)
> >
> >  /* Log 2 of the default number of strides per WQE for Multi-Packet RQ. */
> >  #define MLX5_MPRQ_STRIDE_NUM_N 6U
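To make the new constants concrete: sizeof(uintptr_t) * 4 evaluates to 32
on a 64-bit build and to 16 on i686, so the reserved UAR address space
shrinks from 4 GiB to 64 KiB, which the 32-bit VA space can actually spare.
A standalone sketch, not driver code:

    #include <stdint.h>
    #include <stdio.h>

    /* Mirrors the new mlx5_defs.h definition. */
    #define MLX5_UAR_SIZE (1ULL << (sizeof(uintptr_t) * 4))

    int main(void)
    {
        /* Prints 4294967296 when built -m64 and 65536 when built -m32. */
        printf("UAR reservation: %llu bytes\n",
               (unsigned long long)MLX5_UAR_SIZE);
        return 0;
    }

The 64-entry lock array is sized generously: per the comment above, even a
port with 512 Tx queues only touches 8 UAR pages, so sharing a lock slot is
rare and, at worst, costs an extra serialization.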
> > diff --git a/drivers/net/mlx5/mlx5_rxq.c b/drivers/net/mlx5/mlx5_rxq.c
> > index 08dd559..820048f 100644
> > --- a/drivers/net/mlx5/mlx5_rxq.c
> > +++ b/drivers/net/mlx5/mlx5_rxq.c
> > @@ -643,7 +643,8 @@
> >  	doorbell = (uint64_t)doorbell_hi << 32;
> >  	doorbell |= rxq->cqn;
> >  	rxq->cq_db[MLX5_CQ_ARM_DB] = rte_cpu_to_be_32(doorbell_hi);
> > -	rte_write64(rte_cpu_to_be_64(doorbell), cq_db_reg);
> > +	mlx5_uar_write64(rte_cpu_to_be_64(doorbell),
> > +			 cq_db_reg, rxq->uar_lock_cq);
> >  }
> >
> >  /**
> > @@ -1445,6 +1446,9 @@ struct mlx5_rxq_ctrl *
> >  	tmpl->rxq.elts_n = log2above(desc);
> >  	tmpl->rxq.elts =
> >  		(struct rte_mbuf *(*)[1 << tmpl->rxq.elts_n])(tmpl + 1);
> > +#ifndef RTE_ARCH_64
> > +	tmpl->rxq.uar_lock_cq = &priv->uar_lock_cq;
> > +#endif
> >  	tmpl->idx = idx;
> >  	rte_atomic32_inc(&tmpl->refcnt);
> >  	LIST_INSERT_HEAD(&priv->rxqsctrl, tmpl, next);
> > diff --git a/drivers/net/mlx5/mlx5_rxtx.c b/drivers/net/mlx5/mlx5_rxtx.c
> > index a7ed8d8..ec35ea0 100644
> > --- a/drivers/net/mlx5/mlx5_rxtx.c
> > +++ b/drivers/net/mlx5/mlx5_rxtx.c
> > @@ -495,6 +495,7 @@
> >  	volatile struct mlx5_wqe_ctrl *last_wqe = NULL;
> >  	unsigned int segs_n = 0;
> >  	const unsigned int max_inline = txq->max_inline;
> > +	uint64_t addr_64;
> >
> >  	if (unlikely(!pkts_n))
> >  		return 0;
> > @@ -711,12 +712,12 @@
> >  			ds = 3;
> > use_dseg:
> >  			/* Add the remaining packet as a simple ds. */
> > -			addr = rte_cpu_to_be_64(addr);
> > +			addr_64 = rte_cpu_to_be_64(addr);
> >  			*dseg = (rte_v128u32_t){
> >  				rte_cpu_to_be_32(length),
> >  				mlx5_tx_mb2mr(txq, buf),
> > -				addr,
> > -				addr >> 32,
> > +				addr_64,
> > +				addr_64 >> 32,
> >  			};
> >  			++ds;
> >  			if (!segs_n)
> > @@ -750,12 +751,12 @@
> >  		total_length += length;
> > #endif
> >  		/* Store segment information. */
> > -		addr = rte_cpu_to_be_64(rte_pktmbuf_mtod(buf, uintptr_t));
> > +		addr_64 = rte_cpu_to_be_64(rte_pktmbuf_mtod(buf, uintptr_t));
> >  		*dseg = (rte_v128u32_t){
> >  			rte_cpu_to_be_32(length),
> >  			mlx5_tx_mb2mr(txq, buf),
> > -			addr,
> > -			addr >> 32,
> > +			addr_64,
> > +			addr_64 >> 32,
> >  		};
> >  		(*txq->elts)[++elts_head & elts_m] = buf;
> >  		if (--segs_n)
> > @@ -1450,6 +1451,7 @@
> >  	unsigned int mpw_room = 0;
> >  	unsigned int inl_pad = 0;
> >  	uint32_t inl_hdr;
> > +	uint64_t addr_64;
> >  	struct mlx5_mpw mpw = {
> >  		.state = MLX5_MPW_STATE_CLOSED,
> >  	};
> > @@ -1586,13 +1588,13 @@
> >  				((uintptr_t)mpw.data.raw +
> >  				 inl_pad);
> >  			(*txq->elts)[elts_head++ & elts_m] = buf;
> > -			addr = rte_cpu_to_be_64(rte_pktmbuf_mtod(buf,
> > -								 uintptr_t));
> > +			addr_64 = rte_cpu_to_be_64(
> > +					rte_pktmbuf_mtod(buf, uintptr_t));
> >  			*dseg = (rte_v128u32_t) {
> >  				rte_cpu_to_be_32(length),
> >  				mlx5_tx_mb2mr(txq, buf),
> > -				addr,
> > -				addr >> 32,
> > +				addr_64,
> > +				addr_64 >> 32,
> >  			};
> >  			mpw.data.raw = (volatile void *)(dseg + 1);
> >  			mpw.total_len += (inl_pad + sizeof(*dseg));
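The dedicated 64-bit local is needed because 'addr' is a uintptr_t, which
is only 32 bits wide on i686: assigning the byte-swapped 64-bit value back
into it, as the old code did, would silently truncate. A minimal
illustration, assuming a little-endian 32-bit build:

    #include <stdint.h>
    #include <rte_byteorder.h>

    static void
    wqe_addr_sketch(void)
    {
        uintptr_t addr = 0x12345678;               /* 32-bit buffer VA */
        uint64_t addr_64 = rte_cpu_to_be_64(addr); /* 0x7856341200000000 */
        /* 'addr = addr_64;' would keep only the low 32 bits (all zero
         * here), corrupting the address written into the WQE data
         * segment. Hence the separate uint64_t in the Tx burst paths.
         */
        (void)addr_64;
    }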
> > diff --git a/drivers/net/mlx5/mlx5_rxtx.h b/drivers/net/mlx5/mlx5_rxtx.h
> > index 0007be0..2448d73 100644
> > --- a/drivers/net/mlx5/mlx5_rxtx.h
> > +++ b/drivers/net/mlx5/mlx5_rxtx.h
> > @@ -26,6 +26,8 @@
> >  #include <rte_common.h>
> >  #include <rte_hexdump.h>
> >  #include <rte_atomic.h>
> > +#include <rte_spinlock.h>
> > +#include <rte_io.h>
> >
> >  #include "mlx5_utils.h"
> >  #include "mlx5.h"
> > @@ -115,6 +117,10 @@ struct mlx5_rxq_data {
> >  	void *cq_uar; /* CQ user access region. */
> >  	uint32_t cqn; /* CQ number. */
> >  	uint8_t cq_arm_sn; /* CQ arm seq number. */
> > +#ifndef RTE_ARCH_64
> > +	rte_spinlock_t *uar_lock_cq;
> > +	/* CQ (UAR) access lock required for 32bit implementations. */
> > +#endif
> >  	uint32_t tunnel; /* Tunnel information. */
> >  } __rte_cache_aligned;
> >
> > @@ -196,6 +202,10 @@ struct mlx5_txq_data {
> >  	volatile void *bf_reg; /* Blueflame register remapped. */
> >  	struct rte_mbuf *(*elts)[]; /* TX elements. */
> >  	struct mlx5_txq_stats stats; /* TX queue counters. */
> > +#ifndef RTE_ARCH_64
> > +	rte_spinlock_t *uar_lock;
> > +	/* UAR access lock required for 32bit implementations. */
> > +#endif
> >  } __rte_cache_aligned;
> >
> >  /* Verbs Rx queue elements. */
> > @@ -348,6 +358,63 @@ uint16_t mlx5_rx_burst_vec(void *dpdk_txq, struct rte_mbuf **pkts,
> >  uint32_t mlx5_rx_addr2mr_bh(struct mlx5_rxq_data *rxq, uintptr_t addr);
> >  uint32_t mlx5_tx_addr2mr_bh(struct mlx5_txq_data *txq, uintptr_t addr);
> >
> > +/**
> > + * Provide safe 64bit store operation to mlx5 UAR region for both 32bit
> > + * and 64bit architectures.
> > + *
> > + * @param val
> > + *   Value to write in CPU endian format.
> > + * @param addr
> > + *   Address to write to.
> > + * @param lock
> > + *   Address of the lock to use for that UAR access.
> > + */
> > +static __rte_always_inline void
> > +__mlx5_uar_write64_relaxed(uint64_t val, volatile void *addr,
> > +			   rte_spinlock_t *lock __rte_unused)
> > +{
> > +#ifdef RTE_ARCH_64
> > +	rte_write64_relaxed(val, addr);
> > +#else /* !RTE_ARCH_64 */
> > +	rte_spinlock_lock(lock);
> > +	rte_write32_relaxed(val, addr);
> > +	rte_io_wmb();
> > +	rte_write32_relaxed(val >> 32,
> > +			    (volatile void *)((volatile char *)addr + 4));
> > +	rte_spinlock_unlock(lock);
> > +#endif
> > +}
> > +
> > +/**
> > + * Provide safe 64bit store operation to mlx5 UAR region for both 32bit
> > + * and 64bit architectures while guaranteeing the order of execution with
> > + * the code being executed.
> > + *
> > + * @param val
> > + *   Value to write in CPU endian format.
> > + * @param addr
> > + *   Address to write to.
> > + * @param lock
> > + *   Address of the lock to use for that UAR access.
> > + */
> > +static __rte_always_inline void
> > +__mlx5_uar_write64(uint64_t val, volatile void *addr, rte_spinlock_t *lock)
> > +{
> > +	rte_io_wmb();
> > +	__mlx5_uar_write64_relaxed(val, addr, lock);
> > +}
> > +
> > +/* Assist macros, used instead of directly calling the functions they
> > + * wrap.
> > + */
> > +#ifdef RTE_ARCH_64
> > +#define mlx5_uar_write64_relaxed(val, dst, lock) \
> > +		__mlx5_uar_write64_relaxed(val, dst, NULL)
> > +#define mlx5_uar_write64(val, dst, lock) __mlx5_uar_write64(val, dst, NULL)
> > +#else
> > +#define mlx5_uar_write64_relaxed(val, dst, lock) \
> > +		__mlx5_uar_write64_relaxed(val, dst, lock)
> > +#define mlx5_uar_write64(val, dst, lock) __mlx5_uar_write64(val, dst, lock)
> > +#endif
> > +
> >  #ifndef NDEBUG
> >  /**
> >   * Verify or set magic value in CQE.
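Usage-wise, callers can pass their lock unconditionally: on x86-64 the
assist macros replace it with NULL and the helper issues a single 64-bit
write, so there is no locking overhead. For example, the CQ arm path in
mlx5_rxq.c (shown earlier in this patch) becomes:

    /* Single rte_write64_relaxed() on 64-bit builds; two ordered
     * 32-bit writes under rxq->uar_lock_cq on i686.
     */
    mlx5_uar_write64(rte_cpu_to_be_64(doorbell),
                     cq_db_reg, rxq->uar_lock_cq);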
> > @@ -614,7 +681,7 @@ uint16_t mlx5_rx_burst_vec(void *dpdk_txq, struct rte_mbuf **pkts,
> >  	*txq->qp_db = rte_cpu_to_be_32(txq->wqe_ci);
> >  	/* Ensure ordering between DB record and BF copy. */
> >  	rte_wmb();
> > -	*dst = *src;
> > +	mlx5_uar_write64_relaxed(*src, dst, txq->uar_lock);
> >  	if (cond)
> >  		rte_wmb();
> >  }
> > diff --git a/drivers/net/mlx5/mlx5_txq.c b/drivers/net/mlx5/mlx5_txq.c
> > index 669b913..dc786d4 100644
> > --- a/drivers/net/mlx5/mlx5_txq.c
> > +++ b/drivers/net/mlx5/mlx5_txq.c
> > @@ -255,6 +255,9 @@
> >  	struct mlx5_txq_ctrl *txq_ctrl;
> >  	int already_mapped;
> >  	size_t page_size = sysconf(_SC_PAGESIZE);
> > +#ifndef RTE_ARCH_64
> > +	unsigned int lock_idx;
> > +#endif
> >
> >  	memset(pages, 0, priv->txqs_n * sizeof(uintptr_t));
> >  	/*
> > @@ -281,7 +284,7 @@
> >  		}
> >  		/* new address in reserved UAR address space. */
> >  		addr = RTE_PTR_ADD(priv->uar_base,
> > -				   uar_va & (MLX5_UAR_SIZE - 1));
> > +				   uar_va & (uintptr_t)(MLX5_UAR_SIZE - 1));
> >  		if (!already_mapped) {
> >  			pages[pages_n++] = uar_va;
> >  			/* fixed mmap to specified address in reserved
> > @@ -305,6 +308,12 @@
> >  		else
> >  			assert(txq_ctrl->txq.bf_reg ==
> >  			       RTE_PTR_ADD((void *)addr, off));
> > +#ifndef RTE_ARCH_64
> > +		/* Assign a UAR lock according to UAR page number. */
> > +		lock_idx = (txq_ctrl->uar_mmap_offset / page_size) &
> > +			   MLX5_UAR_PAGE_NUM_MASK;
> > +		txq->uar_lock = &priv->uar_lock[lock_idx];
> > +#endif
> >  	}
> >  	return 0;
> >  }
> > @@ -511,6 +520,8 @@ struct mlx5_txq_ibv *
> >  	rte_atomic32_inc(&txq_ibv->refcnt);
> >  	if (qp.comp_mask & MLX5DV_QP_MASK_UAR_MMAP_OFFSET) {
> >  		txq_ctrl->uar_mmap_offset = qp.uar_mmap_offset;
> > +		DRV_LOG(DEBUG, "port %u: uar_mmap_offset 0x%lx",
> > +			dev->data->port_id, txq_ctrl->uar_mmap_offset);
> >  	} else {
> >  		DRV_LOG(ERR,
> >  			"port %u failed to retrieve UAR info, invalid"
> > --
> > 1.8.3.1
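As a closing note on the lock assignment in the UAR remap function above:
the index is just the UAR page number folded into the 64-slot array, so
queues whose doorbell registers share a UAR page also share a spinlock.
A small sketch with hypothetical offsets:

    #include <stddef.h>
    #include <stdint.h>

    #define MLX5_UAR_PAGE_NUM_MAX 64
    #define MLX5_UAR_PAGE_NUM_MASK ((MLX5_UAR_PAGE_NUM_MAX) - 1)

    static unsigned int
    uar_lock_slot(uintptr_t uar_mmap_offset, size_t page_size)
    {
        return (uar_mmap_offset / page_size) & MLX5_UAR_PAGE_NUM_MASK;
    }

    /* With 4 KiB pages, offsets 0x3000 and 0x3800 both live on page 3,
     * so uar_lock_slot() returns 3 for both and the two queues take
     * priv->uar_lock[3], serializing their doorbell writes as the PRM
     * requires.
     */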