> On Sep 12, 2018, at 1:56 PM, Yongseok Koh <ys...@mellanox.com> wrote:
>
> Hi, Christian
>
> We've recently encountered a weird issue with Ubuntu 18.04 on the Skylake
> server. I can always reproduce this crash and have narrowed it down. I guess
> it could be a GCC issue.
>
>
> [1] How to reproduce
> - ConnectX-4Lx/ConnectX-5 with mlx5 PMD in DPDK 18.02.1
> - Ubuntu 18.04 on Intel Skylake server
> - gcc (Ubuntu 7.3.0-16ubuntu3) 7.3.0
> - Testpmd crashes when it starts to forward traffic. Easy to reproduce.
> - Only happens on the Skylake server.
> - DPDK 18.05 and later don't have such an issue. git-bisect gives no clue.

This is because I enabled MEMPOOL_DEBUG and MLX5_DEBUG. Since the mempool
functions and rte_memcpy() are inlined, the generated code should be affected
by those options. Now I can see the crash regardless of the version: 18.02,
18.05 and 18.08.

Thanks,
Yongseok

>
>
> [2] Failure point
>
> The attached patch gives insight into why it crashes. The following is the
> result of the patch and the GDB commands.
>
> In summary, rte_memcpy() doesn't work as expected. In __mempool_generic_put(),
> there's an rte_memcpy() to move the array of objects to the lcore cache. If I
> run memcmp() right after rte_memcpy(dst, src, n), the data in dst differs from
> the data in src. It looks like some of the data got shifted by a few bytes, as
> you can see below.
>
> [GDB command]
> $dst = 0x7ffff4e09ea8
> $src = 0x7fffce3fb970
> $n = 256
> x/32gx 0x7ffff4e09ea8
> x/32gx 0x7fffce3fb970
> testpmd: /home/mlnxtest/dpdk/build/include/rte_mempool.h:1140: __mempool_generic_put: Assertion `0' failed.
>
> Thread 4 "lcore-slave-1" received signal SIGABRT, Aborted.
> [Switching to Thread 0x7fffce3ff700 (LWP 69913)]
> (gdb) x/32gx 0x7ffff4e09ea8
> 0x7ffff4e09ea8: 0x00007fffaac38ec0 0x00007fffaac38500
> 0x7ffff4e09eb8: 0x00007fffaac37b40 0x00007fffaac37180
> 0x7ffff4e09ec8: 0x850000007fffaac3 0x7b4000007fffaac3
> 0x7ffff4e09ed8: 0x00007fffaac35440 0x00007fffaac34a80
> 0x7ffff4e09ee8: 0xaac3850000007fff 0xaac37b4000007fff
> 0x7ffff4e09ef8: 0x00007fffaac32d40 0x00007fffaac32380
> 0x7ffff4e09f08: 0x7fffaac385000000 0x7fffaac37b400000
> 0x7ffff4e09f18: 0x00007fffaac30640 0x00007fffaac2fc80
> 0x7ffff4e09f28: 0x00007fffaac2f2c0 0x00007fffaac2e900
> 0x7ffff4e09f38: 0x00007fffaac2df40 0x00007fffaac2d580
> 0x7ffff4e09f48: 0x00007fffaac2cbc0 0x00007fffaac2c200
> 0x7ffff4e09f58: 0x00007fffaac2b840 0x00007fffaac2ae80
> 0x7ffff4e09f68: 0x00007fffaac2a4c0 0x00007fffaac29b00
> 0x7ffff4e09f78: 0x00007fffaac29140 0x00007fffaac28780
> 0x7ffff4e09f88: 0x00007fffaac27dc0 0x00007fffaac27400
> 0x7ffff4e09f98: 0x00007fffaac26a40 0x00007fffaac26080
> (gdb) x/32gx 0x7fffce3fb970
> 0x7fffce3fb970: 0x00007fffaac38ec0 0x00007fffaac38500
> 0x7fffce3fb980: 0x00007fffaac37b40 0x00007fffaac37180
> 0x7fffce3fb990: 0x00007fffaac367c0 0x00007fffaac35e00
> 0x7fffce3fb9a0: 0x00007fffaac35440 0x00007fffaac34a80
> 0x7fffce3fb9b0: 0x00007fffaac340c0 0x00007fffaac33700
> 0x7fffce3fb9c0: 0x00007fffaac32d40 0x00007fffaac32380
> 0x7fffce3fb9d0: 0x00007fffaac319c0 0x00007fffaac31000
> 0x7fffce3fb9e0: 0x00007fffaac30640 0x00007fffaac2fc80
> 0x7fffce3fb9f0: 0x00007fffaac2f2c0 0x00007fffaac2e900
> 0x7fffce3fba00: 0x00007fffaac2df40 0x00007fffaac2d580
> 0x7fffce3fba10: 0x00007fffaac2cbc0 0x00007fffaac2c200
> 0x7fffce3fba20: 0x00007fffaac2b840 0x00007fffaac2ae80
> 0x7fffce3fba30: 0x00007fffaac2a4c0 0x00007fffaac29b00
> 0x7fffce3fba40: 0x00007fffaac29140 0x00007fffaac28780
> 0x7fffce3fba50: 0x00007fffaac27dc0 0x00007fffaac27400
> 0x7fffce3fba60: 0x00007fffaac26a40 0x00007fffaac26080
>
>
> AFAIK, AVX512F support is disabled by default in DPDK as it is still
> experimental (CONFIG_RTE_ENABLE_AVX512=n). But with gcc optimization, the AVX2
> version of rte_memcpy() seems to be optimized with 512-bit instructions. If I
> disable that by adding EXTRA_CFLAGS="-mno-avx512f", then it works fine and
> doesn't crash.
>
> Do you have any idea regarding this issue, or are you already aware of it?
>
>
> Thanks,
> Yongseok
>
>
> $ git diff
> diff --git a/config/common_base b/config/common_base
> index ad03cf433..f512b5a88 100644
> --- a/config/common_base
> +++ b/config/common_base
> @@ -275,8 +275,8 @@ CONFIG_RTE_LIBRTE_MLX4_TX_MP_CACHE=8
>  #
>  # Compile burst-oriented Mellanox ConnectX-4 & ConnectX-5 (MLX5) PMD
>  #
> -CONFIG_RTE_LIBRTE_MLX5_PMD=n
> -CONFIG_RTE_LIBRTE_MLX5_DEBUG=n
> +CONFIG_RTE_LIBRTE_MLX5_PMD=y
> +CONFIG_RTE_LIBRTE_MLX5_DEBUG=y
>  CONFIG_RTE_LIBRTE_MLX5_DLOPEN_DEPS=n
>  CONFIG_RTE_LIBRTE_MLX5_TX_MP_CACHE=8
>
> @@ -597,7 +597,7 @@ CONFIG_RTE_RING_USE_C11_MEM_MODEL=n
>  #
>  CONFIG_RTE_LIBRTE_MEMPOOL=y
>  CONFIG_RTE_MEMPOOL_CACHE_MAX_SIZE=512
> -CONFIG_RTE_LIBRTE_MEMPOOL_DEBUG=n
> +CONFIG_RTE_LIBRTE_MEMPOOL_DEBUG=y
>
>  #
>  # Compile Mempool drivers
> diff --git a/lib/librte_mempool/rte_mempool.h b/lib/librte_mempool/rte_mempool.h
> index 8b1b7f7ed..9f48028d9 100644
> --- a/lib/librte_mempool/rte_mempool.h
> +++ b/lib/librte_mempool/rte_mempool.h
> @@ -39,6 +39,7 @@
>  #include <errno.h>
>  #include <inttypes.h>
>  #include <sys/queue.h>
> +#include <assert.h>
>
>  #include <rte_config.h>
>  #include <rte_spinlock.h>
> @@ -1123,6 +1124,22 @@ __mempool_generic_put(struct rte_mempool *mp, void * const *obj_table,
>  	/* Add elements back into the cache */
>  	rte_memcpy(&cache_objs[0], obj_table, sizeof(void *) * n);
>
> +	if(memcmp(&cache_objs[0], obj_table, sizeof(void *) * n)) {
> +		printf("[GDB command] \n"
> +		       "$dst = %p\n"
> +		       "$src = %p\n"
> +		       "$n = %ld\n"
> +		       "x/%ldgx %p\n"
> +		       "x/%ldgx %p\n",
> +		       (void *)&cache_objs[0],
> +		       (const void *)obj_table,
> +		       sizeof(void *) * n,
> +		       sizeof(void *) * n / 8, (void *)&cache_objs[0],
> +		       sizeof(void *) * n / 8, (const void *)obj_table
> +		       );
> +		assert(0);
> +	}
> +
>  	cache->len += n;
>
>  	if (cache->len >= cache->flushthresh) {
>