Actually, can you try with the latest v4.1.x nightly snapshot tarball?

We have fixed several AVX-related issues since v4.1.0 was released:

    https://www.open-mpi.org/nightly/v4.1.x/



On Feb 5, 2021, at 2:54 PM, George Bosilca via users 
<users@lists.open-mpi.org> wrote:

Carl,

AVX support was introduced in 4.1, which explains why you did not have such 
issues before. What is your configure command in these two cases? Please create 
an issue on GitHub and attach your config.log.

  George.



On Fri, Feb 5, 2021 at 2:44 PM Carl Ponder via users 
<users@lists.open-mpi.org> wrote:
Building OpenMPI 4.1.0 with the PGI 21.1 compiler on a Broadwell processor, I 
get this error:
libtool: compile:  pgcc -DHAVE_CONFIG_H -I. -I../../../../opal/include 
-I../../../../ompi/include -I../../../../oshmem/include 
-I../../../../opal/mca/hwloc/hwloc201/hwloc/include/private/autogen 
-I../../../../opal/mca/hwloc/hwloc201/hwloc/include/hwloc/autogen 
-I../../../../ompi/mpiext/cuda/c -DGENERATE_SSE3_CODE -DGENERATE_SSE41_CODE 
-DGENERATE_AVX_CODE -DGENERATE_AVX2_CODE -DGENERATE_AVX512_CODE -I../../../.. 
-I../../../../orte/include 
-I/gpfs/fs1/SHARE/Utils/PGI/21.1/Linux_x86_64/21.1/cuda/11.2/include 
-DUCS_S_PACKED= -I/gpfs/fs1/SHARE/Utils/ZLib/1.2.11/PGI-21.1/include 
-I/gpfs/fs1/SHARE/Utils/HWLoc/2.4.0/PGI-21.1_CUDA-11.2.0.0_460.27.04/include 
-I/usr/local/include -I/usr/local/include -march=skylake-avx512 -O3 -DNDEBUG 
-m64 -tp=px -Mnodalign -fno-strict-aliasing -c op_avx_functions.c -MD  -fPIC 
-DPIC -o .libs/liblocal_ops_avx512_la-op_avx_functions.o
LLVM ERROR: Cannot select: intrinsic %llvm.x86.sse3.ldu.dq
Makefile:1993: recipe for target 'liblocal_ops_avx512_la-op_avx_functions.lo' failed
make[2]: *** [liblocal_ops_avx512_la-op_avx_functions.lo] Error 1
make[2]: Leaving directory '/gpfs/fs1/SHARE/Utils/OpenMPI/4.1.0/PGI-21.1_CUDA-11.2.0.0_460.27.04_UCX-1.10.0-rc2_HWLoc-2.4.0_ZLib-1.2.11/distro/ompi/mca/op/avx'
and the GCC 10.2.0 compiler gives me errors like this:
op_avx_functions.c: In function ‘ompi_op_avx_2buff_bxor_uint64_t_avx512’:
op_avx_functions.c:208:21: warning: AVX512F vector return without AVX512F enabled changes the ABI [-Wpsabi]
  208 |             __m512i vecA =  _mm512_loadu_si512((__m512i*)in);           \
      |                     ^~~~
op_avx_functions.c:263:5: note: in expansion of macro ‘OP_AVX_AVX512_BIT_FUNC’
  263 |     OP_AVX_AVX512_BIT_FUNC(name, type_size, type, op);                  \
      |     ^~~~~~~~~~~~~~~~~~~~~~
op_avx_functions.c:573:5: note: in expansion of macro ‘OP_AVX_BIT_FUNC’
  573 |     OP_AVX_BIT_FUNC(bxor, 64, uint64_t, xor)
      |     ^~~~~~~~~~~~~~~
In file included from /gpfs/fs1/SHARE/Utils/GCC/10.2.0/GCC-BASE-7.5.0_GMP-6.2.1_ISL-0.23_MPFR-4.1.0_MPC-1.2.1_CUDA-11.2.0.0_460.27.04/lib/gcc/x86_64-pc-linux-gnu/10.2.0/include/immintrin.h:55,
                 from op_avx_functions.c:26:
op_avx_functions.c: In function ‘ompi_op_avx_2buff_max_int8_t_avx512’:
/gpfs/fs1/SHARE/Utils/GCC/10.2.0/GCC-BASE-7.5.0_GMP-6.2.1_ISL-0.23_MPFR-4.1.0_MPC-1.2.1_CUDA-11.2.0.0_460.27.04/lib/gcc/x86_64-pc-linux-gnu/10.2.0/include/avx512fintrin.h:6429:1: error: inlining failed in call to ‘always_inline’ ‘_mm512_storeu_si512’: target specific option mismatch
I can get 4.1.0 to build with GCC by removing these flags:
-march=corei7-avx -mtune=corei7-avx
and with PGI by removing this flag:
-tp=px
I didn't have these issues with the OpenMPI 4.0.4 source. Is there a bug in 
4.1.0?
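
For readers following the GCC error above: GCC reports "target specific option mismatch" when an always_inline intrinsic such as _mm512_storeu_si512 is called from a function whose effective target options do not enable that instruction set. The sketch below is not Open MPI's code. It is a minimal illustration of the general pattern (a vector variant plus a scalar fallback, selected at run time), using GCC/Clang per-function target attributes instead of the per-file -march flags seen in the build log. The names bxor_u64, bxor_u64_avx2, bxor_u64_scalar and HAVE_X86_DISPATCH are hypothetical, and AVX2 stands in for AVX512 so that it can run on a Broadwell-class CPU.

```c
#include <stddef.h>
#include <stdint.h>

/* Assumption: GCC or Clang on x86-64; anywhere else only the scalar path is built. */
#if defined(__x86_64__) && defined(__GNUC__)
#include <immintrin.h>
#define HAVE_X86_DISPATCH 1
#else
#define HAVE_X86_DISPATCH 0
#endif

/* Portable fallback: out[i] ^= in[i]. */
static void bxor_u64_scalar(const uint64_t *in, uint64_t *out, size_t n)
{
    for (size_t i = 0; i < n; i++) {
        out[i] ^= in[i];
    }
}

#if HAVE_X86_DISPATCH
/* The target attribute enables AVX2 for this one function only, so the
 * always_inline intrinsics can be inlined here even though the rest of
 * the file is compiled without -mavx2. Without the attribute (or a
 * matching -march flag), GCC rejects the intrinsic calls with
 * "target specific option mismatch". */
__attribute__((target("avx2")))
static void bxor_u64_avx2(const uint64_t *in, uint64_t *out, size_t n)
{
    size_t i = 0;
    for (; i + 4 <= n; i += 4) {   /* four 64-bit lanes per 256-bit vector */
        __m256i a = _mm256_loadu_si256((const __m256i *)(in + i));
        __m256i b = _mm256_loadu_si256((const __m256i *)(out + i));
        _mm256_storeu_si256((__m256i *)(out + i), _mm256_xor_si256(a, b));
    }
    for (; i < n; i++) {           /* scalar tail for the leftover elements */
        out[i] ^= in[i];
    }
}
#endif

/* Run-time dispatch: use the AVX2 variant only if the CPU reports support. */
void bxor_u64(const uint64_t *in, uint64_t *out, size_t n)
{
#if HAVE_X86_DISPATCH
    if (__builtin_cpu_supports("avx2")) {
        bxor_u64_avx2(in, out, n);
        return;
    }
#endif
    bxor_u64_scalar(in, out, n);
}
```

Calling bxor_u64(in, out, n) XORs in into out element by element on any CPU. Only the code path differs, which is why the vector variant must be compiled with the ISA enabled while the caller need not be.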


--
Jeff Squyres
jsquy...@cisco.com