I ran into a problem with 4.1.0 several weeks ago,
and no longer recall precisely how; I am now rebuilding
both 4.1.0 and a recent 4.1.x, then will use them to
build GROMACS, probably the application I was attempting
back then.

But I do have this from my notes (for 4.1.0):

mpicc -fopenmp hybrid_hello.c
export OMP_NUM_THREADS=2
mpirun -np 2 ./a.out
# [server.clearlight.us:18349] mca_base_component_repository_open: unable to open mca_op_avx: /home/maxd/XXXXXXXXXXXXXXXXXX/opt/gnu/openmpi-4.1.0_gcc-10.2.0/lib/openmpi/mca_op_avx.so: undefined symbol: ompi_op_avx_functions_avx512 (ignored)
# [server.clearlight.us:18348] mca_base_component_repository_open: unable to open mca_op_avx: /home/maxd/XXXXXXXXXXXXXXXXXX/opt/gnu/openmpi-4.1.0_gcc-10.2.0/lib/openmpi/mca_op_avx.so: undefined symbol: ompi_op_avx_functions_avx512 (ignored)
# Hello from thread 0 out of 2 from process 0 out of 2 on server.clearlight.us
# Hello from thread 1 out of 2 from process 0 out of 2 on server.clearlight.us
# Hello from thread 0 out of 2 from process 1 out of 2 on server.clearlight.us
# Hello from thread 1 out of 2 from process 1 out of 2 on server.clearlight.us

(where I X-ed out confidential details).  Not an error,
but surely indicative of something amiss.

More to come!


On Thu, Feb 11, 2021 at 02:02:48AM +0000, Jeff Squyres (jsquyres) via users wrote:
> I think Max did try the latest 4.1 nightly build (from an off-list email), 
> and his problem still persisted.
> 
> Max: can you describe exactly how Open MPI failed?  All you said was:
> 
> >> Consequently AVX512 intrinsic functions were erroneously
> >> deployed, resulting in OpenMPI failure.
> 
> Can you provide more details?
> 
> 
> > On Feb 10, 2021, at 6:09 PM, Gilles Gouaillardet via users 
> > <users@lists.open-mpi.org> wrote:
> > 
> > Max,
> > 
> > at configure time, Open MPI detects the *compiler* capabilities.
> > In your case, your compiler can emit AVX512 code.
> > (and fwiw, the tests are only compiled and never executed)
> > 
> > Then at *runtime*, Open MPI detects the *CPU* capabilities.
> > In your case, it should not invoke the functions containing AVX512 code.
> > 
> > That being said, several changes were made to the op/avx component,
> > so if you are experiencing some crashes, I do invite you to give the
> > latest nightly snapshot for the v4.1.x branch a try.
> > 
> > 
> > Cheers,
> > 
> > Gilles
> > 
> > On Wed, Feb 10, 2021 at 10:43 PM Max R. Dechantsreiter via users
> > <users@lists.open-mpi.org> wrote:
> >> 
> >> Configuring OpenMPI 4.1.0 with GCC 10.2.0 on
> >> Intel(R) Xeon(R) CPU E5-2620 v3, a Haswell processor
> >> that supports AVX2 but not AVX512, resulted in
> >> 
> >> checking for AVX512 support (no additional flags)... no
> >> checking for AVX512 support (with -march=skylake-avx512)... yes
> >> 
> >> in "configure" output, and in config.log
> >> 
> >> MCA_BUILD_ompi_op_has_avx512_support_FALSE='#'
> >> MCA_BUILD_ompi_op_has_avx512_support_TRUE=''
> >> 
> >> Consequently AVX512 intrinsic functions were erroneously
> >> deployed, resulting in OpenMPI failure.
> >> 
> >> The relevant test code was in essence
> >> 
> >> cat > conftest.c << EOF
> >> #include <immintrin.h>
> >> 
> >> int main()
> >> {
> >>        __m512 vA, vB;
> >> 
> >>        _mm512_add_ps(vA, vB);
> >> 
> >>        return 0;
> >> }
> >> EOF
> >> 
> >> The problem with this is that the result of the function
> >> is never used, so at optimization level higher than O0
> >> the compiler eliminates the function as "dead code" (DCE).
> >> To wit,
> >> 
> >> gcc -O3 -march=skylake-avx512 -S conftest.c
> >> 
> >> yields
> >> 
> >>        .file   "conftest.c"
> >>        .text
> >>        .section        .text.startup,"ax",@progbits
> >>        .p2align 4
> >>        .globl  main
> >>        .type   main, @function
> >> main:
> >> .LFB5345:
> >>        .cfi_startproc
> >>        xorl    %eax, %eax
> >>        ret
> >>        .cfi_endproc
> >> .LFE5345:
> >>        .size   main, .-main
> >>        .ident  "GCC: (GNU) 10.2.0"
> >>        .section        .note.GNU-stack,"",@progbits
> >> 
> >> Compare this with the result of
> >> 
> >> gcc -O0 -march=skylake-avx512 -S conftest.c
> >> 
> >> in which the function IS called:
> >> 
> >>        .file   "conftest.c"
> >>        .text
> >>        .globl  main
> >>        .type   main, @function
> >> main:
> >> .LFB4092:
> >>        .cfi_startproc
> >>        pushq   %rbp
> >>        .cfi_def_cfa_offset 16
> >>        .cfi_offset 6, -16
> >>        movq    %rsp, %rbp
> >>        .cfi_def_cfa_register 6
> >>        andq    $-64, %rsp
> >>        subq    $136, %rsp
> >>        vmovaps 72(%rsp), %zmm0
> >>        vmovaps %zmm0, -56(%rsp)
> >>        vmovaps 8(%rsp), %zmm0
> >>        vmovaps %zmm0, -120(%rsp)
> >>        movl    $0, %eax
> >>        leave
> >>        .cfi_def_cfa 7, 8
> >>        ret
> >>        .cfi_endproc
> >> .LFE4092:
> >>        .size   main, .-main
> >>        .ident  "GCC: (GNU) 10.2.0"
> >>        .section        .note.GNU-stack,"",@progbits
> >> 
> >> Note the use of a 512-bit ZMM register - ZMM registers
> >> are used only by AVX512 instructions.  Hence at O3 the
> >> test program does not detect the lack of AVX512 support
> >> by the host processor.
> >> 
> >> An easy remedy would be to declare the operands as
> >> "volatile" and thereby force the compiler to invoke the
> >> function:
> >> 
> >> cat > conftest.c << EOF
> >> #include <immintrin.h>
> >> 
> >> int main()
> >> {
> >>        volatile __m512 vA, vB;
> >> 
> >>        _mm512_add_ps(vA, vB);
> >> 
> >>        return 0;
> >> }
> >> EOF
> >> 
> >> Compiled at O3, the resulting executable dumps core as it
> >> should when run on my Haswell processor, returning nonzero
> >> exit status ($?), which would inform "configure" that the
> >> processor does not have AVX512 capability.
> >> 
> >> Finally please note that this error could affect the
> >> detection of support for other instruction sets on other
> >> families of processors: compiler optimization must be
> >> inhibited for such tests to be reliable!
> >> 
> >> Max
> >> ---
> >> Max R. Dechantsreiter
> >> President
> >> Performance Jones L.L.C.
> >> m...@performancejones.com
> >> Skype: PerformanceJones (UTC+01:00)
> >> +1 414 446-3100 (telephone/voicemail)
> >> http://www.linkedin.com/in/benchmarking
> 
> 
> -- 
> Jeff Squyres
> jsquy...@cisco.com
> 
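For reference, the split Gilles describes (AVX512 code is built whenever the compiler can emit it, and is only invoked after a runtime check of the CPU) can be sketched as below. This is not Open MPI's op/avx code, only a minimal illustration assuming GCC or Clang on x86. The names add_floats, add_avx512, add_scalar and cpu_has_avx512f are hypothetical. The AVX512 path stores its result, so the optimizer cannot discard it as dead code.

```c
#include <stddef.h>

#if defined(__x86_64__) || defined(__i386__)
#include <immintrin.h>
#define HAVE_X86 1
#else
#define HAVE_X86 0
#endif

/* Portable fallback: plain element-wise addition. */
static void add_scalar(const float *a, const float *b, float *out, size_t n)
{
    for (size_t i = 0; i < n; i++)
        out[i] = a[i] + b[i];
}

#if HAVE_X86
/* Hypothetical AVX512 kernel. The target attribute (GCC/Clang) enables
 * AVX512F for this one function only, so the file builds without
 * -march=skylake-avx512 and the rest of the program stays safe to run
 * on older CPUs such as Haswell. */
__attribute__((target("avx512f")))
static void add_avx512(const float *a, const float *b, float *out, size_t n)
{
    size_t i = 0;
    for (; i + 16 <= n; i += 16) {
        __m512 va = _mm512_loadu_ps(a + i);
        __m512 vb = _mm512_loadu_ps(b + i);
        /* The sum is stored, so it is live and cannot be eliminated. */
        _mm512_storeu_ps(out + i, _mm512_add_ps(va, vb));
    }
    for (; i < n; i++)          /* remaining elements */
        out[i] = a[i] + b[i];
}
#endif

/* Runtime check of the CPU (not the compiler): 1 if AVX512F is usable. */
int cpu_has_avx512f(void)
{
#if HAVE_X86
    __builtin_cpu_init();
    return __builtin_cpu_supports("avx512f") ? 1 : 0;
#else
    return 0;
#endif
}

/* Dispatcher: take the AVX512 path only when the CPU reports support. */
void add_floats(const float *a, const float *b, float *out, size_t n)
{
#if HAVE_X86
    if (cpu_has_avx512f()) {
        add_avx512(a, b, out, n);
        return;
    }
#endif
    add_scalar(a, b, out, n);
}
```

Calling add_floats(a, b, out, n) on a Haswell machine should take the scalar path, and on an AVX512-capable machine the vector path. In both cases no AVX512 instruction is executed unless cpu_has_avx512f() returned 1.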
