Re: [OMPI users] OpenMPI 4.1.1, CentOS 7.9, NVIDIA HPC SDK, build hints?

2021-09-30 Thread Raymond Muno via users
 Added --enable-mca-no-build=op-avx to the configure line. Still dies in the same place. CCLD mca_op_avx.la ./.libs/liblocal_ops_avx512.a(liblocal_ops_avx512_la-op_avx_functions.o):(.data+0x0): multiple definition of `ompi_op_avx_functions_avx2' ./.libs/liblocal_ops_avx2.a(liblocal_ops_a
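For context, disabling the op/avx component at configure time would normally look something like the sketch below; the install prefix and NVIDIA HPC SDK compiler names are illustrative assumptions, not taken from the original report (and the thread notes the error persisted regardless).

  # Hypothetical configure invocation; adjust prefix and compilers for your site.
  ./configure --prefix=/opt/openmpi-4.1.1 \
      CC=nvc CXX=nvc++ FC=nvfortran \
      --enable-mca-no-build=op-avx
  make -j8 && make install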

[OMPI users] OpenMPI 4.0.2 with PGI 19.10, will not build with hcoll

2020-01-24 Thread Raymond Muno via users
I am having issues building OpenMPI 4.0.2 using the PGI 19.10 compilers. OS is CentOS 7.7, MLNX_OFED 4.7.3. It dies at: PGC/x86-64 Linux 19.10-0: compilation completed with warnings CCLD mca_coll_hcoll.la pgcc-Error-Unknown switch: -pthread make[2]: *** [mca_coll_hcoll.la] Error 1 make[2]
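Two workarounds that have circulated for this class of pgcc/-pthread failure are sketched below; the siterc syntax is an assumption to verify against the PGI release in use, and skipping hcoll simply trades away that component.

  # Option 1 (assumed siterc syntax, check your PGI docs): add a siterc file in the
  # PGI bin directory so pgcc accepts -pthread, e.g. a line such as
  #   switch -pthread is replace(-lpthread) positional(linker);
  # Option 2: build Open MPI without the hcoll component.
  ./configure CC=pgcc CXX=pgc++ FC=pgfortran --without-hcoll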

Re: [OMPI users] [External] Re: AMD EPYC 7281: does NOT support binding memory to the process location

2020-01-08 Thread Raymond Muno via users
AMD lists the minimum supported kernel for EPYC/Naples as RHEL/CentOS kernel 3.10.0-862, which is RHEL/CentOS 7.5 or later. Upgraded kernels can be used in 7.4. http://developer.amd.com/wp-content/resources/56420.pdf -Ray Muno On 1/8/20 7:37 PM, Raymond Muno wrote: We are running EPYC 7451 and
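A quick way to check a node against that minimum (the release named in the comment is only an example):

  # RHEL/CentOS 7.5 ships kernel 3.10.0-862; anything older predates EPYC support.
  uname -r
  cat /etc/centos-release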

Re: [OMPI users] [External] Re: AMD EPYC 7281: does NOT support binding memory to the process location

2020-01-08 Thread Raymond Muno via users
We are running EPYC 7451 and 7702 nodes. I do not recall that CentOS 6 was able to support these. We moved on to CentOS 7.6 at first and are now running 7.7 to support the EPYC2/Rome nodes. The kernel in earlier releases did not support x2APIC and could not handle 256 threads. Not an issue on

[OMPI users] Parameters at run time

2019-10-19 Thread Raymond Muno via users
Is there a way to determine, at run time, what choices OpenMPI made in terms of the transports being utilized? We want to verify we are running UCX over InfiniBand. I have two users, executing identical code with the same mpirun options, getting vastly different execution time
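A few generic ways to make the transport selection visible (output format varies by Open MPI/UCX version, and ./a.out stands in for the real application):

  # Was UCX support compiled in at all?
  ompi_info | grep -i ucx

  # Ask Open MPI to report its PML selection at startup.
  mpirun --mca pml ucx --mca pml_base_verbose 10 -np 4 ./a.out

  # Ask UCX itself to log the devices/transports it picks.
  UCX_LOG_LEVEL=info mpirun -np 4 ./a.out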

Re: [OMPI users] UCX errors after upgrade

2019-10-02 Thread Raymond Muno via users
) wrote: Thanks Raymond; I have filed an issue for this on GitHub and tagged the relevant Mellanox people: https://github.com/open-mpi/ompi/issues/7009 On Sep 25, 2019, at 3:09 PM, Raymond Muno via users  wrote: We are running against 4.0.2RC2 now. T

Re: [OMPI users] UCX errors after upgrade

2019-09-25 Thread Raymond Muno via users
As a test, I rebooted a set of nodes. The user could run on 480 cores, on 5 nodes. We could not run beyond two nodes prior to that. We still get the VM_UNMAP warning, however. On 9/25/19 2:09 PM, Raymond Muno via users wrote: We are running against 4.0.2RC2 now. This is using current

Re: [OMPI users] UCX errors after upgrade

2019-09-25 Thread Raymond Muno via users
od bug fixes in there since v4.0.1. On Sep 25, 2019, at 2:12 PM, Raymond Muno via users  wrote: We are primarily using OpenMPI 3.1.4 but also have 4.0.1 installed. On our cluster, we were running CentOS 7.5 with updates, alongside MLNX_OFED 4.5.x.

[OMPI users] UCX errors after upgrade

2019-09-25 Thread Raymond Muno via users
We are primarily using OpenMPI 3.1.4 but also have 4.0.1 installed. On our cluster, we were running CentOS 7.5 with updates, alongside MLNX_OFED 4.5.x.   OpenMPI was compiled with GCC, Intel, PGI and AOCC compilers. We could run with no issues. To accommodate updates needed to get our IB gear