Re: [OMPI users] "MCW rank 0 is not bound (or bound to all available processors)" when running multiple jobs concurrently

2024-04-17 Thread Mehmet Oren via users
Hi Greg, I am not an Open MPI expert, but I just wanted to share my experience with HPC-X.
1. Default HPC-X builds, which come with the MOFED drivers, are built with UCX, and as Gilles stated, specifying ob1 will not change the layer for Open MPI. You can try to discard UCX and let the Open MPI deci
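The MCA exclusion being suggested can be sketched as follows (a minimal sketch; `./my_app` and the process count are placeholders, not from the thread):

```shell
# Exclude the UCX PML so Open MPI falls back to another available PML:
# a leading "^" in an MCA selection list means "everything except".
mpirun --mca pml ^ucx -np 4 ./my_app

# Forcing ob1 explicitly, by contrast, may fail on an HPC-X/UCX-centric
# build if no other usable transport component is present:
mpirun --mca pml ob1 -np 4 ./my_app
```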

Re: [OMPI users] "MCW rank 0 is not bound (or bound to all available processors)" when running multiple jobs concurrently

2024-04-17 Thread Greg Samonds via users
Hi Mehmet, Gilles, Thanks for your support on this topic.
* I gave "--mca pml ^ucx" a try, but unfortunately the jobs failed with "MPI_INIT has failed because at least one MPI process is unreachable from another".
* We use a Python-based launcher which launches an mpiexec command throu
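For the binding warning in the thread's subject line, Open MPI can report the binding map each rank receives, which helps confirm whether concurrently launched jobs are being assigned overlapping cores. A minimal diagnostic invocation (the application name is a placeholder) might look like:

```shell
# Print how each rank is bound; overlapping bindings across concurrent
# jobs would show up as the same core sets in each job's output.
mpirun --report-bindings --bind-to core -np 4 ./my_app

# Raise PML selection verbosity to see which PML (ucx, ob1, ...) each
# process actually selects during MPI_Init.
mpirun --mca pml_base_verbose 100 -np 2 ./my_app
```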