Re: [OMPI users] Building PMIx and Slurm support

2019-03-11 Thread Gilles Gouaillardet
Passant, Except for the typo (it should be srun --mpi=pmix_v3), there is nothing wrong with that, and it is working just fine for me (same SLURM version, same PMIx version, same Open MPI version and same Open MPI configure command line), which is why I asked you for some more information/logs in ord

Re: [OMPI users] Building PMIx and Slurm support

2019-03-11 Thread Passant A. Hafez
Hello Gilles, Yes I do use srun --mpi=pmix_3 to run the app, what's the problem with that? Before that, when we tried to launch MPI apps directly with srun, we got the error message saying Slurm missed the PMIx support, that's why we proceeded with the installation. All the best, -- Passant

Re: [OMPI users] Building PMIx and Slurm support

2019-03-11 Thread Gilles Gouaillardet
Passant, I built a similar environment, and had no issue running a simple MPI program. Can you please post your slurm script (I assume it uses srun to start the MPI app), the output of scontrol show config | grep Mpi and the full output of your job ? Cheers, Gilles On 3/12/2019 7:

Re: [OMPI users] local rank to rank comms

2019-03-11 Thread Gilles Gouaillardet
Michael, this is odd, I will have a look. Can you confirm you are running on a single node ? At first, you need to understand which component is used by Open MPI for communications. There are several options here, and since I do not know how Open MPI was built, nor which dependencies are

Re: [OMPI users] Building PMIx and Slurm support

2019-03-11 Thread Passant A. Hafez
Hello, So we now have Slurm 18.08.6-2 compiled with PMIx 3.1.2 then I installed openmpi 4.0.0 with: --with-slurm --with-pmix=internal --with-libevent=internal --enable-shared --enable-static --with-x (Following the thread, it was mentioned that building OMPI 4.0.0 with PMIx 3.1.2 will fa

[OMPI users] Web page update needed?

2019-03-11 Thread Bennet Fauber
From the web page at https://www.open-mpi.org/nightly/ Before deciding which series to download, be sure to read Open MPI's philosophy on version numbers. The short version is that odd numbered release series are "feature" series that eventually morph into even numbered "super stable

Re: [OMPI users] local rank to rank comms

2019-03-11 Thread Michael Di Domenico
On Mon, Mar 11, 2019 at 12:09 PM Gilles Gouaillardet wrote: > You can force > mpirun --mca pml ob1 ... > And btl/vader (shared memory) will be used for intra node communications ... > unless MPI tasks are from different jobs (read MPI_Comm_spawn()) if i run mpirun -n 16 IMB-MPI1 alltoallv thing

Re: [OMPI users] local rank to rank comms

2019-03-11 Thread Michael Di Domenico
On Mon, Mar 11, 2019 at 12:19 PM Ralph H Castain wrote: > OFI uses libpsm2 underneath it when omnipath detected > > > On Mar 11, 2019, at 9:06 AM, Gilles Gouaillardet > > wrote: > > It might show that pml/cm and mtl/psm2 are used. In that case, then yes, > > the OmniPath library is used even fo

Re: [OMPI users] local rank to rank comms

2019-03-11 Thread Ralph H Castain
OFI uses libpsm2 underneath it when omnipath detected Sent from my iPhone > On Mar 11, 2019, at 9:06 AM, Gilles Gouaillardet > wrote: > > Michael, > > You can > > mpirun --mca pml_base_verbose 10 --mca btl_base_verbose 10 --mca > mtl_base_verbose 10 ... > > It might show that pml/cm and m

Re: [OMPI users] local rank to rank comms

2019-03-11 Thread Gilles Gouaillardet
Michael, You can mpirun --mca pml_base_verbose 10 --mca btl_base_verbose 10 --mca mtl_base_verbose 10 ... It might show that pml/cm and mtl/psm2 are used. In that case, then yes, the OmniPath library is used even for intra node communications. If this library is optimized for intra node, then

Re: [OMPI users] local rank to rank comms

2019-03-11 Thread Michael Di Domenico
On Mon, Mar 11, 2019 at 11:51 AM Ralph H Castain wrote: > You are probably using the ofi mtl - could be psm2 uses loopback method? according to ompi_info i do in fact have mtl's ofi,psm,psm2. i haven't changed any of the defaults, so are you saying in order to change the behaviour i have to run mpi

Re: [OMPI users] local rank to rank comms

2019-03-11 Thread Ralph H Castain
You are probably using the ofi mtl - could be psm2 uses loopback method? Sent from my iPhone > On Mar 11, 2019, at 8:40 AM, Michael Di Domenico > wrote: > > i have a user that's claiming when two ranks on the same node want to > talk with each other, they're using the NIC to talk rather than j

[OMPI users] local rank to rank comms

2019-03-11 Thread Michael Di Domenico
i have a user that's claiming when two ranks on the same node want to talk with each other, they're using the NIC to talk rather than just talking directly. i've never had to test such a scenario. is there a way for me to prove one way or another whether two ranks are talking through say the kern