Re: [OMPI users] opal_path_nfs freeze

2020-04-23 Thread Patrick Bégou via users
Hi Jeff, As we say in French, "dans le mille!" [bullseye]: you were right. I'm not the admin of these servers, and a "mpirun not found" was sufficient in my mind. It wasn't. As I had deployed OpenMPI 4.0.2, I launched a new build after setting my LD_LIBRARY_PATH to reach the installed OpenMPI 4.0.2 libs before all othe

[OMPI users] Can't start jobs with srun.

2020-04-23 Thread Prentice Bisbal via users
I'm using OpenMPI 4.0.3 with Slurm 19.05.5. I'm testing the software with a very simple hello, world MPI program that I've used reliably for years. When I submit the job through Slurm and use srun to launch it, I get these errors: *** An error occurred in MPI_Init *** on a NULL communicat

Re: [OMPI users] Can't start jobs with srun.

2020-04-23 Thread Ralph Castain via users
Is Slurm built with PMIx support? Did you tell srun to use it? > On Apr 23, 2020, at 7:00 AM, Prentice Bisbal via users > wrote: > > I'm using OpenMPI 4.0.3 with Slurm 19.05.5. I'm testing the software with a > very simple hello, world MPI program that I've used reliably for years. When > I

Re: [OMPI users] [External] Re: Can't start jobs with srun.

2020-04-23 Thread Prentice Bisbal via users
It looks like it was built with PMI2, but not PMIx:

$ srun --mpi=list
srun: MPI types are...
srun: none
srun: pmi2
srun: openmpi

I did launch the job with srun --mpi=pmi2. Does OpenMPI 4 need PMIx specifically?

On 4/23/20 10:23 AM, Ralph Castain via users wrote: Is Slurm built with PMIx
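
The check described in this message can be scripted. The following is only a minimal sketch, not part of the original thread: it assumes the `srun --mpi=list` text has the same shape as the output quoted above, and the function name `parse_mpi_types` and the `SAMPLE` string are hypothetical names introduced for illustration. It parses captured text rather than calling srun itself, so it runs on a machine without Slurm.

```python
def parse_mpi_types(listing):
    """Return the MPI plugin names found in text printed by `srun --mpi=list`.

    Assumption: every relevant line starts with "srun:" and the first such
    line is a header ("MPI types are..."), as in the output quoted above.
    """
    types = []
    for line in listing.splitlines():
        line = line.strip()
        if not line.startswith("srun:"):
            continue  # ignore anything that is not srun output
        value = line[len("srun:"):].strip()
        # Skip the header line and empty entries.
        if not value or value.lower().startswith("mpi types are"):
            continue
        types.append(value)
    return types


# Sample text copied from the message above.
SAMPLE = """\
srun: MPI types are...
srun: none
srun: pmi2
srun: openmpi
"""

if __name__ == "__main__":
    available = parse_mpi_types(SAMPLE)
    print("available MPI types:", available)
    print("PMIx offered by this Slurm build:", "pmix" in available)
```

On a real cluster, the text could be captured with `subprocess.run(["srun", "--mpi=list"], capture_output=True, text=True)`; whether srun writes the list to stdout or stderr is not stated in the thread, so checking both streams is a reasonable precaution.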

Re: [OMPI users] [External] Re: Can't start jobs with srun.

2020-04-23 Thread Ralph Castain via users
No, but you do have to explicitly build OMPI with non-PMIx support if that is what you are going to use. In this case, you need to configure OMPI --with-pmi2= You can leave off the path (i.e., just "--with-pmi2") if Slurm was installed in a standard location, as we should find it there. > On A

Re: [OMPI users] opal_path_nfs freeze

2020-04-23 Thread Jeff Squyres (jsquyres) via users
On Apr 23, 2020, at 8:50 AM, Patrick Bégou wrote: > > As we say in French, "dans le mille!": you were right. > I'm not the admin of these servers, and a "mpirun not found" was sufficient in > my mind. It wasn't. > > As I had deployed OpenMPI 4.0.2, I launched a new build after setting my > LD_LIBRA

Re: [OMPI users] [External] Re: Can't start jobs with srun.

2020-04-23 Thread Prentice Bisbal via users
--mpi=list shows pmi2 and openmpi as valid values, but if I set --mpi= to either of them, my job still fails. Why is that? Can I not trust the output of --mpi=list? Prentice On 4/23/20 10:43 AM, Ralph Castain via users wrote: No, but you do have to explicitly build OMPI with non-PMIx support

Re: [OMPI users] [External] Re: Can't start jobs with srun.

2020-04-23 Thread Ralph Castain via users
You can trust the --mpi=list. The problem is likely that OMPI wasn't configured --with-pmi2. > On Apr 23, 2020, at 11:59 AM, Prentice Bisbal via users > wrote: > > --mpi=list shows pmi2 and openmpi as valid values, but if I set --mpi= to > either of them, my job still fails. Why is that? Can