Hi Bennet,

Three things...

1. Open MPI 2.x requires PMIx in lieu of PMI-1/PMI-2.
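
For example, Bennet's configure line below would swap --with-pmi for an
external PMIx install (the /opt/pmix path is illustrative, not a default):

   ./configure --prefix=$PREFIX --with-slurm --with-pmix=/opt/pmix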

2. You will need Slurm 16.05 or greater built with --with-pmix.
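
A sketch of such a Slurm build, assuming PMIx is installed under /opt/pmix
(both prefixes are illustrative):

   ./configure --prefix=/opt/slurm --with-pmix=/opt/pmix
   make -j4 && make install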

2a. You will need PMIx 1.1.5, which you can get from GitHub
(https://github.com/pmix/tarballs).
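
Building PMIx itself is a stock autotools install; a minimal sketch,
assuming the 1.1.5 tarball from the link above and the same illustrative
prefix:

   tar -xzf pmix-1.1.5.tar.gz
   cd pmix-1.1.5
   ./configure --prefix=/opt/pmix
   make -j4 && make install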

3. Then, to launch your MPI tasks on the allocated resources:

   srun --mpi=pmix ./hello-mpi
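
Once Slurm is rebuilt, pmix should show up in `srun --mpi=list`. In a
batch script that might look like (node/task counts are illustrative):

   #!/bin/bash
   #SBATCH --nodes=2
   #SBATCH --ntasks-per-node=4

   srun --mpi=pmix ./hello-mpi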

I’m replying to the list because:

a) this information is harder to find than you might think, and
b) someone can correct me if I’m giving a bum steer.

Hope this helps,

Charlie Taylor
University of Florida

> On Nov 16, 2017, at 10:34 AM, Bennet Fauber <ben...@umich.edu> wrote:
> 
> I think that OpenMPI is supposed to support SLURM integration such that
> 
>    srun ./hello-mpi
> 
> should work?  I built OMPI 2.1.2 with
> 
> export CONFIGURE_FLAGS='--disable-dlopen --enable-shared'
> export COMPILERS='CC=gcc CXX=g++ FC=gfortran F77=gfortran'
> 
> CMD="./configure \
>    --prefix=${PREFIX} \
>    --mandir=${PREFIX}/share/man \
>    --with-slurm \
>    --with-pmi \
>    --with-lustre \
>    --with-verbs \
>    $CONFIGURE_FLAGS \
>    $COMPILERS"
> 
> I have a simple hello-mpi.c (source included below), which compiles
> and runs with mpirun, both on the login node and in a job.  However,
> when I try to use srun in place of mpirun, I get instead a hung job,
> which upon cancellation produces this output.
> 
> [bn2.stage.arc-ts.umich.edu:116377] PMI_Init [pmix_s1.c:162:s1_init]:
> PMI is not initialized
> [bn1.stage.arc-ts.umich.edu:36866] PMI_Init [pmix_s1.c:162:s1_init]:
> PMI is not initialized
> [warn] opal_libevent2022_event_active: event has no event_base set.
> [warn] opal_libevent2022_event_active: event has no event_base set.
> slurmstepd: error: *** STEP 86.0 ON bn1 CANCELLED AT 2017-11-16T10:03:24 ***
> srun: Job step aborted: Waiting up to 32 seconds for job step to finish.
> slurmstepd: error: *** JOB 86 ON bn1 CANCELLED AT 2017-11-16T10:03:24 ***
> 
> The SLURM web page suggests that OMPI 2.x and later support PMIx, and
> to use `srun --mpi=pmix`, however that no longer seems to be an
> option, and using the `openmpi` type isn't working (neither is pmi2).
> 
> [bennet@beta-build hello]$ srun --mpi=list
> srun: MPI types are...
> srun: mpi/pmi2
> srun: mpi/lam
> srun: mpi/openmpi
> srun: mpi/mpich1_shmem
> srun: mpi/none
> srun: mpi/mvapich
> srun: mpi/mpich1_p4
> srun: mpi/mpichgm
> srun: mpi/mpichmx
> 
> To get the Intel PMI to work with srun, I have to set
> 
>    I_MPI_PMI_LIBRARY=/usr/lib64/libpmi.so
> 
> Is there a comparable environment variable that must be set to enable
> `srun` to work?
> 
> Am I missing a build option or misspecifying one?
> 
> -- bennet
> 
> 
> Source of hello-mpi.c
> ==========================================
> #include <stdio.h>
> #include <stdlib.h>
> #include "mpi.h"
> 
> int main(int argc, char **argv){
> 
>  int rank;          /* rank of process */
>  int numprocs;      /* size of COMM_WORLD */
>  int namelen;
>  int tag=10;        /* expected tag */
>  int message;       /* Recv'd message */
>  char processor_name[MPI_MAX_PROCESSOR_NAME];
>  MPI_Status status; /* status of recv */
> 
>  /* call Init, size, and rank */
>  MPI_Init(&argc, &argv);
>  MPI_Comm_size(MPI_COMM_WORLD, &numprocs);
>  MPI_Comm_rank(MPI_COMM_WORLD, &rank);
>  MPI_Get_processor_name(processor_name, &namelen);
> 
>  printf("Process %d on %s out of %d\n", rank, processor_name, numprocs);
> 
>  if(rank != 0){
>    MPI_Recv(&message,    /*buffer for message */
>                    1,    /*MAX count to recv */
>              MPI_INT,    /*type to recv */
>                    0,    /*recv from 0 only */
>                  tag,    /*tag of message */
>       MPI_COMM_WORLD,    /*communicator to use */
>              &status);   /*status object */
>    printf("Hello from process %d!\n",rank);
>  }
>  else{
>    /* rank 0 ONLY executes this */
>    printf("MPI_COMM_WORLD is %d processes big!\n", numprocs);
>    int x;
>    for(x=1; x<numprocs; x++){
>       MPI_Send(&x,          /*send x to process x */
>                 1,          /*number to send */
>           MPI_INT,          /*type to send */
>                 x,          /*rank to send to */
>               tag,          /*tag for message */
>     MPI_COMM_WORLD);        /*communicator to use */
>    }
>  } /* end else */
> 
> 
> /* always call at end */
> MPI_Finalize();
> 
> return 0;
> }

_______________________________________________
users mailing list
users@lists.open-mpi.org
https://lists.open-mpi.org/mailman/listinfo/users
