Hi,

Recently our Slurm system was upgraded to 19.0.5. I tried to recompile Open MPI v3.0 because of the bug reported in https://bugs.schedmd.com/show_bug.cgi?id=6993

The configure flags are:

    $ ./configure --enable-shared --enable-static --with-slurm --with-pmix

and the output of ompi_info is as follows:

    $ ompi_info -a | grep pmix
    Configure command line: '--enable-shared' '--enable-static' '--with-slurm' '--with-pmix'
    MCA pmix: isolated (MCA v2.1.0, API v2.0.0, Component v3.0.0)
    MCA pmix: pmix2x (MCA v2.1.0, API v2.0.0, Component v3.0.0)
    MCA pmix base: ---------------------------------------------------
    MCA pmix base: parameter "pmix" (current value: "", data source: default, level: 2 user/detail, type: string)
                   Default selection set of components for the pmix framework (<none> means use all components that can be found)
    MCA pmix base: ---------------------------------------------------
    MCA pmix base: parameter "pmix_base_verbose" (current value: "error", data source: default, level: 8 dev/detail, type: int)
                   Verbosity level for the pmix framework (default: 0)
    MCA pmix base: parameter "pmix_base_async_modex" (current value: "false", data source: default, level: 9 dev/all, type: bool)
    MCA pmix base: parameter "pmix_base_collect_data" (current value: "true", data source: default, level: 9 dev/all, type: bool)
    MCA pmix base: parameter "pmix_base_exchange_timeout" (current value: "-1", data source: default, level: 3 user/all, type: int)
    MCA pmix pmix2x: ---------------------------------------------------
    MCA pmix pmix2x: parameter "pmix_pmix2x_silence_warning" (current value: "false", data source: default, level: 4 tuner/basic, type: bool)

But when I launch the Open MPI program with srun, I get an error like this:

====
$ srun -n 4 ./a.out
--------------------------------------------------------------------------
The application appears to have been direct launched using "srun",
but OMPI was not built with SLURM's PMI support and therefore cannot
execute. There are several options for building PMI support under
SLURM, depending upon the SLURM version you are using:

  version 16.05 or later: you can use SLURM's PMIx support. This
  requires that you configure and build SLURM --with-pmix.

  Versions earlier than 16.05: you must use either SLURM's PMI-1 or
  PMI-2 support. SLURM builds PMI-1 by default, or you can manually
  install PMI-2. You must then build Open MPI using --with-pmi pointing
  to the SLURM PMI library location.

Please configure as appropriate and try again.
--------------------------------------------------------------------------
*** An error occurred in MPI_Init
*** on a NULL communicator
*** MPI_ERRORS_ARE_FATAL (processes in this communicator will now abort,
***    and potentially your MPI job)
Local abort before MPI_INIT completed completed successfully, but am not able to aggregate error messages, and not able to guarantee that all other processes were killed!
===

How can I check whether Open MPI was built with PMI support?

Thanks a lot.
/Jing
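
P.S. To make the question more concrete, here is a minimal sketch of how I am currently summarising the ompi_info output. The helper name list_pmix_components is my own, and it assumes that component lines have the form "MCA pmix: <name> (MCA v...)" as in the output above. It only lists which pmix components were built; it does not prove that srun can use them.

```shell
#!/bin/sh
# Sketch only: print the names of the pmix components found in
# `ompi_info` output read from stdin.
# Assumption: component lines look like
#   "MCA pmix: <name> (MCA v2.1.0, API v2.0.0, Component v3.0.0)".
# Parameter lines such as "MCA pmix base: ..." are deliberately skipped.
list_pmix_components() {
    sed -n 's/^[[:space:]]*MCA pmix:[[:space:]]*\([^[:space:]]*\)[[:space:]]*(.*$/\1/p'
}

# Usage, if ompi_info is on PATH:
#   ompi_info | list_pmix_components
# To compare with what Slurm itself offers:
#   srun --mpi=list
```

On my build this would print only isolated and pmix2x. My guess (unconfirmed) is that a build with Slurm's PMI-1 or PMI-2 support would list additional components here, and that the plugin types shown by srun --mpi=list need to match one of them.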
_______________________________________________ users mailing list users@lists.open-mpi.org https://lists.open-mpi.org/mailman/listinfo/users