I don't suppose you could upgrade to 4.0.2, could you? We just released 4.0.2 with a ton of good bug fixes.

On Oct 15, 2019, at 2:07 PM, Eric F. Alemany via users <users@lists.open-mpi.org> wrote:

Hi,

I am using OpenMPI-4.0.1 on a single ubuntu 18.04 server with 64 cores. I compiled an “hello.c” file with:

mpicc hello.c -o openmpi_hello

When I run

mpirun -np 64 openmpi_hello

I got the following error.

[radoncjonsnow:08747] *** Process received signal ***
[radoncjonsnow:08747] Signal: Segmentation fault (11)
[radoncjonsnow:08747] Signal code: Address not mapped (1)
[radoncjonsnow:08747] Failing at address: 0x7f7b001b0008
[radoncjonsnow:08747] [ 0] /lib/x86_64-linux-gnu/libc.so.6(+0x3ef20)[0x7f7aff89ef20]
[radoncjonsnow:08747] [ 1] /usr/local/.openmpi/lib/pmix/mca_gds_ds21.so(pmix_gds_ds21_lock_init+0x12a)[0x7f7af79d214a]
[radoncjonsnow:08747] [ 2] /usr/local/.openmpi/lib/libmca_common_dstore.so.1(pmix_common_dstor_init+0x86b)[0x7f7af73b1b9b]
[radoncjonsnow:08747] [ 3] /usr/local/.openmpi/lib/pmix/mca_gds_ds21.so(+0x1dd4)[0x7f7af79d1dd4]
[radoncjonsnow:08747] [ 4] /usr/local/.openmpi/lib/openmpi/mca_pmix_pmix3x.so(OPAL_MCA_PMIX3X_pmix_gds_base_select+0x118)[0x7f7afc9f5ee8]
[radoncjonsnow:08747] [ 5] /usr/local/.openmpi/lib/openmpi/mca_pmix_pmix3x.so(OPAL_MCA_PMIX3X_pmix_rte_init+0x813)[0x7f7afc9b0763]
[radoncjonsnow:08747] [ 6] /usr/local/.openmpi/lib/openmpi/mca_pmix_pmix3x.so(OPAL_MCA_PMIX3X_PMIx_Init+0x198)[0x7f7afc96e868]
[radoncjonsnow:08747] [ 7] /usr/local/.openmpi/lib/openmpi/mca_pmix_pmix3x.so(pmix3x_client_init+0xcb)[0x7f7afc91870b]
[radoncjonsnow:08747] [ 8] /usr/local/.openmpi/lib/openmpi/mca_ess_pmi.so(+0x1a56)[0x7f7afd441a56]
[radoncjonsnow:08747] [ 9] /usr/local/.openmpi/lib/libopen-rte.so.40(orte_init+0x291)[0x7f7aff5ba411]
[radoncjonsnow:08747] [10] /usr/local/.openmpi/lib/libmpi.so.40(ompi_mpi_init+0x29c)[0x7f7affca79cc]
[radoncjonsnow:08747] [11] /usr/local/.openmpi/lib/libmpi.so.40(MPI_Init+0x6e)[0x7f7affcd81ae]
[radoncjonsnow:08747] [12] openmpi_hello(+0x88b)[0x55d6a7c1c88b]
[radoncjonsnow:08747] [13] /lib/x86_64-linux-gnu/libc.so.6(__libc_start_main+0xe7)[0x7f7aff881b97]
[radoncjonsnow:08747] [14] openmpi_hello(+0x77a)[0x55d6a7c1c77a]
[radoncjonsnow:08747] *** End of error message ***

That same error is repeated about 32 times. At the end it says

--------------------------------------------------------------------------
Primary job terminated normally, but 1 process returned
a non-zero exit code. Per user-direction, the job has been aborted.
--------------------------------------------------------------------------
--------------------------------------------------------------------------
mpirun noticed that process rank 5 with PID 0 on node radoncjonsnow exited on signal 11 (Segmentation fault).
--------------------------------------------------------------------------

Do you know what I am doing wrong?

Thank you all for your help.

Best,
Eric
____________________________________________________________________________________________________________________________
Eric F. Alemany
Systems Administrator for Research
EXO Extended Operations
Stanford Medicine - Technology & Digital Services
Stanford, California 94305

--
Jeff Squyres
jsquy...@cisco.com