I cannot replicate the problem; both scenarios work fine for me. I'm not convinced your test code is correct, however, as you call MPI_Comm_free on the inter-communicator but never call MPI_Comm_disconnect. Check out the attached code for a correct version and see if it works for you.
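In case the attachment doesn't survive the archive, the pattern I mean is roughly the following (a minimal sketch, not the attached file; it spawns copies of itself via argv[0] so it is self-contained):

/* Minimal sketch: the parent spawns children, and both sides call
 * MPI_Comm_disconnect() on the inter-communicator before MPI_Finalize(),
 * rather than just MPI_Comm_free(). */
#include <mpi.h>
#include <stdio.h>

int main(int argc, char *argv[])
{
    MPI_Comm parent, children;

    MPI_Init(&argc, &argv);
    MPI_Comm_get_parent(&parent);

    if (MPI_COMM_NULL == parent) {
        /* Parent: spawn 4 copies of this binary. */
        MPI_Comm_spawn(argv[0], MPI_ARGV_NULL, 4, MPI_INFO_NULL,
                       0, MPI_COMM_SELF, &children, MPI_ERRCODES_IGNORE);
        /* ... communicate with the children here ... */
        MPI_Comm_disconnect(&children);   /* disconnect, don't just free */
    } else {
        /* Child: disconnect from the parent when done. */
        MPI_Comm_disconnect(&parent);
    }

    MPI_Finalize();
    return 0;
}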
FWIW: I don't know how many cores you have on your sockets, but if you have 6 cores/socket, then your slot-list is equivalent to "--bind-to none", as the slot-list applies to every process being launched.
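If you want to see how a given slot-list actually binds each process, you can add mpiexec's --report-bindings option, e.g. (reusing the command from your report):

  mpiexec -np 1 --host loki --slot-list 0:0-5,1:0-5 --report-bindings spawn_master

On a 2-socket, 6-core/socket node that should report every process as bound to all 12 cores, i.e. effectively unbound.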
[Attachment: simple_spawn.c]
> On May 23, 2016, at 6:26 AM, Siegmar Gross <siegmar.gr...@informatik.hs-fulda.de> wrote:
>
> Hi,
>
> I installed openmpi-1.10.3rc2 on my "SUSE Linux Enterprise Server
> 12 (x86_64)" with Sun C 5.13 and gcc-6.1.0. Unfortunately I get
> a segmentation fault for "--slot-list" for one of my small programs.
>
>
> loki spawn 119 ompi_info | grep -e "OPAL repo revision:" -e "C compiler absolute:"
>       OPAL repo revision: v1.10.2-201-gd23dda8
>      C compiler absolute: /usr/local/gcc-6.1.0/bin/gcc
>
>
> loki spawn 120 mpiexec -np 1 --host loki,loki,loki,loki,loki spawn_master
>
> Parent process 0 running on loki
>   I create 4 slave processes
>
> Parent process 0: tasks in MPI_COMM_WORLD:                    1
>                   tasks in COMM_CHILD_PROCESSES local group:  1
>                   tasks in COMM_CHILD_PROCESSES remote group: 4
>
> Slave process 0 of 4 running on loki
> Slave process 1 of 4 running on loki
> Slave process 2 of 4 running on loki
> spawn_slave 2: argv[0]: spawn_slave
> Slave process 3 of 4 running on loki
> spawn_slave 0: argv[0]: spawn_slave
> spawn_slave 1: argv[0]: spawn_slave
> spawn_slave 3: argv[0]: spawn_slave
>
>
> loki spawn 121 mpiexec -np 1 --host loki --slot-list 0:0-5,1:0-5 spawn_master
>
> Parent process 0 running on loki
>   I create 4 slave processes
>
> [loki:17326] *** Process received signal ***
> [loki:17326] Signal: Segmentation fault (11)
> [loki:17326] Signal code: Address not mapped (1)
> [loki:17326] Failing at address: 0x8
> [loki:17326] [ 0] /lib64/libpthread.so.0(+0xf870)[0x7f4e469b3870]
> [loki:17326] [ 1] *** An error occurred in MPI_Init
> *** on a NULL communicator
> *** MPI_ERRORS_ARE_FATAL (processes in this communicator will now abort,
> ***    and potentially your MPI job)
> [loki:17324] Local abort before MPI_INIT completed successfully; not able to
> aggregate error messages, and not able to guarantee that all other processes
> were killed!
> /usr/local/openmpi-1.10.3_64_gcc/lib64/libmpi.so.12(ompi_proc_self+0x35)[0x7f4e46c165b0]
> [loki:17326] [ 2] /usr/local/openmpi-1.10.3_64_gcc/lib64/libmpi.so.12(ompi_comm_init+0x68b)[0x7f4e46bf5b08]
> [loki:17326] [ 3] *** An error occurred in MPI_Init
> *** on a NULL communicator
> *** MPI_ERRORS_ARE_FATAL (processes in this communicator will now abort,
> ***    and potentially your MPI job)
> [loki:17325] Local abort before MPI_INIT completed successfully; not able to
> aggregate error messages, and not able to guarantee that all other processes
> were killed!
> /usr/local/openmpi-1.10.3_64_gcc/lib64/libmpi.so.12(ompi_mpi_init+0xa90)[0x7f4e46c1be8a]
> [loki:17326] [ 4] /usr/local/openmpi-1.10.3_64_gcc/lib64/libmpi.so.12(MPI_Init+0x180)[0x7f4e46c5828e]
> [loki:17326] [ 5] spawn_slave[0x40097e]
> [loki:17326] [ 6] /lib64/libc.so.6(__libc_start_main+0xf5)[0x7f4e4661db05]
> [loki:17326] [ 7] spawn_slave[0x400a54]
> [loki:17326] *** End of error message ***
> -------------------------------------------------------
> Child job 2 terminated normally, but 1 process returned
> a non-zero exit code. Per user-direction, the job has been aborted.
> -------------------------------------------------------
> --------------------------------------------------------------------------
> mpiexec detected that one or more processes exited with non-zero status,
> thus causing the job to be terminated. The first process to do so was:
>
>   Process name: [[56340,2],0]
>   Exit code:    1
> --------------------------------------------------------------------------
> loki spawn 122
>
>
> I would be grateful if somebody can fix the problem.
> Thank you very much for any help in advance.
>
>
> Kind regards
>
> Siegmar
> _______________________________________________
> users mailing list
> us...@open-mpi.org
> Subscription: https://www.open-mpi.org/mailman/listinfo.cgi/users
> Link to this post: http://www.open-mpi.org/community/lists/users/2016/05/29281.php