[OMPI users] Spawn problem
Hi,

Sorry to bring this up again, but I hope to use spawn in Open MPI someday :-D

The following spawn call works fine:

  MPI_Comm_spawn ("./spawn1", MPI_ARGV_NULL, 2, MPI_INFO_NULL, 0,
                  MPI_COMM_SELF, &intercomm, MPI_ERRCODES_IGNORE);

but if I put the call inside a for loop I get a problem:

  for (i = 0; i < 2; i++)
  {
    MPI_Comm_spawn ("./spawn1", MPI_ARGV_NULL, 1, MPI_INFO_NULL, 0,
                    MPI_COMM_SELF, &intercomm[i], MPI_ERRCODES_IGNORE);
  }

and the error is:

spawning ...
child!
child!
[localhost:03892] *** Process received signal ***
[localhost:03892] Signal: Segmentation fault (11)
[localhost:03892] Signal code: Address not mapped (1)
[localhost:03892] Failing at address: 0xc8
[localhost:03892] [ 0] /lib/libpthread.so.0 [0x2ac71ca8bed0]
[localhost:03892] [ 1] /usr/local/mpi/ompi-svn/lib/libmpi.so.0(ompi_dpm_base_dyn_finalize+0xa3) [0x2ac71ba7448c]
[localhost:03892] [ 2] /usr/local/mpi/ompi-svn/lib/libmpi.so.0 [0x2ac71b9decdf]
[localhost:03892] [ 3] /usr/local/mpi/ompi-svn/lib/libmpi.so.0 [0x2ac71ba04765]
[localhost:03892] [ 4] /usr/local/mpi/ompi-svn/lib/libmpi.so.0(PMPI_Finalize+0x71) [0x2ac71ba365c9]
[localhost:03892] [ 5] ./spawn1(main+0xaa) [0x400ac2]
[localhost:03892] [ 6] /lib/libc.so.6(__libc_start_main+0xf4) [0x2ac71ccb7b74]
[localhost:03892] [ 7] ./spawn1 [0x400989]
[localhost:03892] *** End of error message ***
--
mpirun noticed that process rank 0 with PID 3892 on node localhost exited on signal 11 (Segmentation fault).
--

The attachments contain the ompi_info output, config.log, and the program.

Thanks for taking a look,
Joao.

Attachments:
  config.log.gz (GNU Zip compressed data)
  ompi_info.txt.gz (GNU Zip compressed data)
  spawn1.c.gz (GNU Zip compressed data)
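[Editor's note: the attached spawn1.c is not reproduced in the archive. The following is a minimal sketch of what the looped-spawn test presumably looks like, reconstructed only from the snippets and output above; everything beyond those snippets is an assumption.]

    /* Hypothetical reconstruction of spawn1.c (the real attachment is not
     * shown in this thread); matches the "spawning ..." / "child!" output. */
    #include <mpi.h>
    #include <stdio.h>

    int main (int argc, char **argv)
    {
        MPI_Comm parent, intercomm[2];
        int i;

        MPI_Init (&argc, &argv);
        MPI_Comm_get_parent (&parent);

        if (parent == MPI_COMM_NULL) {
            /* Parent: spawn one child per loop iteration. */
            printf ("spawning ...\n");
            for (i = 0; i < 2; i++)
                MPI_Comm_spawn ("./spawn1", MPI_ARGV_NULL, 1, MPI_INFO_NULL, 0,
                                MPI_COMM_SELF, &intercomm[i], MPI_ERRCODES_IGNORE);
        } else {
            /* Child: report and fall through to MPI_Finalize. */
            printf ("child!\n");
        }

        MPI_Finalize ();   /* the reported segfault occurs here, in the parent */
        return 0;
    }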
Re: [OMPI users] Spawn problem
My C++ is a little rusty. Is that returned intercommunicator going where you think it is? If you unroll the loop, does the same badness happen?

On Mon, 2008-03-31 at 02:41 -0300, Joao Vicente Lima wrote:
> but if I put the call inside a for loop I get a problem:
>
>   for (i = 0; i < 2; i++)
>   {
>     MPI_Comm_spawn ("./spawn1", MPI_ARGV_NULL, 1, MPI_INFO_NULL, 0,
>                     MPI_COMM_SELF, &intercomm[i], MPI_ERRCODES_IGNORE);
>   }
> [...]
Re: [OMPI users] Spawn problem
On 30/03/2008, Joao Vicente Lima wrote:
> Hi,
> Sorry to bring this up again, but I hope to use spawn in Open MPI someday :-D

I believe it's crashing in MPI_Finalize because you have not closed all
communication paths between the parent and the child processes. For the
parent process, try calling MPI_Comm_free or MPI_Comm_disconnect on each
intercomm in your intercomm array before calling finalize. On the child,
call free or disconnect on the parent intercomm before calling finalize.

Out of curiosity, why a loop of spawns? Why not increase the value of the
maxprocs argument? Or, if you need to spawn different executables or use
different arguments for each instance, why not MPI_Comm_spawn_multiple?

mch
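[Editor's note: a minimal sketch of the fix Matt describes, using the loop and intercomm array from the original post; the MPI_Comm_disconnect calls on both sides are the point, the rest is scaffolding.]

    /* Sketch of the suggested fix: every intercommunicator created by
     * MPI_Comm_spawn is disconnected on both sides before MPI_Finalize. */
    #include <mpi.h>
    #include <stdio.h>

    int main (int argc, char **argv)
    {
        MPI_Comm parent, intercomm[2];
        int i;

        MPI_Init (&argc, &argv);
        MPI_Comm_get_parent (&parent);

        if (parent == MPI_COMM_NULL) {
            /* Parent: spawn, then release each intercomm before finalizing. */
            for (i = 0; i < 2; i++)
                MPI_Comm_spawn ("./spawn1", MPI_ARGV_NULL, 1, MPI_INFO_NULL, 0,
                                MPI_COMM_SELF, &intercomm[i], MPI_ERRCODES_IGNORE);
            for (i = 0; i < 2; i++)
                MPI_Comm_disconnect (&intercomm[i]);   /* MPI_Comm_free also works */
        } else {
            /* Child: release the parent intercomm before finalizing. */
            printf ("child!\n");
            MPI_Comm_disconnect (&parent);
        }

        MPI_Finalize ();
        return 0;
    }

Either call is acceptable here; MPI_Comm_disconnect additionally waits for any pending communication on the communicator to complete before severing the connection, which is why it is the usual choice for dynamically created intercommunicators.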
Re: [OMPI users] RPM build errors when creating multiple rpms
On Mar 26, 2008, at 10:05 AM, Ashley Pittman wrote:

>> The community Open MPI project distributes SRPMs which, when built, do not
>> install into /opt by default -- you have to request it specifically.
>
> Out of interest, how does Open MPI handle the mpir_dll_name symbol in the
> library? It's supposed to contain the location of the debugger library, and
> therefore does not play well with relocating RPMs or with libraries
> installed somewhere other than where they were built.

Today, it does not -- the location has to be compile-time initialized. So if
you move the library+plugin somewhere else, the Etnus scheme to find the DLL
currently cannot handle it.

We have proposed a new scheme to Etnus and Allinea that allows a bit more
flexibility to find the DLLs at run-time; both have agreed to the idea in
principle. We will include this support in Open MPI v1.3; I don't know
when/if the debuggers will support it. I believe that the ball is currently
in my court; Etnus asked me some questions to which I have not yet
replied... doh...

--
Jeff Squyres
Cisco Systems
Re: [OMPI users] Spawn problem
Really, MPI_Finalize was crashing, and calling MPI_Comm_{free,disconnect} works!
I don't know whether the free/disconnect must appear before MPI_Finalize in
this case (spawned processes). Any suggestions?

I use loops of spawns:
- first, for testing :)
- and second, because certain MPI applications don't know in advance the
  number of children needed to complete their work.

The spawn support works great ... I will run some more tests.

thanks,
Joao

On Mon, Mar 31, 2008 at 3:03 AM, Matt Hughes wrote:
> I believe it's crashing in MPI_Finalize because you have not closed
> all communication paths between the parent and the child processes.
> For the parent process, try calling MPI_Comm_free or
> MPI_Comm_disconnect on each intercomm in your intercomm array before
> calling finalize. On the child, call free or disconnect on the parent
> intercomm before calling finalize.
> [...]
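[Editor's note: on Matt's MPI_Comm_spawn_multiple suggestion, which only helps when the set of commands is known up front, exactly what Joao's second point rules out, here is a minimal sketch for completeness. Command names and counts are illustrative, not taken from the thread.]

    /* Sketch of MPI_Comm_spawn_multiple: several commands launched into a
     * single intercommunicator in one call. Names and counts are illustrative. */
    #include <mpi.h>

    int main (int argc, char **argv)
    {
        MPI_Comm intercomm;
        char    *cmds[2]   = { "./spawn1", "./spawn1" };
        int      counts[2] = { 1, 1 };
        MPI_Info infos[2]  = { MPI_INFO_NULL, MPI_INFO_NULL };

        MPI_Init (&argc, &argv);
        MPI_Comm_spawn_multiple (2, cmds, MPI_ARGVS_NULL, counts, infos, 0,
                                 MPI_COMM_SELF, &intercomm, MPI_ERRCODES_IGNORE);

        /* As discussed above, disconnect before finalizing. */
        MPI_Comm_disconnect (&intercomm);
        MPI_Finalize ();
        return 0;
    }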
Re: [OMPI users] Spawn problem
Hi again,

When I call MPI_Init_thread in the same program, the error is:

spawning ...
opal_mutex_lock(): Resource deadlock avoided
[localhost:07566] *** Process received signal ***
[localhost:07566] Signal: Aborted (6)
[localhost:07566] Signal code: (-6)
[localhost:07566] [ 0] /lib/libpthread.so.0 [0x2abe5630ded0]
[localhost:07566] [ 1] /lib/libc.so.6(gsignal+0x35) [0x2abe5654c3c5]
[localhost:07566] [ 2] /lib/libc.so.6(abort+0x10e) [0x2abe5654d73e]
[localhost:07566] [ 3] /usr/local/mpi/ompi-svn/lib/libmpi.so.0 [0x2abe5528063b]
[localhost:07566] [ 4] /usr/local/mpi/ompi-svn/lib/libmpi.so.0 [0x2abe55280559]
[localhost:07566] [ 5] /usr/local/mpi/ompi-svn/lib/libmpi.so.0 [0x2abe552805e8]
[localhost:07566] [ 6] /usr/local/mpi/ompi-svn/lib/libmpi.so.0 [0x2abe55280fff]
[localhost:07566] [ 7] /usr/local/mpi/ompi-svn/lib/libmpi.so.0 [0x2abe55280f3d]
[localhost:07566] [ 8] /usr/local/mpi/ompi-svn/lib/libmpi.so.0 [0x2abe55281f59]
[localhost:07566] [ 9] /usr/local/mpi/ompi-svn/lib/libmpi.so.0(ompi_proc_unpack+0x204) [0x2abe552823cd]
[localhost:07566] [10] /usr/local/mpi/ompi-svn/lib/openmpi/mca_dpm_orte.so [0x2abe58efb5f7]
[localhost:07566] [11] /usr/local/mpi/ompi-svn/lib/libmpi.so.0(MPI_Comm_spawn+0x465) [0x2abe552b55cd]
[localhost:07566] [12] ./spawn1(main+0x9d) [0x400b05]
[localhost:07566] [13] /lib/libc.so.6(__libc_start_main+0xf4) [0x2abe56539b74]
[localhost:07566] [14] ./spawn1 [0x4009d9]
[localhost:07566] *** End of error message ***
opal_mutex_lock(): Resource deadlock avoided
[localhost:07567] *** Process received signal ***
[localhost:07567] Signal: Aborted (6)
[localhost:07567] Signal code: (-6)
[localhost:07567] [ 0] /lib/libpthread.so.0 [0x2b48610f9ed0]
[localhost:07567] [ 1] /lib/libc.so.6(gsignal+0x35) [0x2b48613383c5]
[localhost:07567] [ 2] /lib/libc.so.6(abort+0x10e) [0x2b486133973e]
[localhost:07567] [ 3] /usr/local/mpi/ompi-svn/lib/libmpi.so.0 [0x2b486006c63b]
[localhost:07567] [ 4] /usr/local/mpi/ompi-svn/lib/libmpi.so.0 [0x2b486006c559]
[localhost:07567] [ 5] /usr/local/mpi/ompi-svn/lib/libmpi.so.0 [0x2b486006c5e8]
[localhost:07567] [ 6] /usr/local/mpi/ompi-svn/lib/libmpi.so.0 [0x2b486006cfff]
[localhost:07567] [ 7] /usr/local/mpi/ompi-svn/lib/libmpi.so.0 [0x2b486006cf3d]
[localhost:07567] [ 8] /usr/local/mpi/ompi-svn/lib/libmpi.so.0 [0x2b486006df59]
[localhost:07567] [ 9] /usr/local/mpi/ompi-svn/lib/libmpi.so.0(ompi_proc_unpack+0x204) [0x2b486006e3cd]
[localhost:07567] [10] /usr/local/mpi/ompi-svn/lib/openmpi/mca_dpm_orte.so [0x2b4863ce75f7]
[localhost:07567] [11] /usr/local/mpi/ompi-svn/lib/openmpi/mca_dpm_orte.so [0x2b4863ce9c2b]
[localhost:07567] [12] /usr/local/mpi/ompi-svn/lib/libmpi.so.0 [0x2b48600720d7]
[localhost:07567] [13] /usr/local/mpi/ompi-svn/lib/libmpi.so.0(PMPI_Init_thread+0x166) [0x2b48600ae4f2]
[localhost:07567] [14] ./spawn1(main+0x2c) [0x400a94]
[localhost:07567] [15] /lib/libc.so.6(__libc_start_main+0xf4) [0x2b4861325b74]
[localhost:07567] [16] ./spawn1 [0x4009d9]
[localhost:07567] *** End of error message ***
--
mpirun noticed that process rank 0 with PID 7566 on node localhost exited on signal 6 (Aborted).
--

Thanks for taking a look,
Joao.

On Mon, Mar 31, 2008 at 11:49 AM, Joao Vicente Lima wrote:
> Really, MPI_Finalize was crashing, and calling MPI_Comm_{free,disconnect} works!
> I don't know whether the free/disconnect must appear before MPI_Finalize in
> this case (spawned processes). Any suggestions?
> [...]
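[Editor's note: a compact sketch of the thread-initialized variant that appears to trigger this abort. The requested thread level and the program structure are assumptions, since the actual test program is only in the attachment. The backtraces show the parent aborting inside MPI_Comm_spawn and the child inside PMPI_Init_thread, so unlike the first report this failure happens before MPI_Finalize is ever reached.]

    /* Hypothetical variant of the test: only the initialization call changes.
     * MPI_THREAD_MULTIPLE is an assumed request level, not taken from the post. */
    #include <mpi.h>
    #include <stdio.h>

    int main (int argc, char **argv)
    {
        int provided, i;
        MPI_Comm parent, intercomm[2];

        MPI_Init_thread (&argc, &argv, MPI_THREAD_MULTIPLE, &provided);
        MPI_Comm_get_parent (&parent);

        if (parent == MPI_COMM_NULL) {
            printf ("spawning ...\n");
            for (i = 0; i < 2; i++)
                MPI_Comm_spawn ("./spawn1", MPI_ARGV_NULL, 1, MPI_INFO_NULL, 0,
                                MPI_COMM_SELF, &intercomm[i], MPI_ERRCODES_IGNORE);
            for (i = 0; i < 2; i++)
                MPI_Comm_disconnect (&intercomm[i]);
        } else {
            printf ("child!\n");
            MPI_Comm_disconnect (&parent);
        }

        MPI_Finalize ();
        return 0;
    }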