Edgar, here is the F77 version of the source code.

Thanks
Sergio

-----Original Message-----
From: Edgar Gabriel [mailto:gabr...@cs.uh.edu]
Sent: Tuesday, March 07, 2006 12:09 PM
To: Open MPI Users
Subject: Re: [OMPI users] Spawn and Disconnect

I know that there was a bug in the F90 interface of spawn-multiple,
however (which is fixed by now as far as I can tell). Could you send me
the f77 example which you have? The concatenation problem looks strange,
I would like to have a look at it...

Thanks
Edgar

Brignone, Sergio wrote:
> Thanks Edgar, Ralph and Jean.
>
> It seems to me that the problem I am having is related to the operating
> system or MPI configuration or compiler or all of them (I am using
> Solaris).
>
> For example, the F90 as well as the C++ interfaces could not be compiled
> (I had to configure MPI without them).
>
> I converted Jean's example to F77 and tested. It didn't work (of
> course, you can always claim that I didn't convert it right ...); in
> fact it seems I got errors in the Fortran to C conversion of strings
> (the program fils1 exists, but notice the error: it concatenates all
> strings. This looks to me like the F to C conversion is not correct).
> So I am assuming that the problems are related to my particular
> environment.
>
> I will debug and see what the problem is.
>
> Thanks for your help.
>
> Sergio Brignone
>
>
> bash-2.03$ perem
> PR : rank = 0 size = 1
> PR : I am running on PE 0
> PR : I am before the spawning of fils1 on PE 1
> --------------------------------------------------------------------------
> Could not execute the executable "./fils1 ./fils2 ./fils3 ./fils4 ": No
> such file or directory
>
> This could mean that your PATH or executable name is wrong, or that you
> do not have the necessary permissions. Please ensure that the executable
> is able to be found and executed.
> --------------------------------------------------------------------------
>
>
> -----Original Message-----
> From: Jean Latour [mailto:lat...@fujitsu.fr]
> Sent: Friday, March 03, 2006 1:50 AM
> To: r...@lanl.gov; Open MPI Users
> Subject: Re: [OMPI users] Spawn and Disconnect
>
> Just to add an example that may help this "disconnect" discussion:
> Attached is the code of a test that does the following (and it works
> perfectly with OpenMPI 1.0.1)
>
> 1) master spawns slave1
> 2) master spawns slave2
> 3) exchange messages between master and slaves over intercommunicator
> 4) slave1 disconnects from master and finalizes
> 5) slave2 disconnects from master and finalizes
>    (the processors used by slave1 and slave2 can now be re-used by new
>    spawned processes)
> 6) master spawns slave3, and then slave4
> 7) slave3 and slave4 have NO direct communicator, but they can create
>    one through the Open-Port mechanism and the MPI_Comm_connect /
>    MPI_Comm_accept functions.
>    The port number is relayed through the master.
> 8) slave3 and slave4 create this direct communicator and do some
>    pingpong over it
> 9) slave3 and slave4 disconnect from each other on this direct
>    communicator
> 10) slave3 and slave4 disconnect from master and finalize
> 11) master finalizes
>
> Hope it helps
> Best regards,
> Jean Latour
>
> Ralph Castain wrote:
>
>>We expect to have much better support for the entire comm_spawn
>>process in the next incarnation of the RTE. I don't expect that to be
>>included in a release, however, until 1.1 (Jeff may be able to give
>>you an estimate for when that will happen).
>>
>>Jeff et al may be able to give you access to an early non-release
>>version sooner, if better comm_spawn support is a critical issue and
>>you don't mind being patient with the inevitable bugs in such versions.
>>
>>Ralph
>>
>>
>>Edgar Gabriel wrote:
>>
>>>Open MPI currently does not fully support a proper disconnection of
>>>parent and child processes. Thus, if a child dies/aborts, the parents
>>>will abort as well, despite calling MPI_Comm_disconnect. (The new RTE
>>>will have better support for these operations, Ralph/Jeff can probably
>>>give a better estimate when this will be available.)
>>>
>>>However, what should not happen is that if the child calls MPI_Finalize
>>>(so not a violent death but a proper shutdown), the parent goes down at
>>>the same time. Let me check that as well...
>>>
>>>Brignone, Sergio wrote:
>>>
>>>>Hi everybody,
>>>>
>>>>I am trying to run a master/slave set.
>>>>
>>>>Because of the nature of the problem I need to start and stop (kill)
>>>>some slaves.
>>>>
>>>>The problem is that as soon as one of the slaves dies, the master dies
>>>>also.
>>>>
>>>>This is what I am doing:
>>>>
>>>>MASTER:
>>>>
>>>>MPI_Init(...)
>>>>
>>>>MPI_Comm_spawn(slave1,...,nslave1,...,intercomm1);
>>>>MPI_Barrier(intercomm1);
>>>>MPI_Comm_disconnect(&intercomm1);
>>>>MPI_Comm_spawn(slave2,...,nslave2,...,intercomm2);
>>>>MPI_Barrier(intercomm2);
>>>>MPI_Comm_disconnect(&intercomm2);
>>>>MPI_Finalize();
>>>>
>>>>SLAVE:
>>>>
>>>>MPI_Init(...)
>>>>MPI_Comm_get_parent(&intercomm);
>>>>(does something)
>>>>MPI_Barrier(intercomm);
>>>>MPI_Comm_disconnect(&intercomm);
>>>>MPI_Finalize();
>>>>
>>>>The issue is that as soon as the first set of slaves calls
>>>>MPI_Finalize, the master dies also (it dies right after
>>>>MPI_Comm_disconnect(&intercomm1)).
>>>>
>>>>What am I doing wrong?
>>>>
>>>>Thanks
>>>>Sergio
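
A note on the concatenated executable name reported in the thread: the
string "./fils1 ./fils2 ./fils3 ./fils4 " is what a Fortran array of four
blank-padded CHARACTER*8 elements would look like if it were read as one
string instead of four. The C sketch below shows the per-element
conversion (trim trailing blanks, add a terminating NUL). The element
width of 8 and the function name fortran_to_c are assumptions made for
illustration only; this is not Open MPI's actual conversion code and not
a diagnosis of the bug.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper (not an Open MPI function).
 * Copies one blank-padded Fortran string of fixed length flen into out
 * as a NUL-terminated C string, dropping the trailing blanks.
 * Returns the trimmed length, or -1 if out is too small. */
int fortran_to_c(const char *fstr, size_t flen, char *out, size_t out_size)
{
    /* Fortran pads fixed-length strings with blanks on the right. */
    while (flen > 0 && fstr[flen - 1] == ' ')
        flen--;

    /* Need room for the characters plus the terminating NUL. */
    if (flen + 1 > out_size)
        return -1;

    memcpy(out, fstr, flen);
    out[flen] = '\0';
    return (int)flen;
}
```

Under these assumptions, calling fortran_to_c(buf + i * 8, 8, name,
sizeof name) for i = 0..3 on the 32-character buffer would give the four
separate names "./fils1" to "./fils4", whereas passing the whole buffer
with length 32 would give the single concatenated name seen in the error
message (minus the trailing blank).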
spawn_issues.tar.gz