Hi. Sorry, guys, I don't think the newbie here can follow any discussion beyond basic MPI...
Anyway, if I add the pair

   call MPI_COMM_GET_PARENT(mpi_comm_parent,ierror)
   call MPI_COMM_DISCONNECT(mpi_comm_parent,ierror)

on the spawnee side, I get the proper response in the spawning processes. Please take a look at the attached toy codes parent.F and child.F I've been playing with. 'mpirun -n 2 parent' seems to work as expected.

Alex

2014-12-13 23:46 GMT-02:00 Gilles Gouaillardet <gilles.gouaillar...@gmail.com>:
>
> Alex,
>
> Are you calling MPI_Comm_disconnect in the 3 "master" tasks and with the same remote communicator?
>
> I also read the man page again, and MPI_Comm_disconnect does not ensure the remote processes have finished or called MPI_Comm_disconnect, so that might not be what you need.
> George, can you please comment on that?
>
> Cheers,
>
> Gilles
>
> George Bosilca <bosi...@icl.utk.edu> wrote:
> MPI_Comm_disconnect should be a local operation; there is no reason for it to deadlock. I looked at the code and everything is local with the exception of a call to PMIX.FENCE. Can you attach to your deadlocked processes and confirm that they are stopped in the pmix.fence?
>
> George.
>
> On Sat, Dec 13, 2014 at 8:47 AM, Alex A. Schmidt <a...@ufsm.br> wrote:
>
>> Hi,
>>
>> Sorry, I was calling mpi_comm_disconnect on the group comm handle, not on the intercomm handle returned from the spawn call, as it should be.
>>
>> Well, calling the disconnect on the intercomm handle does halt the spawner side, but the wait is never completed since, as George points out, there is no disconnect call being made on the spawnee side... and that brings me back to the beginning of the problem since, being a third party app, that call would never be there. I guess an MPI wrapper to deal with that could be made for the app, but I feel the wrapper itself would, in the end, face the same problem we face right now.
>>
>> My application is a genetic algorithm code that searches for optimal configurations (minimum or maximum energy) of clusters of atoms. The workflow bottleneck is the calculation of the cluster energy. For the cases where an analytical potential is available, the calculation can be made internally and the workload is distributed among slave nodes from a master node. This is also done when an analytical potential is not available and the energy calculation must be done externally by a quantum chemistry code like dftb+, siesta or Gaussian. So far, we have been running these codes in serial mode. Needless to say, we could do a lot better if they could be executed in parallel.
>>
>> I am not familiar with DRMAA, but it seems to be the right choice to deal with job schedulers, as it covers the ones I am interested in (PBS/Torque and LoadLeveler).
>>
>> Alex
>>
>> 2014-12-13 7:49 GMT-02:00 Gilles Gouaillardet <gilles.gouaillar...@gmail.com>:
>>>
>>> George is right about the semantics.
>>>
>>> However, I am surprised it returns immediately... that should either work or hang, imho.
>>>
>>> The second point is no longer MPI related, and is batch manager specific.
>>>
>>> You will likely find a submit parameter to make the command block until the job completes. Or you can write your own wrapper. Or you can retrieve the jobid and qstat periodically to get the job state. If an API is available, this is also an option.
>>>
>>> Cheers,
>>>
>>> Gilles
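A minimal free-form Fortran sketch (not from the original thread) of the jobid/qstat polling approach Gilles describes above. It assumes a PBS/Torque-style qsub and qstat, the Fortran 2008 intrinsic execute_command_line, and hypothetical names job.sh and jobid.txt:

   program submit_and_wait
     implicit none
     character(len=64) :: jobid
     integer :: rc

     ! qsub prints the job id on stdout; capture it in a scratch file
     call execute_command_line("qsub job.sh > jobid.txt")
     open(10, file="jobid.txt")
     read(10, "(a)") jobid
     close(10)

     ! poll until qstat no longer reports the job (non-zero exit code);
     ! exact behavior for finished jobs varies between schedulers
     do
       call execute_command_line("qstat "//trim(jobid)//" > /dev/null 2>&1", &
                                 exitstat=rc)
       if (rc /= 0) exit
       call execute_command_line("sleep 30")
     end do
   end program submit_and_wait

If the scheduler provides a blocking submit option, as Gilles suggests, it would replace the polling loop entirely.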
>>> George Bosilca <bosi...@icl.utk.edu> wrote:
>>> You have to call MPI_Comm_disconnect on both sides of the intercommunicator. On the spawner processes you should call it on the intercommunicator, while on the spawnees you should call it on the communicator returned by MPI_Comm_get_parent.
>>>
>>> George.
>>>
>>> On Dec 12, 2014, at 20:43, Alex A. Schmidt <a...@ufsm.br> wrote:
>>>
>>> Gilles,
>>>
>>> MPI_comm_disconnect seems to work, but not quite. The call to it returns almost immediately, while the spawned processes keep piling up in the background until they are all done...
>>>
>>> I think system('env -i qsub...') to launch the third party apps would take the execution of every call back to the scheduler queue. How would I track each one for its completion?
>>>
>>> Alex
>>>
>>> 2014-12-12 22:35 GMT-02:00 Gilles Gouaillardet <gilles.gouaillar...@gmail.com>:
>>>>
>>>> Alex,
>>>>
>>>> You need MPI_Comm_disconnect at least. I am not sure if this is 100% correct or working.
>>>>
>>>> If you are using third party apps, why don't you do something like system("env -i qsub ...") with the right options to make qsub blocking, or manually wait for the end of the job?
>>>>
>>>> That looks like a much cleaner and simpler approach to me.
>>>>
>>>> Cheers,
>>>>
>>>> Gilles
>>>>
>>>> "Alex A. Schmidt" <a...@ufsm.br> wrote:
>>>> Hello Gilles,
>>>>
>>>> Ok, I believe I have a simple toy app running as I think it should: 'n' parent processes running under mpi_comm_world, each one spawning its own 'm' child processes (each child group works together nicely, returning the expected result for an mpi_allreduce call).
>>>>
>>>> Now, as I mentioned before, the apps I want to run in the spawned processes are third party mpi apps, and I don't think it will be possible to exchange messages with them from my app. So, how do I tell when the spawned processes have finished running? All I have to work with is the intercommunicator returned from the mpi_comm_spawn call...
>>>>
>>>> Alex
>>>>
>>>> 2014-12-12 2:42 GMT-02:00 Alex A. Schmidt <a...@ufsm.br>:
>>>>>
>>>>> Gilles,
>>>>>
>>>>> Well, yes, I guess....
>>>>>
>>>>> I'll do tests with the real third party apps and let you know. These are huge quantum chemistry codes (dftb+, siesta and Gaussian) which greatly benefit from a parallel environment. My code is just a front end to use those, but since we have a lot of data to process it also benefits from a parallel environment.
>>>>>
>>>>> Alex
>>>>>
>>>>> 2014-12-12 2:30 GMT-02:00 Gilles Gouaillardet <gilles.gouaillar...@iferc.org>:
>>>>>>
>>>>>> Alex,
>>>>>>
>>>>>> just to make sure ... this is the behavior you expected, right?
>>>>>>
>>>>>> Cheers,
>>>>>>
>>>>>> Gilles
>>>>>>
>>>>>> On 2014/12/12 13:27, Alex A. Schmidt wrote:
>>>>>>
>>>>>> Gilles,
>>>>>>
>>>>>> Ok, very nice!
>>>>>>
>>>>>> When I execute
>>>>>>
>>>>>>    do rank=1,3
>>>>>>      call MPI_Comm_spawn('hello_world',' ',5,MPI_INFO_NULL,rank,MPI_COMM_WORLD,my_intercomm,MPI_ERRCODES_IGNORE,status)
>>>>>>    enddo
>>>>>>
>>>>>> I do get 15 instances of the 'hello_world' app running: 5 for each parent rank 1, 2 and 3.
>>>>>>
>>>>>> Thanks a lot, Gilles.
>>>>>>
>>>>>> Best regards,
>>>>>>
>>>>>> Alex
>>>>>> 2014-12-12 1:32 GMT-02:00 Gilles Gouaillardet <gilles.gouaillar...@iferc.org>:
>>>>>>
>>>>>> Alex,
>>>>>>
>>>>>> just ask MPI_Comm_spawn to start (up to) 5 tasks via the maxprocs parameter:
>>>>>>
>>>>>>    int MPI_Comm_spawn(char *command, char *argv[], int maxprocs,
>>>>>>                       MPI_Info info, int root, MPI_Comm comm,
>>>>>>                       MPI_Comm *intercomm, int array_of_errcodes[])
>>>>>>
>>>>>>    INPUT PARAMETERS
>>>>>>       maxprocs - maximum number of processes to start (integer, significant only at root)
>>>>>>
>>>>>> Cheers,
>>>>>>
>>>>>> Gilles
>>>>>>
>>>>>> On 2014/12/12 12:23, Alex A. Schmidt wrote:
>>>>>>
>>>>>> Hello Gilles,
>>>>>>
>>>>>> Thanks for your reply. The "env -i PATH=..." stuff seems to work!!!
>>>>>>
>>>>>>    call system("sh -c 'env -i PATH=/usr/lib64/openmpi/bin:/bin mpirun -n 2 hello_world' ")
>>>>>>
>>>>>> did produce the expected result with a simple Open MPI "hello_world" code I wrote.
>>>>>>
>>>>>> It might be harder, though, with the real third party app I have in mind. And I realize getting past a job scheduler with this approach might not work at all...
>>>>>>
>>>>>> I have looked at the MPI_Comm_spawn call, but I failed to understand how it could help here. For instance, can I use it to launch an mpi app with the option "-n 5"?
>>>>>>
>>>>>> Alex
>>>>>>
>>>>>> 2014-12-12 0:36 GMT-02:00 Gilles Gouaillardet <gilles.gouaillar...@iferc.org>:
>>>>>>
>>>>>> Alex,
>>>>>>
>>>>>> can you try something like
>>>>>>
>>>>>>    call system("sh -c 'env -i /.../mpirun -np 2 /.../app_name'")
>>>>>>
>>>>>> -i starts with an empty environment. That being said, you might need to set a few environment variables manually:
>>>>>>
>>>>>>    env -i PATH=/bin ...
>>>>>>
>>>>>> And that being also said, this "trick" could be just a bad idea: you might be using a scheduler, and if you empty the environment, the scheduler will not be aware of the "inside" run.
>>>>>>
>>>>>> On top of that, invoking system might fail depending on the interconnect you use.
>>>>>>
>>>>>> Bottom line, I believe Ralph's reply is still valid, even if five years have passed: changing your workflow, or using MPI_Comm_spawn, is a much better approach.
>>>>>>
>>>>>> Cheers,
>>>>>>
>>>>>> Gilles
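For completeness, a small free-form Fortran sketch (not from the original thread) of the env -i wrapper call with its exit status checked. It reuses the PATH shown in the reply above and assumes the Fortran 2008 intrinsic execute_command_line, with hello_world standing in for the real application:

   program run_external
     implicit none
     integer :: rc

     ! run the external Open MPI app in a stripped environment and capture
     ! the exit status of the inner mpirun
     call execute_command_line("sh -c 'env -i PATH=/usr/lib64/openmpi/bin:/bin"// &
                               " mpirun -n 2 hello_world'", exitstat=rc)
     if (rc /= 0) print *, 'external run failed with status ', rc
   end program run_external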
>>>>>> On 2014/12/12 11:22, Alex A. Schmidt wrote:
>>>>>>
>>>>>> Dear Open MPI users,
>>>>>>
>>>>>> Regarding this previous post from 2009 <http://www.open-mpi.org/community/lists/users/2009/06/9560.php>, I wonder if the reply from Ralph Castain is still valid. My need is similar but quite a bit simpler: to make a system call from an Open MPI Fortran application to run a third party Open MPI application. I don't need to exchange mpi messages with the application. I just need to read the resulting output file generated by it. I have tried to do the following system call from my Fortran Open MPI code:
>>>>>>
>>>>>>    call system("sh -c 'mpirun -n 2 app_name'")
>>>>>>
>>>>>> but I get
>>>>>>
>>>>>>    **********************************************************
>>>>>>    Open MPI does not support recursive calls of mpirun
>>>>>>    **********************************************************
>>>>>>
>>>>>> Is there a way to make this work?
>>>>>>
>>>>>> Best regards,
>>>>>>
>>>>>> Alex
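For reference alongside the attached Fortran codes, the Fortran (mpif.h) binding of MPI_Comm_spawn, matching the C prototype quoted earlier in the thread, is, per the MPI standard:

   MPI_COMM_SPAWN(COMMAND, ARGV, MAXPROCS, INFO, ROOT, COMM,
                  INTERCOMM, ARRAY_OF_ERRCODES, IERROR)
       CHARACTER*(*) COMMAND, ARGV(*)
       INTEGER INFO, MAXPROCS, ROOT, COMM, INTERCOMM,
               ARRAY_OF_ERRCODES(*), IERROR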
parent.F:

      program parent
      implicit none
      include "mpif.h"
      integer ii,kk
      integer mpi_size,mpi_rank
      integer ierror,group_world
      integer subgroup(0:10)
      integer subgroup_comm(0:10)
      integer subgroup_intercomm(0:10)

      call MPI_INIT(ierror)
      call MPI_COMM_SIZE(mpi_comm_world,mpi_size,ierror)
      call MPI_COMM_RANK(mpi_comm_world,mpi_rank,ierror)

c     get the world group handle
      call mpi_comm_group(mpi_comm_world,group_world,ierror)

c     create a single-member subgroup (one rank each) for each rank
      do ii=0,mpi_size-1
         call MPI_GROUP_INCL(group_world,1,ii,subgroup(ii),ierror)
      enddo

c     create the communicator for each subgroup
      do ii=0,mpi_size-1
         call MPI_COMM_CREATE(mpi_comm_world,subgroup(ii),kk,ierror)
         if (ii.eq.mpi_rank) subgroup_comm(ii) = kk
      enddo

c     now do some spawning: each rank spawns 5 'child' processes 4 times;
c     root is always 0 since each subgroup has a single member
      do ii=1,4
         call MPI_Comm_spawn('child',' ',5,MPI_INFO_NULL,0,
     +        subgroup_comm(mpi_rank),subgroup_intercomm(mpi_rank),
     +        MPI_ERRCODES_IGNORE,ierror)
         call MPI_Comm_disconnect(subgroup_intercomm(mpi_rank),ierror)
      enddo

      call MPI_FINALIZE(ierror)
      end
child.F:

      program child
      implicit none
      include "mpif.h"
      integer ierror,mpi_size,mpi_rank
      integer mpi_init_error,mpi_comm_parent

      call MPI_INIT(mpi_init_error)
      call MPI_COMM_SIZE(MPI_COMM_WORLD,mpi_size,ierror)
      call MPI_COMM_RANK(MPI_COMM_WORLD,mpi_rank,ierror)
      call MPI_COMM_GET_PARENT(mpi_comm_parent,ierror)

      print *,'hello world from ',mpi_rank,mpi_size

      call MPI_COMM_DISCONNECT(mpi_comm_parent,ierror)
      call MPI_FINALIZE(ierror)
      end