Ralph, I guess you mean "call mpi_comm_spawn('siesta', '< infile', 2, ...)" to execute 'mpirun -n 2 siesta < infile' on the spawnee side. That was my first choice. Well, siesta behaves as if no stdin file was present...

Alex

2014-12-15 17:07 GMT-02:00 Ralph Castain <r...@open-mpi.org>:

> You should be able to just include that in your argv that you pass to the Comm_spawn API.
>
> On Mon, Dec 15, 2014 at 9:27 AM, Alex A. Schmidt <a...@ufsm.br> wrote:
>
>> George,
>>
>> Thanks for the tip. In fact, calling mpi_comm_spawn right away with MPI_COMM_SELF has worked for me just as well -- no subgroups needed at all.
>>
>> I am testing this openmpi app named "siesta" in parallel. The source code is available, so making it "spawn ready" by adding the pair mpi_comm_get_parent + mpi_comm_disconnect into the main code can be done. If it works, maybe siesta's developers can be convinced to add this feature in a future release.
>>
>> However, siesta is launched only by specifying input/output files with i/o redirection, like
>>
>> mpirun -n <some number> siesta < infile > outfile
>>
>> So far, I could not find anything about how to set a stdin file for a spawnee process. Specifying it in an app context file doesn't seem to work. Can it be done? Maybe through an MCA parameter?
>>
>> Alex
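A minimal sketch of the wrapper-script workaround for the stdin problem (untested; 'run_siesta.sh', the file names and the process count are placeholders): the '<' redirection is shell syntax and is not interpreted by the spawned executable itself, but a tiny script that performs the redirection can be spawned instead, assuming the MPI implementation supports spawning a script that execs the real binary:

    ! Hypothetical wrapper "run_siesta.sh", executable and reachable by the
    ! spawned processes:
    !   #!/bin/sh
    !   exec siesta < infile > outfile
    program spawn_siesta_sketch
      use mpi
      implicit none
      integer :: ierr, intercomm
      integer :: errcodes(2)

      call MPI_INIT(ierr)
      ! maxprocs = 2 plays the role of "mpirun -n 2"
      call MPI_COMM_SPAWN('run_siesta.sh', MPI_ARGV_NULL, 2, MPI_INFO_NULL, &
                          0, MPI_COMM_SELF, intercomm, errcodes, ierr)
      ! ... wait for / disconnect from the spawned run here; see the
      !     MPI_Comm_disconnect discussion further down the thread ...
      call MPI_FINALIZE(ierr)
    end program spawn_siesta_sketch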
>> 2014-12-15 2:43 GMT-02:00 George Bosilca <bosi...@icl.utk.edu>:
>>>
>>> Alex,
>>>
>>> The code looks good, and is 100% MPI standard accurate.
>>>
>>> I would change the way you create the subcoms in the parent. You do a lot of useless operations, as you can achieve exactly the same outcome (one communicator per node), either by duplicating MPI_COMM_SELF or doing an MPI_Comm_split with the color equal to your rank.
>>>
>>> George.
>>>
>>> On Sun, Dec 14, 2014 at 2:20 AM, Alex A. Schmidt <a...@ufsm.br> wrote:
>>>
>>>> Hi,
>>>>
>>>> Sorry, guys. I don't think the newbie here can follow any discussion beyond basic mpi...
>>>>
>>>> Anyway, if I add the pair
>>>>
>>>> call MPI_COMM_GET_PARENT(mpi_comm_parent,ierror)
>>>> call MPI_COMM_DISCONNECT(mpi_comm_parent,ierror)
>>>>
>>>> on the spawnee side I get the proper response in the spawning processes.
>>>>
>>>> Please, take a look at the attached toy codes parent.F and child.F I've been playing with. 'mpirun -n 2 parent' seems to work as expected.
>>>>
>>>> Alex
>>>>
>>>> 2014-12-13 23:46 GMT-02:00 Gilles Gouaillardet <gilles.gouaillar...@gmail.com>:
>>>>>
>>>>> Alex,
>>>>>
>>>>> Are you calling MPI_Comm_disconnect in the 3 "master" tasks and with the same remote communicator?
>>>>>
>>>>> I also read the man page again, and MPI_Comm_disconnect does not ensure the remote processes have finished or called MPI_Comm_disconnect, so that might not be the thing you need. George, can you please comment on that?
>>>>>
>>>>> Cheers,
>>>>>
>>>>> Gilles
>>>>>
>>>>> George Bosilca <bosi...@icl.utk.edu> wrote:
>>>>> MPI_Comm_disconnect should be a local operation, there is no reason for it to deadlock. I looked at the code and everything is local with the exception of a call to PMIX.FENCE. Can you attach to your deadlocked processes and confirm that they are stopped in the pmix.fence?
>>>>>
>>>>> George.
>>>>>
>>>>> On Sat, Dec 13, 2014 at 8:47 AM, Alex A. Schmidt <a...@ufsm.br> wrote:
>>>>>
>>>>>> Hi
>>>>>>
>>>>>> Sorry, I was calling mpi_comm_disconnect on the group comm handler, not on the intercomm handler returned from the spawn call as it should be.
>>>>>>
>>>>>> Well, calling the disconnect on the intercomm handler does halt the spawner side, but the wait is never completed since, as George points out, there is no disconnect call being made on the spawnee side... and that brings me back to the beginning of the problem since, being a third party app, that call would never be there. I guess an mpi wrapper to deal with that could be made for the app, but I feel the wrapper itself, in the end, would face the same problem we face right now.
>>>>>>
>>>>>> My application is a genetic algorithm code that searches for optimal configurations (minimum or maximum energy) of clusters of atoms. The workflow bottleneck is the calculation of the cluster energy. For the cases in which an analytical potential is available, the calculation can be made internally and the workload is distributed among slave nodes from a master node. This is also done when an analytical potential is not available and the energy calculation must be done externally by a quantum chemistry code like dftb+, siesta or Gaussian. So far, we have been running these codes in serial mode. Needless to say, we could do a lot better if they could be executed in parallel.
>>>>>>
>>>>>> I am not familiar with DRMAA, but it seems to be the right choice to deal with job schedulers, as it covers the ones I am interested in (PBS/Torque and LoadLeveler).
>>>>>>
>>>>>> Alex
>>>>>>
>>>>>> 2014-12-13 7:49 GMT-02:00 Gilles Gouaillardet <gilles.gouaillar...@gmail.com>:
>>>>>>>
>>>>>>> George is right about the semantics.
>>>>>>>
>>>>>>> However, I am surprised it returns immediately... That should either work or hang, imho.
>>>>>>>
>>>>>>> The second point is no longer MPI related, and is batch manager specific.
>>>>>>>
>>>>>>> You will likely find a submit parameter to make the command block until the job completes. Or you can write your own wrapper. Or you can retrieve the jobid and qstat periodically to get the job state. If an api is available, this is also an option.
>>>>>>>
>>>>>>> Cheers,
>>>>>>>
>>>>>>> Gilles
>>>>>>>
>>>>>>> George Bosilca <bosi...@icl.utk.edu> wrote:
>>>>>>> You have to call MPI_Comm_disconnect on both sides of the intercommunicator. On the spawner processes you should call it on the intercomm, while on the spawnees you should call it on the communicator returned by MPI_Comm_get_parent.
>>>>>>>
>>>>>>> George.
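A minimal sketch of that pattern, along the lines of the parent.F / child.F toy codes mentioned above (program and file names are illustrative; the child stands in for an application that has been made "spawn ready"):

    ! parent.f90 (illustrative): spawns 2 copies of 'child', then disconnects
    program parent_sketch
      use mpi
      implicit none
      integer :: ierr, intercomm, errcodes(2)

      call MPI_INIT(ierr)
      call MPI_COMM_SPAWN('child', MPI_ARGV_NULL, 2, MPI_INFO_NULL, &
                          0, MPI_COMM_SELF, intercomm, errcodes, ierr)
      ! In practice this may not complete until the children have also
      ! disconnected on their side (cf. the discussion above).
      call MPI_COMM_DISCONNECT(intercomm, ierr)
      call MPI_FINALIZE(ierr)
    end program parent_sketch

    ! child.f90 (illustrative): does its work, then disconnects from the parent
    program child_sketch
      use mpi
      implicit none
      integer :: ierr, parentcomm

      call MPI_INIT(ierr)
      call MPI_COMM_GET_PARENT(parentcomm, ierr)
      ! ... the real work of the spawned application goes here ...
      if (parentcomm /= MPI_COMM_NULL) call MPI_COMM_DISCONNECT(parentcomm, ierr)
      call MPI_FINALIZE(ierr)
    end program child_sketch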
>>>>>>>
>>>>>>> On Dec 12, 2014, at 20:43, Alex A. Schmidt <a...@ufsm.br> wrote:
>>>>>>>
>>>>>>> Gilles,
>>>>>>>
>>>>>>> MPI_comm_disconnect seems to work, but not quite. The call to it returns almost immediately, while the spawned processes keep piling up in the background until they are all done...
>>>>>>>
>>>>>>> I think system('env -i qsub...') to launch the third party apps would take the execution of every call back to the scheduler queue. How would I track each one for their completion?
>>>>>>>
>>>>>>> Alex
>>>>>>>
>>>>>>> 2014-12-12 22:35 GMT-02:00 Gilles Gouaillardet <gilles.gouaillar...@gmail.com>:
>>>>>>>>
>>>>>>>> Alex,
>>>>>>>>
>>>>>>>> You need MPI_Comm_disconnect at least. I am not sure if this is 100% correct or working.
>>>>>>>>
>>>>>>>> If you are using third party apps, why don't you do something like system("env -i qsub ...") with the right options to make qsub blocking, or manually wait for the end of the job?
>>>>>>>>
>>>>>>>> That looks like a much cleaner and simpler approach to me.
>>>>>>>>
>>>>>>>> Cheers,
>>>>>>>>
>>>>>>>> Gilles
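As a rough illustration of the "make the submit blocking or poll the job state" idea (PBS/Torque flavoured; the job script name 'job.pbs', the PATH value, the polling interval and the exact behaviour of qstat for finished jobs are all assumptions to check against the local batch system):

    program submit_and_wait_sketch
      implicit none

      ! Submit with a cleaned environment, as suggested above, then poll
      ! qstat until the job is no longer known to the server. Some batch
      ! systems also offer a blocking submit (e.g. Torque's
      ! "qsub -W block=true"), which would avoid the polling loop.
      call system('jobid=`env -i PATH=/usr/bin:/bin qsub job.pbs` ; ' // &
                  'while qstat "$jobid" > /dev/null 2>&1 ; do sleep 30 ; done')
    end program submit_and_wait_sketch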
>>>>>>>>
>>>>>>>> "Alex A. Schmidt" <a...@ufsm.br> wrote:
>>>>>>>>
>>>>>>>> Hello Gilles,
>>>>>>>>
>>>>>>>> Ok, I believe I have a simple toy app running as I think it should: 'n' parent processes running under mpi_comm_world, each one spawning its own 'm' child processes (each child group works together nicely, returning the expected result for an mpi_allreduce call).
>>>>>>>>
>>>>>>>> Now, as I mentioned before, the apps I want to run in the spawned processes are third party mpi apps and I don't think it will be possible to exchange messages with them from my app. So, how do I tell when the spawned processes have finished running? All I have to work with is the intercommunicator returned from the mpi_comm_spawn call...
>>>>>>>>
>>>>>>>> Alex
>>>>>>>>
>>>>>>>> 2014-12-12 2:42 GMT-02:00 Alex A. Schmidt <a...@ufsm.br>:
>>>>>>>>>
>>>>>>>>> Gilles,
>>>>>>>>>
>>>>>>>>> Well, yes, I guess...
>>>>>>>>>
>>>>>>>>> I'll do tests with the real third party apps and let you know. These are huge quantum chemistry codes (dftb+, siesta and Gaussian) which greatly benefit from a parallel environment. My code is just a front end to use those, but since we have a lot of data to process it also benefits from a parallel environment.
>>>>>>>>>
>>>>>>>>> Alex
>>>>>>>>>
>>>>>>>>> 2014-12-12 2:30 GMT-02:00 Gilles Gouaillardet <gilles.gouaillar...@iferc.org>:
>>>>>>>>>>
>>>>>>>>>> Alex,
>>>>>>>>>>
>>>>>>>>>> just to make sure ... this is the behavior you expected, right?
>>>>>>>>>>
>>>>>>>>>> Cheers,
>>>>>>>>>>
>>>>>>>>>> Gilles
>>>>>>>>>>
>>>>>>>>>> On 2014/12/12 13:27, Alex A. Schmidt wrote:
>>>>>>>>>>
>>>>>>>>>> Gilles,
>>>>>>>>>>
>>>>>>>>>> Ok, very nice!
>>>>>>>>>>
>>>>>>>>>> When I execute
>>>>>>>>>>
>>>>>>>>>> do rank=1,3
>>>>>>>>>>   call MPI_Comm_spawn('hello_world',' ',5,MPI_INFO_NULL,rank,MPI_COMM_WORLD,my_intercomm,MPI_ERRCODES_IGNORE,status)
>>>>>>>>>> enddo
>>>>>>>>>>
>>>>>>>>>> I do get 15 instances of the 'hello_world' app running: 5 for each parent rank 1, 2 and 3.
>>>>>>>>>>
>>>>>>>>>> Thanks a lot, Gilles.
>>>>>>>>>>
>>>>>>>>>> Best regards,
>>>>>>>>>>
>>>>>>>>>> Alex
>>>>>>>>>>
>>>>>>>>>> 2014-12-12 1:32 GMT-02:00 Gilles Gouaillardet <gilles.gouaillar...@iferc.org>:
>>>>>>>>>>
>>>>>>>>>> Alex,
>>>>>>>>>>
>>>>>>>>>> just ask MPI_Comm_spawn to start (up to) 5 tasks via the maxprocs parameter:
>>>>>>>>>>
>>>>>>>>>> int MPI_Comm_spawn(char *command, char *argv[], int maxprocs, MPI_Info info,
>>>>>>>>>>                    int root, MPI_Comm comm, MPI_Comm *intercomm,
>>>>>>>>>>                    int array_of_errcodes[])
>>>>>>>>>>
>>>>>>>>>> INPUT PARAMETERS
>>>>>>>>>>    maxprocs - maximum number of processes to start (integer, significant only at root)
>>>>>>>>>>
>>>>>>>>>> Cheers,
>>>>>>>>>>
>>>>>>>>>> Gilles
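In Fortran, the spawn call corresponding to 'mpirun -n 5 hello_world' would then look roughly like this (a sketch only, complementing the examples above; error handling is omitted and 'hello_world' is assumed to be findable by the spawned processes):

    program spawn_five_sketch
      use mpi
      implicit none
      integer :: ierr, intercomm
      integer :: errcodes(5)

      call MPI_INIT(ierr)
      ! maxprocs = 5 takes the place of "-n 5"; command-line arguments,
      ! if any, would go into an argv array rather than onto a shell line.
      call MPI_COMM_SPAWN('hello_world', MPI_ARGV_NULL, 5, MPI_INFO_NULL, &
                          0, MPI_COMM_WORLD, intercomm, errcodes, ierr)
      call MPI_FINALIZE(ierr)
    end program spawn_five_sketch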
>>>>>>>>>>
>>>>>>>>>> On 2014/12/12 12:23, Alex A. Schmidt wrote:
>>>>>>>>>>
>>>>>>>>>> Hello Gilles,
>>>>>>>>>>
>>>>>>>>>> Thanks for your reply. The "env -i PATH=..." stuff seems to work!!!
>>>>>>>>>>
>>>>>>>>>> call system("sh -c 'env -i PATH=/usr/lib64/openmpi/bin:/bin mpirun -n 2 hello_world' ")
>>>>>>>>>>
>>>>>>>>>> did produce the expected result with a simple openmpi "hello_world" code I wrote.
>>>>>>>>>>
>>>>>>>>>> It might be harder though with the real third party app I have in mind. And I realize getting this past a job scheduler with this approach might not work at all...
>>>>>>>>>>
>>>>>>>>>> I have looked at the MPI_Comm_spawn call but I failed to understand how it could help here. For instance, can I use it to launch an mpi app with the option "-n 5"?
>>>>>>>>>>
>>>>>>>>>> Alex
>>>>>>>>>>
>>>>>>>>>> 2014-12-12 0:36 GMT-02:00 Gilles Gouaillardet <gilles.gouaillar...@iferc.org>:
>>>>>>>>>>
>>>>>>>>>> Alex,
>>>>>>>>>>
>>>>>>>>>> can you try something like
>>>>>>>>>>
>>>>>>>>>> call system("sh -c 'env -i /.../mpirun -np 2 /.../app_name'")
>>>>>>>>>>
>>>>>>>>>> -i starts with an empty environment. That being said, you might need to set a few environment variables manually:
>>>>>>>>>>
>>>>>>>>>> env -i PATH=/bin ...
>>>>>>>>>>
>>>>>>>>>> And that being also said, this "trick" could be just a bad idea: you might be using a scheduler, and if you empty the environment, the scheduler will not be aware of the "inside" run.
>>>>>>>>>>
>>>>>>>>>> On top of that, invoking system might fail depending on the interconnect you use.
>>>>>>>>>>
>>>>>>>>>> Bottom line, I believe Ralph's reply is still valid, even if five years have passed: changing your workflow, or using MPI_Comm_spawn, is a much better approach.
>>>>>>>>>>
>>>>>>>>>> Cheers,
>>>>>>>>>>
>>>>>>>>>> Gilles
>>>>>>>>>>
>>>>>>>>>> On 2014/12/12 11:22, Alex A. Schmidt wrote:
>>>>>>>>>>
>>>>>>>>>> Dear OpenMPI users,
>>>>>>>>>>
>>>>>>>>>> Regarding this previous post <http://www.open-mpi.org/community/lists/users/2009/06/9560.php> from 2009, I wonder if the reply from Ralph Castain is still valid. My need is similar but quite simpler: to make a system call from an openmpi fortran application to run a third party openmpi application. I don't need to exchange mpi messages with the application. I just need to read the resulting output file generated by it.
>>>>>>>>>>
>>>>>>>>>> I have tried to do the following system call from my fortran openmpi code:
>>>>>>>>>>
>>>>>>>>>> call system("sh -c 'mpirun -n 2 app_name'")
>>>>>>>>>>
>>>>>>>>>> but I get
>>>>>>>>>>
>>>>>>>>>> **********************************************************
>>>>>>>>>> Open MPI does not support recursive calls of mpirun
>>>>>>>>>> **********************************************************
>>>>>>>>>>
>>>>>>>>>> Is there a way to make this work?
>>>>>>>>>>
>>>>>>>>>> Best regards,
>>>>>>>>>>
>>>>>>>>>> Alex