You should be able to just include that in the argv you pass to the MPI_Comm_spawn API.
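For reference, a minimal Fortran sketch of passing arguments through MPI_COMM_SPAWN's argv parameter (the program name 'siesta', the file names and maxprocs value are only placeholders). In the Fortran binding the argument list is terminated by a blank string. Note that the entries reach the spawned processes as plain command-line arguments with no shell in between, so whether tokens like '<' and 'infile' have any effect depends entirely on the spawned application:

    program spawn_with_args
       use mpi
       implicit none
       integer :: ierr, intercomm
       character(len=32) :: args(5)

       call MPI_INIT(ierr)

       ! Hypothetical argument list: entries are delivered verbatim to the
       ! spawned program (no shell expands '<' or '>'); the trailing blank
       ! string terminates the list, as required by the Fortran binding.
       args(1) = '<'
       args(2) = 'infile'
       args(3) = '>'
       args(4) = 'outfile'
       args(5) = ' '

       call MPI_COMM_SPAWN('siesta', args, 5, MPI_INFO_NULL, 0, &
                           MPI_COMM_SELF, intercomm, MPI_ERRCODES_IGNORE, ierr)

       call MPI_FINALIZE(ierr)
    end program spawn_with_args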
On Mon, Dec 15, 2014 at 9:27 AM, Alex A. Schmidt <a...@ufsm.br> wrote:

George,

Thanks for the tip. In fact, calling mpi_comm_spawn right away with MPI_COMM_SELF has worked for me just as well -- no subgroups needed at all.

I am testing this openmpi app named "siesta" in parallel. The source code is available, so making it "spawn ready" by adding the pair mpi_comm_get_parent + mpi_comm_disconnect into the main code can be done. If it works, maybe siesta's developers can be convinced to add this feature in a future release.

However, siesta is launched only by specifying input/output files with i/o redirection, like

    mpirun -n <some number> siesta < infile > outfile

So far, I could not find anything about how to set an stdin file for a spawnee process. Specifying it in an app context file doesn't seem to work. Can it be done? Maybe through an MCA parameter?

Alex

2014-12-15 2:43 GMT-02:00 George Bosilca <bosi...@icl.utk.edu>:

Alex,

The code looks good, and is 100% MPI standard accurate.

I would change the way you create the subcomms in the parent. You do a lot of useless operations, as you can achieve exactly the same outcome (one communicator per node) either by duplicating MPI_COMM_SELF or by doing an MPI_Comm_split with the color equal to your rank.

George.

On Sun, Dec 14, 2014 at 2:20 AM, Alex A. Schmidt <a...@ufsm.br> wrote:

Hi,

Sorry, guys. I don't think the newbie here can follow any discussion beyond basic mpi...

Anyway, if I add the pair

    call MPI_COMM_GET_PARENT(mpi_comm_parent,ierror)
    call MPI_COMM_DISCONNECT(mpi_comm_parent,ierror)

on the spawnee side, I get the proper response in the spawning processes.

Please take a look at the attached toy codes parent.F and child.F I've been playing with. 'mpirun -n 2 parent' seems to work as expected.

Alex

2014-12-13 23:46 GMT-02:00 Gilles Gouaillardet <gilles.gouaillar...@gmail.com>:

Alex,

Are you calling MPI_Comm_disconnect in the 3 "master" tasks and with the same remote communicator?

I also read the man page again, and MPI_Comm_disconnect does not ensure that the remote processes have finished or called MPI_Comm_disconnect, so that might not be the thing you need. George, can you please comment on that?

Cheers,

Gilles

George Bosilca <bosi...@icl.utk.edu> wrote:

MPI_Comm_disconnect should be a local operation; there is no reason for it to deadlock. I looked at the code and everything is local with the exception of a call to PMIX.FENCE. Can you attach to your deadlocked processes and confirm that they are stopped in the pmix.fence?

George.
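To make the disconnect pattern under discussion concrete, here is a minimal sketch (not the attached parent.F/child.F, just an illustration; 'child' stands for whatever executable is spawned): the spawner disconnects the intercommunicator returned by MPI_COMM_SPAWN, and the spawned program disconnects the communicator returned by MPI_COMM_GET_PARENT. As noted above, MPI_Comm_disconnect by itself does not guarantee that the remote processes have finished.

    ! --- parent side (e.g. parent.f90) ---
    program parent_side
       use mpi
       implicit none
       integer :: ierr, intercomm
       call MPI_INIT(ierr)
       ! each parent rank spawns its own children off MPI_COMM_SELF
       call MPI_COMM_SPAWN('child', MPI_ARGV_NULL, 2, MPI_INFO_NULL, 0, &
                           MPI_COMM_SELF, intercomm, MPI_ERRCODES_IGNORE, ierr)
       ! ... do other work, or talk to the children over intercomm ...
       call MPI_COMM_DISCONNECT(intercomm, ierr)   ! parent side of the pair
       call MPI_FINALIZE(ierr)
    end program parent_side

    ! --- child side (e.g. child.f90, built as the 'child' executable) ---
    program child_side
       use mpi
       implicit none
       integer :: ierr, parent_comm
       call MPI_INIT(ierr)
       call MPI_COMM_GET_PARENT(parent_comm, ierr)
       ! ... the child's real work goes here ...
       if (parent_comm /= MPI_COMM_NULL) then
          call MPI_COMM_DISCONNECT(parent_comm, ierr)  ! child side of the pair
       end if
       call MPI_FINALIZE(ierr)
    end program child_side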
On Sat, Dec 13, 2014 at 8:47 AM, Alex A. Schmidt <a...@ufsm.br> wrote:

Hi,

Sorry, I was calling mpi_comm_disconnect on the group comm handle, not on the intercomm handle returned from the spawn call as it should be.

Well, calling the disconnect on the intercomm handle does halt the spawner side, but the wait is never completed since, as George points out, there is no disconnect call being made on the spawnee side... and that brings me back to the beginning of the problem: being a third party app, that call would never be there. I guess an mpi wrapper to deal with that could be made for the app, but I feel the wrapper itself, in the end, would face the same problem we face right now.

My application is a genetic algorithm code that searches for optimal configurations (minimum or maximum energy) of clusters of atoms. The workflow bottleneck is the calculation of the cluster energy. For the cases in which an analytical potential is available, the calculation can be made internally and the workload is distributed among slave nodes from a master node. This is also done when an analytical potential is not available and the energy calculation must be done externally by a quantum chemistry code like dftb+, siesta or Gaussian. So far, we have been running these codes in serial mode. Needless to say, we could do a lot better if they could be executed in parallel.

I am not familiar with DRMAA, but it seems to be the right choice to deal with job schedulers, as it covers the ones I am interested in (PBS/Torque and LoadLeveler).

Alex

2014-12-13 7:49 GMT-02:00 Gilles Gouaillardet <gilles.gouaillar...@gmail.com>:

George is right about the semantics.

However, I am surprised it returns immediately... that should either work or hang, imho.

The second point is no longer MPI related, and is batch manager specific. You will likely find a submit parameter to make the command block until the job completes. Or you can write your own wrapper. Or you can retrieve the jobid and qstat periodically to get the job state. If an API is available, that is also an option.

Cheers,

Gilles

George Bosilca <bosi...@icl.utk.edu> wrote:

You have to call MPI_Comm_disconnect on both sides of the intercommunicator. On the spawner processes you should call it on the intercomm, while on the spawnees you should call it on the communicator returned by MPI_Comm_get_parent.

George.

On Dec 12, 2014, at 20:43, Alex A. Schmidt <a...@ufsm.br> wrote:

Gilles,

MPI_comm_disconnect seems to work, but not quite: the call to it returns almost immediately, while the spawned processes keep piling up in the background until they are all done...

I think system('env -i qsub...') to launch the third party apps would take the execution of every call back to the scheduler queue. How would I track each one for its completion?

Alex

2014-12-12 22:35 GMT-02:00 Gilles Gouaillardet <gilles.gouaillar...@gmail.com>:

Alex,

You need MPI_Comm_disconnect at least. I am not sure if this is 100% correct nor working.

If you are using third party apps, why don't you do something like system("env -i qsub ...") with the right options to make qsub blocking, or manually wait for the end of the job? That looks like a much cleaner and simpler approach to me.

Cheers,

Gilles
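One way to follow that suggestion from the parent code is sketched below. The script name 'run_siesta.pbs' and the exact qsub/qstat behavior are assumptions to be adapted to the local batch system (some qsub variants also offer a blocking submit option); the standard execute_command_line intrinsic is used instead of the non-standard system() so the exit status can be checked. Note also the caveat later in the thread about invoking external commands from an MPI process on some interconnects.

    subroutine run_and_wait()
       implicit none
       integer :: stat
       ! Hypothetical submit script; the job id printed by qsub is saved to a file.
       ! 'env -i' follows the suggestion above and may need PATH etc. re-added.
       call execute_command_line("env -i qsub run_siesta.pbs > jobid.txt", exitstat=stat)
       ! Poll the scheduler: on many PBS/Torque installations qstat returns a
       ! nonzero status once the job id is no longer known to the queue.
       do
          call execute_command_line("qstat `cat jobid.txt` > /dev/null 2>&1", exitstat=stat)
          if (stat /= 0) exit          ! job has left the queue
          call execute_command_line("sleep 30")
       end do
    end subroutine run_and_wait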
"Alex A. Schmidt" <a...@ufsm.br> wrote:

Hello Gilles,

Ok, I believe I have a simple toy app running as I think it should: 'n' parent processes running under mpi_comm_world, each one spawning its own 'm' child processes (each child group works together nicely, returning the expected result for an mpi_allreduce call).

Now, as I mentioned before, the apps I want to run in the spawned processes are third party mpi apps, and I don't think it will be possible to exchange messages with them from my app. So, how do I tell when the spawned processes have finished running? All I have to work with is the intercommunicator returned from the mpi_comm_spawn call...

Alex

2014-12-12 2:42 GMT-02:00 Alex A. Schmidt <a...@ufsm.br>:

Gilles,

Well, yes, I guess....

I'll do tests with the real third party apps and let you know. These are huge quantum chemistry codes (dftb+, siesta and Gaussian) which greatly benefit from a parallel environment. My code is just a front end to use those, but since we have a lot of data to process, it also benefits from a parallel environment.

Alex

2014-12-12 2:30 GMT-02:00 Gilles Gouaillardet <gilles.gouaillar...@iferc.org>:

Alex,

Just to make sure ... this is the behavior you expected, right?

Cheers,

Gilles

On 2014/12/12 13:27, Alex A. Schmidt wrote:

Gilles,

Ok, very nice!

When I execute

    do rank=1,3
       call MPI_Comm_spawn('hello_world',' ',5,MPI_INFO_NULL,rank,MPI_COMM_WORLD,my_intercomm,MPI_ERRCODES_IGNORE,status)
    enddo

I do get 15 instances of the 'hello_world' app running: 5 for each parent rank 1, 2 and 3.

Thanks a lot, Gilles.

Best regards,

Alex

2014-12-12 1:32 GMT-02:00 Gilles Gouaillardet <gilles.gouaillar...@iferc.org>:

Alex,

Just ask MPI_Comm_spawn to start (up to) 5 tasks via the maxprocs parameter:

    int MPI_Comm_spawn(char *command, char *argv[], int maxprocs, MPI_Info info,
                       int root, MPI_Comm comm, MPI_Comm *intercomm,
                       int array_of_errcodes[])

    INPUT PARAMETERS
        maxprocs - maximum number of processes to start (integer, significant only at root)

Cheers,

Gilles
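A side note on the loop quoted above: with comm = MPI_COMM_WORLD, MPI_Comm_spawn is collective over all parent ranks, so every rank must take part in every call and only the root's arguments matter, which is why three calls with maxprocs = 5 yield 15 children. If instead each parent rank should drive its own group of children independently (where the thread eventually ends up), the spawning communicator can simply be MPI_COMM_SELF, or a single-process communicator obtained with MPI_Comm_split using the rank as the color, as George suggests earlier in the thread. A minimal sketch, with 'hello_world' standing in for the real application and the disconnect pairing with the child-side sketch shown earlier:

    program spawn_per_rank
       use mpi
       implicit none
       integer :: ierr, rank, my_comm, intercomm

       call MPI_INIT(ierr)
       call MPI_COMM_RANK(MPI_COMM_WORLD, rank, ierr)

       ! One single-process communicator per rank: either duplicate
       ! MPI_COMM_SELF or split MPI_COMM_WORLD with a distinct color per rank.
       call MPI_COMM_SPLIT(MPI_COMM_WORLD, rank, 0, my_comm, ierr)

       ! Each parent rank now spawns its own group of 5 children independently
       ! of the other ranks (maxprocs = 5 plays the role of "mpirun -n 5").
       call MPI_COMM_SPAWN('hello_world', MPI_ARGV_NULL, 5, MPI_INFO_NULL, 0, &
                           my_comm, intercomm, MPI_ERRCODES_IGNORE, ierr)

       ! Assumes the children disconnect their parent communicator as well.
       call MPI_COMM_DISCONNECT(intercomm, ierr)
       call MPI_COMM_FREE(my_comm, ierr)
       call MPI_FINALIZE(ierr)
    end program spawn_per_rank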
On 2014/12/12 12:23, Alex A. Schmidt wrote:

Hello Gilles,

Thanks for your reply. The "env -i PATH=..." stuff seems to work!!!

    call system("sh -c 'env -i PATH=/usr/lib64/openmpi/bin:/bin mpirun -n 2 hello_world' ")

did produce the expected result with a simple openmpi "hello_world" code I wrote.

It might be harder though with the real third party app I have in mind. And I realize getting past a job scheduler with this approach might not work at all...

I have looked at the MPI_Comm_spawn call, but I failed to understand how it could help here. For instance, can I use it to launch an mpi app with the option "-n 5"?

Alex

2014-12-12 0:36 GMT-02:00 Gilles Gouaillardet <gilles.gouaillar...@iferc.org>:

Alex,

Can you try something like

    call system("sh -c 'env -i /.../mpirun -np 2 /.../app_name'")

-i starts with an empty environment. That being said, you might need to set a few environment variables manually:

    env -i PATH=/bin ...

And that being also said, this "trick" could be just a bad idea: you might be using a scheduler, and if you empty the environment, the scheduler will not be aware of the "inside" run.

On top of that, invoking system might fail depending on the interconnect you use.

Bottom line, I believe Ralph's reply is still valid, even if five years have passed: changing your workflow, or using MPI_Comm_spawn, is a much better approach.

Cheers,

Gilles

On 2014/12/12 11:22, Alex A. Schmidt wrote:

Dear OpenMPI users,

Regarding this previous post <http://www.open-mpi.org/community/lists/users/2009/06/9560.php> from 2009, I wonder if the reply from Ralph Castain is still valid. My need is similar but simpler: to make a system call from an openmpi fortran application to run a third party openmpi application. I don't need to exchange mpi messages with the application; I just need to read the resulting output file generated by it. I have tried to do the following system call from my fortran openmpi code:

    call system("sh -c 'mpirun -n 2 app_name'")

but I get

    **********************************************************
    Open MPI does not support recursive calls of mpirun
    **********************************************************

Is there a way to make this work?
Best regards,

Alex
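To close the loop on this original question: the MPI_Comm_spawn route recommended above amounts to replacing the failing system() call with something like the sketch below ('app_name' is the placeholder from the original post; completion detection and per-rank spawning are covered by the earlier sketches in the thread):

    program spawn_instead_of_system
       use mpi
       implicit none
       integer :: ierr, intercomm

       call MPI_INIT(ierr)

       ! Replaces: call system("sh -c 'mpirun -n 2 app_name'")
       ! maxprocs = 2 plays the role of "-n 2"; no nested mpirun is involved.
       call MPI_COMM_SPAWN('app_name', MPI_ARGV_NULL, 2, MPI_INFO_NULL, 0, &
                           MPI_COMM_SELF, intercomm, MPI_ERRCODES_IGNORE, ierr)

       ! As discussed above, knowing when the children are done still requires
       ! their cooperation (MPI_COMM_GET_PARENT + MPI_COMM_DISCONNECT on their side).
       call MPI_COMM_DISCONNECT(intercomm, ierr)

       call MPI_FINALIZE(ierr)
    end program spawn_instead_of_system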