I'm afraid the two solvers would be in the same comm_world if launched that way.
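(For illustration, a rough sketch that borrows the solver names and process counts from Gus's example below. The colon-separated MPMD form launches everything as one job, so both executables end up in a single MPI_COMM_WORLD; starting them with two separate mpiruns gives each solver its own MPI_COMM_WORLD.)

    # one MPMD launch: all 32 ranks share a single MPI_COMM_WORLD
    mpiexec -np 8 ./solver1 : -np 24 ./solver2

    # two separate launches: each solver gets its own MPI_COMM_WORLD
    mpirun -np 8  ./solver1 &
    mpirun -np 24 ./solver2 &
    wait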
Sent from my iPhone

> On Nov 27, 2013, at 11:58 AM, Gus Correa <g...@ldeo.columbia.edu> wrote:
>
> Hi Ola, Ralph
>
> I may be wrong, but I'd guess launching the two solvers
> in MPMD/MIMD mode would work smoothly with the Torque PBS_NODEFILE,
> in a *single* Torque job.
> If I understood Ola right, that is what he wants.
>
> Say, something like this (for one 32-core node):
>
> #PBS -l nodes=1:ppn=32
> ...
> mpiexec -np 8 ./solver1 : -np 24 ./solver2
>
> I am assuming the two executables never talk to each other, right?
> They solve the same problem with different methods, in a completely
> independent and "embarrassingly parallel" fashion, and could run
> concurrently.
>
> Is that right?
> Or did I misunderstand Ola's description, and they work in a staggered
> sequence with each other?
> [first s1, then s2, then s1 again, then s2 once more...]
> I am a bit confused by Ola's use of the words "loosely coupled" in his
> description, which might indicate cooperation to solve the same problem
> rather than independent work on two instances of the same problem.
>
> Ralph: Does the MPI model assume that MPMD/MIMD executables
> necessarily have to communicate with each other,
> or perhaps share a common MPI_COMM_WORLD?
> [I guess not.]
>
> Anyway, just a guess,
> Gus Correa
>
>> On 11/27/2013 10:23 AM, Ralph Castain wrote:
>> Are you wanting to run the solvers on different nodes within the
>> allocation? Or on different cores across all nodes?
>>
>> For different nodes, you can just use -host to specify which nodes you
>> want that specific mpirun to use; a hostfile should also be fine. The
>> FAQ's comment was aimed at people who were giving us the PBS_NODEFILE as
>> the hostfile - which could confuse older versions of OMPI into using the
>> rsh launcher instead of Torque. Remember that we have the relative node
>> syntax, so you don't actually have to name the nodes - that helps if you
>> want to execute batch scripts and won't know the node names in advance.
>>
>> For different cores across all nodes, you would need to use some binding
>> trickery that may not be in the 1.4 series, so you might need to update
>> to the 1.6 series. You have two options: (a) have Torque bind your
>> mpirun to specific cores (I believe it can do that), or (b) use
>> --slot-list to specify which cores that particular mpirun is to use. You
>> can then separate the two solvers but still run on all the nodes, if
>> that is a concern.
>>
>> HTH
>> Ralph
>>
>> On Wed, Nov 27, 2013 at 6:10 AM, <ola.widl...@se.abb.com> wrote:
>>
>> Hi,
>>
>> We have an in-house application where we run two solvers in a
>> loosely coupled manner: the first solver runs a timestep, then the
>> second solver does work on the same timestep, and so on. As the two
>> solvers never execute at the same time, we would like to run them
>> in the same allocation (launching mpirun once for each of them).
>> RAM is not an issue, so there should not be any risk of excessive
>> swapping degrading performance.
>>
>> We use openmpi-1.4.5 compiled with Torque integration. The Torque
>> integration means we do not give a hostfile to mpirun; it queries
>> Torque itself for the allocation info.
>>
>> Question:
>>
>> Can we force one of the solvers to run in a *subset* of the full
>> allocation? How do we do that? I read in the FAQ that providing a
>> hostfile to mpirun in this case (when it is not needed due to the
>> Torque integration) would cause a lot of problems...
>>
>> Thanks in advance,
>>
>> Ola
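A sketch of Ralph's first suggestion (different nodes within the allocation) applied to Ola's case: a single Torque job in which each solver is confined to its own node using the relative node syntax Ralph mentions, so the script never needs the actual node names. The node and process counts and the solver names are only illustrative, and the two solvers are assumed to coordinate outside MPI, as in Ola's description, so both mpiruns are simply started in the background.

    #PBS -l nodes=2:ppn=32
    cd $PBS_O_WORKDIR

    # +n0 / +n1 are Open MPI's relative node indices into the allocation;
    # both mpiruns still get the allocation itself from Torque, no hostfile needed
    mpirun -np 32 -host +n0 ./solver1 &
    mpirun -np 32 -host +n1 ./solver2 &
    wait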
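And a sketch of Ralph's second suggestion (different cores across all nodes) using --slot-list. The core ranges and process counts are again made up, the exact core-list syntax can differ between Open MPI versions, and, as Ralph notes, this kind of binding may require the 1.6 series rather than openmpi-1.4.5.

    #PBS -l nodes=2:ppn=32
    cd $PBS_O_WORKDIR

    # split the cores of every node between the two solvers:
    # solver1 gets cores 0-7 and solver2 gets cores 8-31 on each node
    mpirun -npernode 8  --slot-list 0-7  ./solver1 &
    mpirun -npernode 24 --slot-list 8-31 ./solver2 &
    wait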