On Nov 2, 2007, at 11:02 AM, himanshu khandelia wrote:
This question is about the use of the simulation package GROMACS. On our cluster (quad-core nodes), GROMACS does not scale well beyond 4 CPUs, so I wish to run two different simulations in one job, requesting 2 nodes (one simulation on each node) to best exploit the policies of our Maui scheduler.

So I am requesting two 4-CPU nodes on a cluster using PBS, and I want to run a separate simulation on each 4-CPU node. However, on 2 nodes, each simulation runs 50 to 100% slower than the same simulation in a job that requests only one node. I am guessing this is because Open MPI fails to assign all CPUs of the same node to one simulation; instead, CPUs from different nodes are being used to run each simulation.

This is what I have in the PBS script:

1.
########
mpirun -np 4 my-gromacs-executable-for-simulation-1 -np 4 &
mpirun -np 4 my-gromacs-executable-for-simulation-2 -np 4 &
In this case, Open MPI does not realize that you have executed 2 mpiruns and therefore assigns both the first and second job to the same 4 processors. Hence, they run at half speed (or slower) because they're competing for the same CPUs.
# (THE GROMACS EXECUTABLE DOES REQUIRE A REPEAT REQUEST FOR THE NUMBER OF PROCESSORS)
wait
########

Open MPI does have a mechanism whereby one can assign specific processes to specific nodes:
http://www.open-mpi.org/faq/?category=running#mpirun-scheduling

So I also tried both of the following in the PBS script, where the --bynode or the --byslot option is used:

2.
########
mpirun -np 4 --bynode my-gromacs-executable-for-simulation-1 -np 4 &
mpirun -np 4 --bynode my-gromacs-executable-for-simulation-2 -np 4 &
wait
########

3.
########
mpirun -np 4 --byslot my-gromacs-executable-for-simulation-1 -np 4 &
mpirun -np 4 --byslot my-gromacs-executable-for-simulation-2 -np 4 &
wait
########

But these methods also result in similar performance losses.
The same thing will happen here -- OMPI is unaware of the 2 mpiruns, and therefore schedules on the same first 4 nodes or slots.
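To illustrate: with a request like "-l nodes=2:ppn=4", Torque gives the whole job a single allocation, roughly what you would see in $PBS_NODEFILE (the hostnames below are hypothetical):

$ cat $PBS_NODEFILE
node01
node01
node01
node01
node02
node02
node02
node02

Each mpirun sees that identical list. With the default --byslot mapping both mpiruns land on node01's four slots; with --bynode both spread across the same two nodes. Either way the two simulations end up sharing CPUs.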
It sounds like you really want to run two Torque jobs, not one. Is there a reason you're not doing that?
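A minimal sketch of that two-job approach (the script and executable names here are placeholders, not from the thread): submit each simulation as its own single-node job and let Maui place them on separate nodes.

# run-sim1.pbs (run-sim2.pbs is identical apart from the executable name)
#PBS -l nodes=1:ppn=4
cd $PBS_O_WORKDIR
mpirun -np 4 my-gromacs-executable-for-simulation-1 -np 4

# submit both jobs:
qsub run-sim1.pbs
qsub run-sim2.pbs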
Failing that, you could do an end-run around Open MPI's Torque support and use the rsh launcher to precisely control where you launch jobs, but that's a bunch of trouble and not really how we intended the system to be used.
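If you do stay with a single 2-node job, one untested sketch of that end-run is to split the Torque allocation into two per-node Open MPI hostfiles and point each mpirun at one of them; the exact MCA parameter that selects the rsh launcher differs between Open MPI releases, so check your version before relying on this.

#PBS -l nodes=2:ppn=4
cd $PBS_O_WORKDIR

# build one Open MPI hostfile per allocated node (4 slots each)
sort -u $PBS_NODEFILE | sed -n 1p | awk '{print $1 " slots=4"}' > host1
sort -u $PBS_NODEFILE | sed -n 2p | awk '{print $1 " slots=4"}' > host2

# force the rsh/ssh launcher so the per-node hostfiles are honored
# (the framework is named "pls" in the 1.2 series, "plm" in later releases;
#  this also requires passwordless rsh/ssh to the compute nodes)
mpirun --mca pls rsh -np 4 --hostfile host1 my-gromacs-executable-for-simulation-1 -np 4 &
mpirun --mca pls rsh -np 4 --hostfile host2 my-gromacs-executable-for-simulation-2 -np 4 &
wait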
So how does one assign the CPUs properly using mpirun when running different simulations in the same PBS job?

Thank you for the help,
-Himanshu
--
Jeff Squyres
Cisco Systems