Hi,

This question is about the use of a simulation package called GROMACS.
PS: On our cluster (quad-core nodes), GROMACS does not scale well beyond 4 CPUs. So I want to run two different simulations in a single job that requests 2 nodes (one simulation on each node), to best exploit the policies of our Maui scheduler.

I am requesting two 4-CPU nodes on the cluster using PBS, and I want to run a separate simulation on each 4-CPU node. However, in the 2-node job the speed of each simulation decreases (by 50 to 100%) compared with a simulation run in a job that requests only one node. My guess is that Open MPI fails to assign all CPUs of the same node to one simulation, and that CPUs from different nodes are being used to run each simulation instead.

This is what I have in the PBS script:

1.
########
mpirun -np 4 my-gromacs-executable-for-simulation-1 -np 4 &
mpirun -np 4 my-gromacs-executable-for-simulation-2 -np 4 &
# (The GROMACS executable does require a repeat request for the number of processors)
wait
########

Open MPI does have a mechanism for assigning specific processes to specific nodes:
http://www.open-mpi.org/faq/?category=running#mpirun-scheduling

So I also tried both of the following in the PBS script, using the --bynode or the --byslot option:

2.
########
mpirun -np 4 --bynode my-gromacs-executable-for-simulation-1 -np 4 &
mpirun -np 4 --bynode my-gromacs-executable-for-simulation-2 -np 4 &
wait
########

3.
########
mpirun -np 4 --byslot my-gromacs-executable-for-simulation-1 -np 4 &
mpirun -np 4 --byslot my-gromacs-executable-for-simulation-2 -np 4 &
wait
########

But these methods result in similar performance losses.

So how does one assign the CPUs properly with mpirun when running different simulations in the same PBS job?

Thank you for the help,
-Himanshu
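
For reference, one possible way to keep each mpirun on its own node is sketched below. It is untested on the cluster described above, and nothing here establishes that it removes the slowdown. It assumes the job runs under PBS/Torque (so that PBS_NODEFILE lists one line per allocated slot) and uses mpirun's -host option to restrict each launch to a single host. The node names nodeA and nodeB are made-up stand-ins used only when the script is tried outside a job, and the commands are printed rather than executed.

```shell
#!/bin/sh
# Sketch only: pin each simulation to one node of a 2-node PBS allocation.
# PBS_NODEFILE is set by PBS/Torque inside a job. The fallback file below is
# a hypothetical stand-in (nodeA, nodeB) so the sketch can run outside a job.
if [ -z "${PBS_NODEFILE:-}" ]; then
    PBS_NODEFILE=$(mktemp)
    printf 'nodeA\nnodeA\nnodeA\nnodeA\nnodeB\nnodeB\nnodeB\nnodeB\n' > "$PBS_NODEFILE"
fi

# The node file has one line per allocated slot; reduce it to unique hosts.
NODE1=$(sort -u "$PBS_NODEFILE" | sed -n '1p')
NODE2=$(sort -u "$PBS_NODEFILE" | sed -n '2p')

# Build one command per simulation, each restricted to a single host.
CMD1="mpirun -np 4 -host $NODE1 my-gromacs-executable-for-simulation-1 -np 4"
CMD2="mpirun -np 4 -host $NODE2 my-gromacs-executable-for-simulation-2 -np 4"

# Dry run: print the commands instead of launching them.
# In a real job, the two echo lines would become:
#   $CMD1 &
#   $CMD2 &
#   wait
echo "$CMD1"
echo "$CMD2"
```

Printing the commands first makes it easy to check in the job's output file that the two simulations were given different hosts before committing to a full run.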