Re: [gmx-users] Problems with REMD in Gromacs 4.6.3

gigo Fri, 12 Jul 2013 17:24:58 -0700

On 2013-07-12 20:00, Mark Abraham wrote:

On Fri, Jul 12, 2013 at 4:27 PM, gigo <g...@ibb.waw.pl> wrote:

Hi!


On 2013-07-12 11:15, Mark Abraham wrote:


What does --loadbalance do?



It balances the total number of processes across all allocated nodes.

OK, but using it means you are hostage to its assumptions about balance.

That's true, but as long as I do not try to use more resources than Torque gives me, everything is OK. The question is: what is the proper way to run multiple simulations in parallel with MPI, each further parallelized with OpenMP, when pinning fails? I could not find any other.

The thing is that mpiexec does not know that I want each replica to fork into 4 OpenMP threads. Thus, without this option and without affinities (more about that in a second), mpiexec starts too many replicas on some nodes, and Gromacs then complains about the overload, while some cores on other nodes are not used. It is possible to run my simulation like this:

mpiexec mdrun_mpi -v -cpt 20 -multi 144 -replex 2000 -cpi (without
--loadbalance for mpiexec and without -ntomp for mdrun)
Then each replica runs on 4 MPI processes (I allocate 4 times more cores than replicas, and mdrun sees it). The problem is that it is much slower than using OpenMP for each replica. I did not find any other way than --loadbalance in mpiexec and then -multi 144 -ntomp 4 in mdrun to use MPI and OpenMP at the same time on the Torque-controlled cluster.
That seems highly surprising. I have not yet encountered a job
scheduler that was completely lacking a "do what I tell you" layout
scheme. More importantly, why are you using #PBS -l nodes=48:ppn=12?

I think that Torque is very similar to all PBS-like resource managers in this regard. It actually does what I tell it to do. There are 12-core nodes, I ask for 48 of them, and I get them (a simple #PBS -l ncpus=576 does not work), end of story. Now, the program that I run is responsible for populating the resources that I got.

Surely you want 3 MPI processes per 12-core node?

Yes - I want each node to run 3 MPI processes. Preferably, I would like to run each MPI process on a separate node (spread over 12 cores with OpenMP), but I will not get that many resources. But again, without the --loadbalance hack I would not be able to properly populate the nodes...
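
The numbers discussed above can be checked with a small model. This is only a sketch: the two placement rules below are simplified assumptions (fill each node's slots in turn versus spread ranks evenly), not the actual behaviour of any particular mpiexec version, and the function names are made up for illustration. The node, core, replica and thread counts are the ones quoted in this thread.

```python
# Illustrative model only: the placement rules are simplified assumptions,
# not a description of what a specific mpiexec implementation does.

def fill_by_slot(n_ranks, n_nodes, slots_per_node):
    """Assumed 'by slot' placement: fill all slots of a node, then move on."""
    counts = [0] * n_nodes
    for rank in range(n_ranks):
        node = (rank // slots_per_node) % n_nodes
        counts[node] += 1
    return counts

def spread_evenly(n_ranks, n_nodes):
    """Assumed 'balanced' placement: same number of ranks on every node."""
    base, extra = divmod(n_ranks, n_nodes)
    return [base + (1 if i < extra else 0) for i in range(n_nodes)]

def threads_per_node(rank_counts, threads_per_rank):
    """Total OpenMP threads each node ends up running."""
    return [c * threads_per_rank for c in rank_counts]

NODES, CORES_PER_NODE = 48, 12   # from "#PBS -l nodes=48:ppn=12"
REPLICAS, NTOMP = 144, 4         # from "-multi 144 -ntomp 4"

by_slot = threads_per_node(fill_by_slot(REPLICAS, NODES, CORES_PER_NODE), NTOMP)
balanced = threads_per_node(spread_evenly(REPLICAS, NODES), NTOMP)

print("by slot : max threads on a node =", max(by_slot),
      "| idle nodes =", by_slot.count(0))
print("balanced: max threads on a node =", max(balanced),
      "| idle nodes =", balanced.count(0))
```

Under these assumptions, the by-slot model puts 12 replicas (48 threads) on each of the first 12 nodes and leaves 36 nodes idle, while the even spread gives 3 replicas (12 threads) on every 12-core node, which is the layout wanted here.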

What do the .log files say about
OMP_NUM_THREADS, thread affinities, pinning, etc?
Each replica logs:
"Using 1 MPI process
Using 4 OpenMP threads",
That is correct. As I said, the threads are forked, but 3 out of 4 don't do anything, and the simulation does not progress at all.

About affinities Gromacs says:
"Can not set thread affinities on the current platform. On NUMA systems this can cause performance degradation. If you think your platform should support setting affinities, contact the GROMACS developers."

Well, the "current platform" is a normal x86_64 cluster, but all information about resources is passed by Torque to the OpenMPI-linked Gromacs. Can it be that mdrun sees the resources allocated by Torque as one big pool of CPUs and misses the information about the node topology?
mdrun gets its processor topology from the MPI layer, so that is where
you need to focus. The error message confirms that GROMACS sees things
that seem wrong.
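
One way to look at what the MPI layer hands to each process is to print the CPU set the process is allowed to run on. The sketch below is only a guess at a useful diagnostic, not something mdrun itself does: it uses Python's os.sched_getaffinity, which is available on Linux only, and the helper names are hypothetical.

```python
import os

def allowed_cpus():
    """Return the set of CPU ids this process may run on, or None if unknown."""
    if hasattr(os, "sched_getaffinity"):   # present on Linux only
        return os.sched_getaffinity(0)     # 0 means "the calling process"
    return None

def describe_binding(allowed, total):
    """Summarise whether the process is restricted to a subset of the CPUs."""
    if allowed is None:
        return "affinity mask not available on this platform"
    if total is not None and len(allowed) < total:
        return "restricted to %d of %d CPUs" % (len(allowed), total)
    return "not restricted (%d CPUs allowed)" % len(allowed)

print(describe_binding(allowed_cpus(), os.cpu_count()))
```

Started through the same launcher and batch script as mdrun_mpi, a script like this would show whether each process is already confined to a subset of a node's cores before GROMACS tries to set its own affinities. Whether that is related to the message above is an assumption that would need checking.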

Thank you, I will take a look. But the first thing I want to do is find the reason why Gromacs 4.6.3 is not able to run on my (slightly weird, I admit) setup, while 4.6.2 does it very well.

Best,

Grzegorz
