Chao Zhang wrote:
Dear GMX-Users,

I'm testing my fully hydrated 256-lipid system on Blue Gene. The purpose is to find the right 
number for "-npme", as mdrun cannot estimate it successfully by itself.

My problem is how to match the maximum allowed number of DD cells 
with a large number of CPU cores.

My simulation box size  is about 8x8x9nm^3, with normal LINCS parameter and 
dds=0.8. The log file said that the maximum allowed number of DD cells is 8x8x9.

Why are you setting -dds? -rcon and -rdd are also variables to play with... but if you get too "close to the bone" you can run into (e.g.) LINCS problems. A lipid-water system has inhomogeneous interaction density, so the DD load-balance needs to scale the starting guess for cells, and -dds will have a significant effect at high parallelization. See mdrun -h, and the manual and GROMACS 4 paper.

As far as I understand, DD assigns one core to one cell, so the maximum number of cores I 
can use in this case for the PP part is 8x8x9=576.

I then ran with 512 cores with -npme=128. My system runs without problem.
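
The counting above can be sketched in a few lines of Python. This is only an illustration, not GROMACS code: the helper names `max_pp_ranks` and `candidate_grids` are hypothetical, and it assumes one PP rank per DD cell, with the PP rank count equal to nx*ny*nz and each factor no larger than the limit reported in the log.

```python
from itertools import product

def max_pp_ranks(max_cells):
    """Upper bound on PP ranks, assuming one rank per DD cell."""
    nx, ny, nz = max_cells
    return nx * ny * nz

def candidate_grids(n_pp, max_cells):
    """All (nx, ny, nz) with nx*ny*nz == n_pp and each factor within its limit."""
    limits = [range(1, m + 1) for m in max_cells]
    return [g for g in product(*limits) if g[0] * g[1] * g[2] == n_pp]

limit = (8, 8, 9)          # maximum allowed DD cells from the log file
print(max_pp_ranks(limit))  # 576

# 512 cores in total with 128 PME ranks leaves 384 PP ranks.
for grid in candidate_grids(512 - 128, limit):
    print(grid)
```

Under these assumptions, 384 PP ranks fit as 6x8x8, 8x6x8 or 8x8x6, which is consistent with the 512-core run working.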

What if I want to use more cores?

Then I tried increasing "-dds" from 0.8 to 0.9, which raised "the maximum allowed number of DD cells" to 8x10x10.
This time I used 1024 cores in total with -npme=224, so the PP part has 
800 cores, which is within 10x10x10.

The system ran initially but crashed very soon with the warning "DD cell 2 1 4 
could only obtain 56 of the 57 atoms that are connected via constraints from the 
neighboring cells ...."

So the dilemma is that if I increase "-dds", I can meet the requirement 
for the maximum allowed number of DD cells, but I then violate the maximum length of constraints in 
LINCS.

Does this mean that for a relatively small system, it is not possible to use up 
to thousands of cores with domain decomposition?

Yes. Each cell takes responsibility for a subset of atoms, and then communicates them to the neighbouring cells that need to know about them. As the cells get smaller, the communication cost gets larger. GROMACS sets a number of semi-artificial constraints on the cell size with the above options. In practice there is a lower limit on DD cell size for a given system with the GROMACS 4 implementation, but you have to experiment to find it. Whether you derive any speed advantage from moving towards that limit will depend on the relative performance of your processors and network.
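
To get a feel for how small the cells become, here is a rough Python sketch that divides the box edges by the grid dimensions. It assumes the approximate 8x8x9 nm box quoted in the question and a uniform static grid; the function name `cell_edges` is hypothetical, and the real minimum cell size that mdrun enforces depends on the cut-offs, the constraint settings and dynamic load balancing, none of which are modelled here.

```python
def cell_edges(box_nm, grid):
    """Average DD cell edge length (nm) per dimension for a uniform grid."""
    return tuple(b / n for b, n in zip(box_nm, grid))

box = (8.0, 8.0, 9.0)  # approximate box size in nm, as quoted in the question

for grid in [(8, 8, 9), (8, 10, 10)]:
    edges = cell_edges(box, grid)
    print(grid, [round(e, 2) for e in edges])
```

With the 8x8x9 grid every cell is about 1 nm on each side, while the 8x10x10 grid shrinks two of the edges to about 0.8 and 0.9 nm, so each cell has less room to find all the atoms coupled to it by constraints.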

IBM's Blue Matter MD code is supposed to work down at around 1 atom per core, but GROMACS isn't built to do that.

I know that it makes more sense to use thousands of cores for a huge system, 
but if my purpose is simply to speed up the simulation, what should I do?

If your objective is increasing effective sampling, REMD with 16 replicas of 64 cores each (or similar) makes a lot of sense.

Mark
