Hi Chris,

thanks for the suggestions. There was a mistake in my previous mail: couple-moltype = SOL (for solvent), not "Protein_Chain_P". With that correction the load imbalance seems reasonable, because the water box is large (~9.0 nm). However, the problem persists and the performance loss is very high, so I have redone the calculations with these commands:

grompp -f md.mdp -c ../Run-02/confout.gro -t ../Run-02/state.cpt -p ../topo.top -n ../index.ndx -o md.tpr -maxwarn 1
mdrun -s md.tpr -o md

This is part of the md.mdp file:

; Run parameters
; define = -DPOSRES
integrator = md ;
nsteps = 1000 ;
dt = 0.002 ;
[..]
free_energy = yes ; /no
init_lambda = 0.9
delta_lambda = 0.0
couple-moltype = SOL ; solvent water
couple-lambda0 = vdw-q
couple-lambda1 = none
couple-intramol= yes

Result for the free energy calculation:

 Computing:         Nodes     Number     G-Cycles    Seconds       %
-----------------------------------------------------------------------
 Domain decomp.         8        126       22.050        8.3     0.1
 DD comm. load          8         15        0.009        0.0     0.0
 DD comm. bounds        8         12        0.031        0.0     0.0
 Comm. coord.           8       1001       17.319        6.5     0.0
 Neighbor search        8        127      436.569      163.7     1.1
 Force                  8       1001    34241.576    12840.9    87.8
 Wait + Comm. F         8       1001       19.486        7.3     0.0
 PME mesh               8       1001     4190.758     1571.6    10.7
 Write traj.            8          7        1.827        0.7     0.0
 Update                 8       1001       12.557        4.7     0.0
 Constraints            8       1001       26.496        9.9     0.1
 Comm. energies         8       1002       10.710        4.0     0.0
 Rest                   8                  25.142        9.4     0.1
-----------------------------------------------------------------------
 Total                  8               39004.531    14627.1   100.0
-----------------------------------------------------------------------
-----------------------------------------------------------------------
 PME redist. X/F        8       3003     3479.771     1304.9     8.9
 PME spread/gather      8       4004      277.574      104.1     0.7
 PME 3D-FFT             8       4004      378.090      141.8     1.0
 PME solve              8       2002       55.033       20.6     0.1
-----------------------------------------------------------------------

Parallel run - timing based on wallclock.

               NODE (s)   Real (s)      (%)
       Time:   1828.385   1828.385    100.0
                       30:28
               (Mnbf/s)   (GFlops)   (ns/day)  (hour/ns)
Performance:      3.115      3.223      0.095    253.689

I switched off only the free_energy keyword and reran the calculation. Now I get:

 Computing:         Nodes     Number     G-Cycles    Seconds       %
-----------------------------------------------------------------------
 Domain decomp.         8         77       10.975        4.1     0.6
 DD comm. load          8          1        0.001        0.0     0.0
 Comm. coord.           8       1001       14.480        5.4     0.8
 Neighbor search        8         78      136.479       51.2     7.3
 Force                  8       1001     1141.115      427.9    61.3
 Wait + Comm. F         8       1001       17.845        6.7     1.0
 PME mesh               8       1001      484.581      181.7    26.0
 Write traj.            8          5        1.221        0.5     0.1
 Update                 8       1001        9.976        3.7     0.5
 Constraints            8       1001       20.275        7.6     1.1
 Comm. energies         8        992        5.933        2.2     0.3
 Rest                   8                  19.670        7.4     1.1
-----------------------------------------------------------------------
 Total                  8                1862.552      698.5   100.0
-----------------------------------------------------------------------
-----------------------------------------------------------------------
 PME redist. X/F        8       2002       92.204       34.6     5.0
 PME spread/gather      8       2002      192.337       72.1    10.3
 PME 3D-FFT             8       2002      177.373       66.5     9.5
 PME solve              8       1001       22.512        8.4     1.2
-----------------------------------------------------------------------

Parallel run - timing based on wallclock.

               NODE (s)   Real (s)      (%)
       Time:     87.309     87.309    100.0
                       1:27
               (Mnbf/s)   (GFlops)   (ns/day)  (hour/ns)
Performance:    439.731     23.995      1.981     12.114
Finished mdrun on node 0 Mon Apr 4 16:52:04 2011

Luca

> If we accept your text at face value, then the simulation slowed down
> by a factor of 1500%, certainly not the 16% of the load balancing.
>
> Please let us know what version of gromacs and cut and paste your
> commands that you used to run gromacs (so we can verify that you ran
> on the same number of processors) and cut and paste a diff of the .mdp
> files (so that we can verify that you ran for the same number of steps).
>
> You might be correct about the slowdown, but let's rule out some other
> more obvious problems first.
>
> Chris.
>
> -- original message --
>
>
> Dear all,
> when I run a single free energy simulation
> I noticed that there is a loss of performance with respect to
> the normal MD
>
> free_energy = yes
> init_lambda = 0.9
> delta_lambda = 0.0
> couple-moltype = Protein_Chain_P
> couple-lambda0 = vdw-q
> couple-lambda0 = none
> couple-intramol= yes
>
> Average load imbalance: 16.3 %
> Part of the total run time spent waiting due to load imbalance: 12.2 %
> Steps where the load balancing was limited by -rdd, -rcon and/or -dds:
> X0 %
> Time: 1852.712 1852.712 100.0
>
> free_energy = no
> Average load imbalance: 2.7 %
> Part of the total run time spent waiting due to load imbalance: 1.7 %
> Time: 127.394 127.394 100.0
>
> It seems that the loss of performance is due in part to the load
> imbalance in the domain decomposition, however I tried to change
> these keywords without benefit
> Any comment is welcome.
>
> Thanks
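
For reference, the slowdown implied by the two mdrun timing reports in this mail can be worked out directly. The following is only a minimal sketch in Python: every number is copied from the reports above (8 nodes, 1001 steps each), while the dictionary keys and the helper name `ratios` are hypothetical labels chosen for illustration, not anything GROMACS provides.

```python
# Values copied from the two mdrun timing reports in this mail.
# Keys and helper names are illustrative only (not GROMACS identifiers).
fe_on = {
    "wall_s": 1828.385,          # Time: NODE (s), free_energy = yes
    "force_gcycles": 34241.576,  # "Force" row
    "pme_gcycles": 4190.758,     # "PME mesh" row
    "total_gcycles": 39004.531,  # "Total" row
}
fe_off = {
    "wall_s": 87.309,            # Time: NODE (s), free_energy switched off
    "force_gcycles": 1141.115,
    "pme_gcycles": 484.581,
    "total_gcycles": 1862.552,
}


def ratios(on, off):
    """Return on/off for every quantity present in both reports."""
    return {key: on[key] / off[key] for key in on}


slowdown = ratios(fe_on, fe_off)

# Share of the extra cycles that is spent in the "Force" row.
extra_total = fe_on["total_gcycles"] - fe_off["total_gcycles"]
extra_force = fe_on["force_gcycles"] - fe_off["force_gcycles"]
force_share = extra_force / extra_total

for key, value in slowdown.items():
    print(f"{key:>14s}: {value:6.2f}x slower with free_energy = yes")
print(f"Force accounts for {100 * force_share:.1f} % of the extra G-cycles")
```

With these inputs the wallclock ratio comes out at roughly 21x and the Force ratio at roughly 30x, so most of the extra time is in the force calculation itself rather than in the domain decomposition load balancing.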