> The first time I did not notice that 16 cpus are twice as slow as 8.
> Are you really sure you did not mix things up?
> The other way around the timings would make perfect sense.
> If not, there is a problem with your 16 cpu simulation.
>
> What load imbalance is reported for the 8 cpu run?
>
> Berk

Sorry, I have been short on sleep lately and mixed up the 8-CPU and 16-CPU runs in my last email. There is no doubt that the 16-CPU run is the faster one. The 8-CPU run seems to have no problems.

md.log from the 16-CPU run:

    D O M A I N   D E C O M P O S I T I O N   S T A T I S T I C S

 av. #atoms communicated per step for force:  2 x 70598.6
 av. #atoms communicated per step for LINCS:  2 x 952.6

 Average load imbalance: 1500.0 %
 Part of the total run time spent waiting due to load imbalance: 187.5 %
 Steps where the load balancing was limited by -rdd, -rcon and/or -dds: X 0 % Y 0 %

NOTE: 187.5 % performance was lost due to load imbalance
      in the domain decomposition.

     R E A L   C Y C L E   A N D   T I M E   A C C O U N T I N G

 Computing:      Nodes    Number        G-Cycles        Seconds         %
-------------------------------------------------------------------------
 Domain decomp.     16   1000000      121878.350        36533.7       4.4
 Comm. coord.       16   5000001      247646.114        74233.2       8.9
 Neighbor search    16   1000001     1002091.741       300382.2      35.9
 Force              16   5000001      919401.942       275595.5      32.9
 Wait + Comm. F     16   5000001      299888.017        89893.0      10.7
 Write traj.        16      2001         249.523           74.8       0.0
 Update             16   5000001       20258.672         6072.6       0.7
 Constraints        16   5000001      105721.492        31690.6       3.8
 Comm. energies     16   5000001       51653.353        15483.4       1.9
 Rest               16                 22395.593         6713.2       0.8
-------------------------------------------------------------------------
 Total              16               2791184.797       836672.0     100.0
-------------------------------------------------------------------------

        Parallel run - timing based on wallclock.

               NODE (s)   Real (s)      (%)
       Time:  52292.000  52292.000    100.0
                       14h31:32
               (Mnbf/s)   (GFlops)   (ns/day)  (hour/ns)
Performance:    523.244     19.720     16.523      1.453
Finished mdrun on node 0 Tue Apr  6 05:09:47 2010

md.log from the 8-CPU run:

    D O M A I N   D E C O M P O S I T I O N   S T A T I S T I C S

 av. #atoms communicated per step for force:  2 x 46783.6
 av. #atoms communicated per step for LINCS:  2 x 664.8

 Average load imbalance: 9.8 %
 Part of the total run time spent waiting due to load imbalance: 3.5 %

     R E A L   C Y C L E   A N D   T I M E   A C C O U N T I N G

 Computing:      Nodes    Number        G-Cycles        Seconds         %
-------------------------------------------------------------------------
 Domain decomp.      8   1000000       69839.474        20934.6       2.7
 Comm. coord.        8   5000001      162704.625        48771.3       6.3
 Neighbor search     8   1000001      982919.098       294633.3      38.2
 Force               8   5000001      917463.617       275012.8      35.6
 Wait + Comm. F      8   5000001      273548.449        81997.1      10.6
 Update              8   5000001       20465.372         6134.6       0.8
 Constraints         8   5000001       96414.222        28900.5       3.7
 Comm. energies      8   5000001       28884.158         8658.1       1.1
 Rest                8            9223372036.855   2764736802.2  358286.2
-------------------------------------------------------------------------
 Total               8               2574303.046       771656.0     100.0
-------------------------------------------------------------------------

        Parallel run - timing based on wallclock.

               NODE (s)   Real (s)      (%)
       Time:  96457.000  96457.000    100.0
                       1d02h47:37
               (Mnbf/s)   (GFlops)   (ns/day)  (hour/ns)
Performance:    283.696     10.701      8.957      2.679
Finished mdrun on node 0 Mon Apr  5 01:36:18 2010

Thanks and regards,

lina
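
P.S. As a quick sanity check on the two runs, the speedup can be worked out from the Performance and Time lines of the two logs. The snippet below is only a sketch: the numbers are copied from the md.log excerpts in this message, and the names (runs, speedup, efficiency) are my own, not anything produced by GROMACS.

```python
# Values copied by hand from the two md.log excerpts in this message.
# The dictionary layout and function names are illustrative only.
runs = {
    8:  {"ns_per_day": 8.957,  "wall_s": 96457.0},
    16: {"ns_per_day": 16.523, "wall_s": 52292.0},
}

def speedup(base, scaled):
    """Throughput ratio of the larger run over the smaller one."""
    return runs[scaled]["ns_per_day"] / runs[base]["ns_per_day"]

def efficiency(base, scaled):
    """Speedup divided by the factor by which the CPU count grew."""
    return speedup(base, scaled) / (scaled / base)

s = speedup(8, 16)
e = efficiency(8, 16)
# Cross-check using wall-clock time instead of ns/day.
wall_ratio = runs[8]["wall_s"] / runs[16]["wall_s"]

print(f"speedup 8 -> 16 CPUs (ns/day): {s:.2f}x")
print(f"speedup 8 -> 16 CPUs (wall):   {wall_ratio:.2f}x")
print(f"parallel efficiency vs 8 CPUs: {e:.0%}")
```

Both ratios come out at about 1.84x, i.e. roughly 92 % efficiency going from 8 to 16 CPUs, which agrees with the 16-CPU run being the faster one.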