Ralph Castain wrote:
> Bottom line for users: the results remain the same. If no other process
> wants time, you'll continue to see near 100% utilization even if we yield
> because we will always poll for some time before deciding to yield.

Not surprisingly, I am seeing this with recv/send too, at least when nothing else is running. This is true even though all workers are on different nodes (so there is no need for a shared memory connection between them).

Is there a tool in Open MPI that will reveal how much "spin time" the processes are using? The previous version of the program I'm currently working on used PVM, and for that implementation gstat, top, etc. gave a good idea of the percent activity on the compute nodes. Not so here.

At the moment our cluster is heterogeneous, with 3 nodes about 3X faster than the other 20. Because of a lack of load balancing (that's what I am trying to address now), the fast nodes must be idle around 60% of the time, since they finish their tasks long before the other nodes. But I can't see it. Can you?

Here are the relevant columns from one gstat reading. The idle values jump around between machines with no apparent pattern. The 3 faster nodes are 02, 05, and 15, but there is no way to tell that from this data:

   [  User,  Nice, System,  Idle,   Wio]
01 [  49.7,   0.0,   50.3,   0.0,   0.0]
02 [  41.4,   0.0,   58.6,   0.0,   0.0]
03 [  43.2,   0.0,   49.7,   7.0,   0.0]
04 [  38.8,   0.0,   46.0,  15.2,   0.0]
05 [  38.6,   0.0,   46.4,  15.0,   0.0]
06 [  48.3,   0.0,   51.7,   0.0,   0.0]
07 [  38.5,   0.0,   46.6,  14.9,   0.0]
08 [  43.8,   0.0,   51.3,   4.8,   0.0]
09 [  44.9,   0.0,   48.8,   6.3,   0.0]
10 [  48.9,   0.0,   49.1,   2.0,   0.0]
11 [  50.7,   0.0,   49.3,   0.0,   0.0]
12 [  46.8,   0.0,   53.2,   0.0,   0.0]
13 [  48.4,   0.0,   51.6,   0.0,   0.0]
14 [  44.2,   0.0,   48.2,   7.6,   0.0]
15 [  43.3,   0.0,   56.7,   0.0,   0.0]
16 [  44.7,   0.0,   50.3,   5.0,   0.0]
17 [  42.8,   0.0,   57.2,   0.0,   0.0]
18 [  50.7,   0.0,   49.3,   0.0,   0.0]
19 [  46.9,   0.0,   45.2,   7.9,   0.0]
20 [  46.0,   0.0,   48.9,   5.1,   0.0]

top is even less help: it just shows the worker process on each node at >98% CPU.

Thanks,

David Mathog
mat...@caltech.edu
Manager, Sequence Analysis Facility, Biology Division, Caltech
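
Since polling keeps the CPU counters pinned near 100%, one rough way to see the "spin time" is to measure it inside the program: record the wall-clock time spent in each blocking call and compare it with the total elapsed time. The sketch below is only an illustration of that idea, not an Open MPI facility. `WaitTimer`, `timed`, and `wait_fraction` are hypothetical names, and `time.sleep` stands in for a blocking receive.

```python
import time


class WaitTimer:
    """Accumulates wall-clock time spent inside blocking calls.

    Hypothetical helper for illustration; not part of any MPI library.
    """

    def __init__(self, clock=time.perf_counter):
        self._clock = clock
        self._start = clock()   # when measurement began
        self.waited = 0.0       # total seconds spent in timed calls

    def timed(self, blocking_call, *args, **kwargs):
        """Run blocking_call and add its duration to the waited total."""
        t0 = self._clock()
        try:
            return blocking_call(*args, **kwargs)
        finally:
            self.waited += self._clock() - t0

    def wait_fraction(self):
        """Fraction of elapsed wall-clock time spent in timed calls."""
        elapsed = self._clock() - self._start
        return self.waited / elapsed if elapsed > 0 else 0.0


if __name__ == "__main__":
    timer = WaitTimer()
    for _ in range(3):
        # Stand-in for real work (assumption: a busy loop as "compute").
        sum(i * i for i in range(100_000))
        # Stand-in for a blocking receive that polls while waiting.
        timer.timed(time.sleep, 0.05)
    print(f"waited {timer.waited:.3f} s, "
          f"fraction {timer.wait_fraction():.1%}")
```

Each worker could report its own wait fraction at the end of a run. On a cluster like the one described, the faster nodes would be expected to show a much larger fraction than the slower ones, even though top reports every worker as fully busy.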