Jody,
Just to make sure: you did set processor affinity during your test,
right?
On Jul 13, 2009, at 9:28 PM, Klymak Jody wrote:
Hi Robert,
I got inspired by your question to run a few more tests. They are
crude, and I don't have actual CPU timing information because of a
library mismatch. However:
Setup:
Xserve, 2x2.26 GHz Quad-core Intel Xeon
6.0 GB memory, 1067 MHz DDR3
Mac OS X 10.5.6
Nodes are connected with a dedicated gigabit ethernet switch.
I'm running the MITgcm, a nonhydrostatic general circulation model.
The grid size is modest: 10x150x1600, so bear that in mind.
Message passing is across the 150x10 faces, and the exchanged region
is typically 3 grid cells deep in either direction. I'm not sure how
many variables are passed, but I would guess on the order of 24.
I turned off all the I/O I knew of to reduce disk latency.
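For a rough sense of the message sizes involved, the figures above can be turned into a back-of-the-envelope estimate. This is only a sketch: it assumes 8-byte reals and that all of the guessed 24 variables cross a face in a single exchange, neither of which I have checked against the model. The function name `halo_bytes` is made up for illustration.

```python
def halo_bytes(face_cells, depth, n_vars, bytes_per_value=8):
    """Bytes sent across one face, in one direction, for one exchange.

    bytes_per_value=8 assumes double-precision reals (an assumption).
    """
    return face_cells * depth * n_vars * bytes_per_value


face_cells = 150 * 10   # cells on the face that is exchanged
depth = 3               # grid cells exchanged in either direction
n_vars = 24             # guessed number of variables passed

size = halo_bytes(face_cells, depth, n_vars)
print(f"{size} bytes (~{size / 1e6:.2f} MB) per face, per direction")
```

Under those assumptions each exchange moves a bit under 1 MB per face in each direction, which is worth keeping in mind when comparing the single-node and two-node timings over gigabit ethernet.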
1 node, 8 processes: 54 minutes
1 node, 16 processes: 40 minutes (oversubscribed)
2 nodes, 16 processes: 29 minutes
So, oversubscribing was faster (in this case), but it didn't double
the speed. Certainly spreading the load to another node was much
faster.
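To put numbers on "faster but not double", the speedups relative to the 8-process run can be worked out directly from the three timings. This is a minimal sketch using only the wall-clock minutes quoted above; the dictionary labels and variable names are just for illustration.

```python
# Wall-clock run times in minutes, as reported above.
runs = {
    "1 node, 8 processes": 54,
    "1 node, 16 processes (oversubscribed)": 40,
    "2 nodes, 16 processes": 29,
}

baseline = runs["1 node, 8 processes"]

# Speedup = baseline time / run time; 2.0 would mean "twice as fast".
speedups = {label: baseline / minutes for label, minutes in runs.items()}

for label, s in speedups.items():
    print(f"{label}: {s:.2f}x")
```

That works out to roughly 1.35x for the oversubscribed node and 1.86x for two nodes, so doubling the process count across two machines came close to halving the run time, while oversubscribing one machine recovered only about a third.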
I haven't had a chance to implement Warner's suggestion of turning
hyperthreading off to see what effect that has on the speed.
Cheers, Jody