+1. Also, not all Ethernet switches are created equal -- particularly commodity 1Gb Ethernet switches. I've seen plenty of crappy Ethernet switches rated for 1Gb that could not reach that speed under load.
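One way to check is to measure what the switch actually delivers between two of the boards -- a tool like NetPIPE or iperf will do it, or a minimal MPI ping-pong like the sketch below (not from anyone's setup in this thread; the message size, iteration count, and hostfile name are just placeholders). On a healthy GbE link you should see somewhere around 0.9 Gbit/s; significantly less, especially with other traffic on the switch, points at the switch or NIC/driver settings rather than at Open MPI.

#include <mpi.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Minimal two-rank ping-pong bandwidth check between two nodes. */
int main(int argc, char **argv)
{
    const int msg_bytes = 4 * 1024 * 1024;   /* 4 MB per message */
    const int iters     = 100;
    int rank, size, i;
    char *buf;
    double t0, t1;

    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    MPI_Comm_size(MPI_COMM_WORLD, &size);
    if (size != 2) {
        if (rank == 0) fprintf(stderr, "Run with exactly 2 ranks\n");
        MPI_Abort(MPI_COMM_WORLD, 1);
    }

    buf = malloc(msg_bytes);
    memset(buf, 0, msg_bytes);

    MPI_Barrier(MPI_COMM_WORLD);
    t0 = MPI_Wtime();
    for (i = 0; i < iters; i++) {
        if (rank == 0) {
            MPI_Send(buf, msg_bytes, MPI_BYTE, 1, 0, MPI_COMM_WORLD);
            MPI_Recv(buf, msg_bytes, MPI_BYTE, 1, 0, MPI_COMM_WORLD, MPI_STATUS_IGNORE);
        } else {
            MPI_Recv(buf, msg_bytes, MPI_BYTE, 0, 0, MPI_COMM_WORLD, MPI_STATUS_IGNORE);
            MPI_Send(buf, msg_bytes, MPI_BYTE, 0, 0, MPI_COMM_WORLD);
        }
    }
    t1 = MPI_Wtime();

    if (rank == 0) {
        /* each iteration moves msg_bytes in each direction */
        double gbits = (2.0 * iters * msg_bytes * 8.0) / (t1 - t0) / 1e9;
        printf("Average throughput: %.2f Gbit/s\n", gbits);
    }

    free(buf);
    MPI_Finalize();
    return 0;
}

Compile and run with one rank on each of two nodes, e.g.:

mpicc pingpong.c -o pingpong
mpirun -np 2 --bynode --hostfile hosts ./pingpong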
On Jul 10, 2012, at 10:47 AM, Ralph Castain wrote:

> I suspect it mostly reflects communication patterns. I don't know anything
> about Saturne, but shared memory is a great deal faster than TCP, so the more
> processes sharing a node the better. You may also be hitting some natural
> boundary in your model - perhaps with 8 processes/node you wind up with more
> processes that cross the node boundary, further increasing the communication
> requirement.
>
> Do things continue to get worse if you use all 4 nodes with 6 processes/node?
>
>
> On Jul 10, 2012, at 7:31 AM, Dugenoux Albert wrote:
>
>> Hi.
>>
>> I have recently built a cluster upon a Dell PowerEdge server with a Debian
>> 6.0 OS. This server is composed of 4 system boards, each with 2 hexacore
>> processors, so 12 cores per system board. The boards are linked with a local
>> Gbit switch.
>>
>> In order to parallelize the software Code Saturne, which is a CFD solver, I
>> have configured the cluster so that there is a PBS server/mom on 1 system
>> board and a mom on each of the 3 other boards. This gives 48 cores dispatched
>> over 4 nodes of 12 CPUs. Code Saturne is compiled against openmpi 1.6.
>>
>> When I launch a simulation using 2 nodes with 12 cores each, the elapsed
>> time is good and the network traffic is not full. But when I launch the same
>> simulation using 3 nodes with 8 cores each, the elapsed time is 5 times the
>> previous one. In both cases I use 24 cores, and the network does not seem to
>> be saturated.
>>
>> I have tested several configurations: binaries in the local file system or
>> on NFS. The results are the same. I have visited several forums (in
>> particular http://www.open-mpi.org/community/lists/users/2009/08/10394.php)
>> and read lots of threads, but as I am not an expert at clusters, I presently
>> do not see what is wrong!
>>
>> Is it a problem in the configuration of PBS (I installed it from the deb
>> packages), a subtle compilation option of openMPI, or a bad network
>> configuration?
>>
>> Regards.
>>
>> B. S.

--
Jeff Squyres
jsquy...@cisco.com
For corporate legal information go to:
http://www.cisco.com/web/about/doing_business/legal/cri/
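P.S. If you try Ralph's suggestion of 4 nodes with 6 processes each, Open MPI 1.6 lets you fix the layout explicitly rather than relying on whatever PBS hands out, along these lines (the hostfile name and the cs_solver executable are assumptions -- substitute whatever your Code Saturne run script actually launches):

mpirun -np 24 -npernode 6 --hostfile hosts ./cs_solver

Repeating the 2x12 and 3x8 runs with the same explicit mapping should make it easier to see whether the slowdown tracks the amount of off-node communication.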