+1.  Also, not all Ethernet switches are created equal -- particularly 
commodity 1 Gb Ethernet switches.  I've seen plenty of crappy Ethernet 
switches rated for 1 Gb/s that could not sustain that speed under load.
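
Before chasing software configuration, it can be worth measuring what the 
switch actually delivers between two of the boards.  Here is a minimal MPI 
ping-pong sketch (not from the original thread; the hostnames in the run line 
are placeholders for your own nodes).  On a healthy 1 Gb/s link you'd expect 
on the order of 110 MB/s:

    /* pingpong.c -- rough point-to-point bandwidth check between 2 ranks.
     * Build:  mpicc pingpong.c -o pingpong
     * Run:    mpirun -np 2 --host nodeA,nodeB ./pingpong
     */
    #include <mpi.h>
    #include <stdio.h>
    #include <stdlib.h>
    #include <string.h>

    int main(int argc, char **argv)
    {
        const int msg_bytes = 1 << 20;   /* 1 MiB per message */
        const int iters = 100;
        char *buf = malloc(msg_bytes);
        int rank, i;
        double t0, t1;

        memset(buf, 0, msg_bytes);
        MPI_Init(&argc, &argv);
        MPI_Comm_rank(MPI_COMM_WORLD, &rank);

        MPI_Barrier(MPI_COMM_WORLD);
        t0 = MPI_Wtime();
        for (i = 0; i < iters; i++) {
            if (rank == 0) {
                MPI_Send(buf, msg_bytes, MPI_CHAR, 1, 0, MPI_COMM_WORLD);
                MPI_Recv(buf, msg_bytes, MPI_CHAR, 1, 0, MPI_COMM_WORLD,
                         MPI_STATUS_IGNORE);
            } else if (rank == 1) {
                MPI_Recv(buf, msg_bytes, MPI_CHAR, 0, 0, MPI_COMM_WORLD,
                         MPI_STATUS_IGNORE);
                MPI_Send(buf, msg_bytes, MPI_CHAR, 0, 0, MPI_COMM_WORLD);
            }
        }
        t1 = MPI_Wtime();

        if (rank == 0) {
            /* 2 * iters messages moved in total (round trips) */
            double mbytes = 2.0 * iters * msg_bytes / 1.0e6;
            printf("~%.1f MB/s effective bandwidth\n", mbytes / (t1 - t0));
        }

        free(buf);
        MPI_Finalize();
        return 0;
    }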



On Jul 10, 2012, at 10:47 AM, Ralph Castain wrote:

> I suspect it mostly reflects communication patterns. I don't know anything 
> about Saturne, but shared memory is a great deal faster than TCP, so the more 
> processes sharing a node, the better. You may also be hitting some natural 
> boundary in your model - perhaps with 8 processes/node you wind up with more 
> process pairs communicating across node boundaries, further increasing the 
> communication requirement.
> 
> Do things continue to get worse if you use all 4 nodes with 6 processes/node?
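> 
> One quick way to check the layout (a minimal sketch, not Code Saturne 
> specific -- the file name is just an example) is to have every rank report 
> which host it landed on:
> 
>     /* whereami.c -- print the host each MPI rank runs on.
>      * Build:  mpicc whereami.c -o whereami
>      * Run:    mpirun -np 24 ./whereami     (inside your PBS allocation;
>      *         add -npernode 8 etc. to force a particular layout)
>      */
>     #include <mpi.h>
>     #include <stdio.h>
> 
>     int main(int argc, char **argv)
>     {
>         int rank, len;
>         char host[MPI_MAX_PROCESSOR_NAME];
> 
>         MPI_Init(&argc, &argv);
>         MPI_Comm_rank(MPI_COMM_WORLD, &rank);
>         MPI_Get_processor_name(host, &len);
>         printf("rank %d runs on %s\n", rank, host);
>         MPI_Finalize();
>         return 0;
>     }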
> 
> 
> On Jul 10, 2012, at 7:31 AM, Dugenoux Albert wrote:
> 
>> Hi.
>>  
>> I have recently built a cluster on a Dell PowerEdge server running Debian 
>> 6.0. The server is composed of 4 system boards, each with 2 hexa-core 
>> processors, giving 12 cores per board. The boards are linked by a local 
>> gigabit Ethernet switch.
>>  
>> In order to parallelize the CFD solver Code Saturne, I have configured the 
>> cluster with a PBS server/MOM on one system board and a MOM on each of the 
>> 3 other boards. This gives 48 cores dispatched across 4 nodes of 12 cores 
>> each. Code Saturne is compiled against Open MPI 1.6.
>>  
>> When I launch a simulation using 2 nodes with 12 cores each, the elapsed 
>> time is good and the network is not saturated. But when I launch the same 
>> simulation using 3 nodes with 8 cores each, the elapsed time is 5 times the 
>> previous one. In both cases I use 24 cores, and the network does not appear 
>> to be saturated.
>>  
>> I have tested several configurations: binaries on the local file system and 
>> on NFS, but the results are the same. I have visited several forums (in 
>> particular 
>> http://www.open-mpi.org/community/lists/users/2009/08/10394.php)
>> and read lots of threads, but as I am not an expert in clusters, I do not 
>> yet see what is wrong!
>>  
>> Is it a problem with the PBS configuration (I installed it from the .deb 
>> packages), a subtle Open MPI compilation option, or a bad network 
>> configuration?
>>  
>> Regards.
>>  
>> B. S.
> 


-- 
Jeff Squyres
jsquy...@cisco.com
For corporate legal information go to: 
http://www.cisco.com/web/about/doing_business/legal/cri/

