Hi,
Hardware setup:
+ Each node in our cluster has 4 NICs; that is why I called it "4 links".
+ All nodes are connected through 4 switches (1 Gb Ethernet switches).
+ Each node has 4 GB of memory.
How can I check whether the memory access is NUMA or UMA?
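For example, would something like this be the right way to check? (Just a
minimal sketch, assuming libnuma is installed; build with
"gcc numa_check.c -o numa_check -lnuma". I also found "numactl --hardware",
which should list the memory nodes.)

/* numa_check.c - tell NUMA from UMA via libnuma */
#include <stdio.h>
#include <numa.h>

int main(void)
{
    if (numa_available() < 0) {
        /* kernel exposes no NUMA support: effectively UMA */
        printf("NUMA not available -- UMA memory access\n");
        return 0;
    }
    int nodes = numa_max_node() + 1;   /* highest node id + 1 */
    printf("%d NUMA node(s) reported\n", nodes);
    printf(nodes > 1 ? "=> NUMA\n" : "=> single node: UMA\n");
    return 0;
}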

Thank you!

________________________________
From: Jeff Squyres <jsquy...@cisco.com>
To: Open MPI Users <us...@open-mpi.org>
Sent: Thursday, April 23, 2009 8:23:52 PM
Subject: Re: [OMPI users] MPI_Bcast from OpenMPI

Very strange; 6 seconds for a 1MB broadcast over 64 processes is *way* too 
long.  Even 2.5 sec at 2MB seems too long -- what is your network speed?  I'm 
not entirely sure what you mean by "4 link" on your graph.

Without more information, I would first check your hardware setup to see if 
there's some kind of network buffering / congestion issue occurring.  Here's a 
total guess: your ethernet switch(es) are low quality (from an HPC perspective, 
at least) such that you're incurring congestion and/or retransmission at that 
size for some reason.
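If you want to rule that out quickly, a minimal ping-pong sketch like the one
below, run with the two ranks placed on different nodes, will tell you the
effective point-to-point speed (the hostnames and message size here are just
placeholders):

/* pingpong.c - rough point-to-point bandwidth check between ranks 0 and 1.
 * e.g.: mpirun -np 2 --host node0,node1 ./pingpong */
#include <stdio.h>
#include <stdlib.h>
#include <mpi.h>

int main(int argc, char **argv)
{
    enum { NBYTES = 1 << 20, NITER = 50 };   /* 1 MB messages, 50 round trips */
    int rank;
    char *buf = malloc(NBYTES);

    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    MPI_Barrier(MPI_COMM_WORLD);
    double t0 = MPI_Wtime();
    for (int i = 0; i < NITER; i++) {
        if (rank == 0) {
            MPI_Send(buf, NBYTES, MPI_CHAR, 1, 0, MPI_COMM_WORLD);
            MPI_Recv(buf, NBYTES, MPI_CHAR, 1, 0, MPI_COMM_WORLD,
                     MPI_STATUS_IGNORE);
        } else if (rank == 1) {
            MPI_Recv(buf, NBYTES, MPI_CHAR, 0, 0, MPI_COMM_WORLD,
                     MPI_STATUS_IGNORE);
            MPI_Send(buf, NBYTES, MPI_CHAR, 0, 0, MPI_COMM_WORLD);
        }
    }
    double t = MPI_Wtime() - t0;
    if (rank == 0)   /* 2 transfers of NBYTES per round trip */
        printf("~%.1f MB/s effective\n", 2.0 * NBYTES * NITER / t / 1e6);
    free(buf);
    MPI_Finalize();
    return 0;
}

A healthy 1 Gb link should give you somewhere around 110-120 MB/s; much less
than that points at the switch or TCP tuning rather than at MPI.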

You could also be running up against memory bus congestion (I assume you mean 4 
cores per node; are they NUMA or UMA?).  But that wouldn't account for the huge 
spike at 1MB.
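To pin down exactly where things go wrong, I'd also sweep MPI_Bcast across the
sizes around the spike with something like this (the size range and the
max-time reduction are just one way to do it):

/* bcast_sweep.c - time MPI_Bcast across message sizes around the spike.
 * Reports the slowest rank's time, since the collective is only complete
 * when every rank has the data. */
#include <stdio.h>
#include <stdlib.h>
#include <mpi.h>

int main(int argc, char **argv)
{
    int rank;
    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);

    for (int n = 16384; n <= 524288; n *= 2) {   /* doubles per broadcast */
        double *buf = malloc(n * sizeof(double));
        MPI_Barrier(MPI_COMM_WORLD);
        double t0 = MPI_Wtime();
        MPI_Bcast(buf, n, MPI_DOUBLE, 0, MPI_COMM_WORLD);
        double t = MPI_Wtime() - t0, tmax;
        MPI_Reduce(&t, &tmax, 1, MPI_DOUBLE, MPI_MAX, 0, MPI_COMM_WORLD);
        if (rank == 0)
            printf("%8d doubles: %.4f s\n", n, tmax);
        free(buf);
    }
    MPI_Finalize();
    return 0;
}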


On Apr 23, 2009, at 1:32 AM, shan axida wrote:

> Hi,
> One more question:
> I have executed MPI_Bcast() with 64 processes on a 16-node multi-link 
> Ethernet cluster.
> The result is shown in the file attached to this e-mail.
> What is going on at a message size of 131072 doubles?
> I have executed it many times, but the result is always the same.
> 
> THANK YOU!
> 
> <openmpi.pdf>


--Jeff Squyres
Cisco Systems

_______________________________________________
users mailing list
us...@open-mpi.org
http://www.open-mpi.org/mailman/listinfo.cgi/users
