1.  MCA BTL parameters
With "-mca btl openib,self", both messages between the two Cell processors
on one QS22 and messages between two QS22s go through IB.

With "-mca btl openib,sm,self", messages within one QS22 go through shared
memory, and messages between QS22s go through IB.
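
As a rough illustration of the two settings above, the following sketch only assembles and prints the two mpirun command lines; it does not launch anything. The host names qs22a and qs22b, the process count, and the executable ./foo are placeholders, not taken from the original setup.

```shell
#!/bin/sh
# Placeholder values (assumptions, adjust to your cluster).
HOSTS="qs22a,qs22b"   # two hypothetical QS22 blades
NP=4                  # hypothetical process count
APP="./foo"           # hypothetical executable

# IB only: intra-blade and inter-blade traffic both use openib.
CMD_IB_ONLY="mpirun -np $NP -H $HOSTS -mca btl openib,self $APP"

# Shared memory within a blade, IB between blades.
CMD_WITH_SM="mpirun -np $NP -H $HOSTS -mca btl openib,sm,self $APP"

# Dry run: print the commands so they can be inspected or pasted.
echo "$CMD_IB_ONLY"
echo "$CMD_WITH_SM"
```

Running the same benchmark once with each command line is one way to compare the two paths for your own message sizes.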

Depending on the message size and other MCA parameters, message passing
over shared memory is not guaranteed to be faster than over IB. E.g., the
bandwidth for a 64KB message is 959 MB/s on shared memory and 694 MB/s on
IB; the bandwidth for a 4MB message is 539 MB/s on shared memory and
1092 MB/s on IB. The shared-memory bandwidth for 4MB messages may be higher
if you tune some MCA parameters.
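
To see which parameters are available for tuning, one option is to ask ompi_info for the parameters of each BTL component. The sketch below assumes the "ompi_info --param <type> <component>" form used by the Open MPI 1.2/1.3 series; the helper name show_btl_params is made up for this example, and it falls back to printing the command when ompi_info is not installed.

```shell
#!/bin/sh
# Hypothetical helper: list the MCA parameters of one BTL component.
show_btl_params() {
    component="$1"
    if command -v ompi_info >/dev/null 2>&1; then
        # Assumed syntax for Open MPI 1.2/1.3; newer releases may differ.
        ompi_info --param btl "$component"
    else
        echo "ompi_info not found; would run: ompi_info --param btl $component"
    fi
}

show_btl_params sm       # shared-memory BTL
show_btl_params openib   # InfiniBand BTL
```

Parameter names have changed between releases, so the output on your installation is the authority on what can be set.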

2.  mpi_paffinity_alone
  "mpi_paffinity_alone = 1" is not a good choice for the QS22. There are
two sockets, each with one physical Cell/B.E., on one QS22. Each Cell/B.E.
has two SMT threads, so there are four logical CPUs on one QS22. The CBE
Linux kernel maps logical CPUs 0 and 1 to socket 1 and logical CPUs 2 and 3
to socket 2. If mpi_paffinity_alone is set to 1, the two MPI instances
will be assigned to logical CPU 0 and CPU 1, both on socket 1. I believe
this is not what you want.

    A temporary solution to force the affinity on the QS22 is to use
"numactl". E.g., assuming the hostname is "qs22" and the executable is
"foo", the following command can be used:
    mpirun -np 1 -H qs22 numactl -c0 -m0 foo : -np 1 -H qs22 numactl -c1 -m1 foo
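
For more than one blade, the same idea could be packaged as a small wrapper script so that a single mpirun line suffices. This is only a sketch under assumptions: the script name bind_socket.sh is invented, it relies on the launcher exporting OMPI_COMM_WORLD_LOCAL_RANK (which I believe Open MPI 1.3 and later do, but 1.2.x does not), and it uses numactl's long options --cpunodebind and --membind in place of the short -c/-m forms in the command above.

```shell
#!/bin/sh
# Hypothetical wrapper (bind_socket.sh): bind each MPI process to one QS22
# socket (NUMA node) chosen from its node-local rank.

SOCKETS_PER_BLADE=2   # a QS22 has two sockets

# Map a local rank to a NUMA node: 0 -> 0, 1 -> 1, 2 -> 0, ...
node_for_rank() {
    echo $(( $1 % SOCKETS_PER_BLADE ))
}

# Build the numactl prefix that pins CPU and memory to that node.
numactl_prefix() {
    node=$(node_for_rank "$1")
    echo "numactl --cpunodebind=$node --membind=$node"
}

# When called with a program to run, exec it under numactl.
# Assumption: OMPI_COMM_WORLD_LOCAL_RANK is set by mpirun; default to 0.
if [ "$#" -gt 0 ]; then
    exec $(numactl_prefix "${OMPI_COMM_WORLD_LOCAL_RANK:-0}") "$@"
fi
```

A possible invocation would be "mpirun -np 2 -H qs22 ./bind_socket.sh ./foo", with two processes per blade.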

   In the long run, I hope the CBE kernel will export the CPU topology in
/sys so that PLPA can be used to force the processor affinity.

Best Regards,
Mi


"Lenny Verkhovsky" <lenny.verkhovsky@gmail.com>
Sent by: users-bounces@open-mpi.org
10/23/2008 05:48 AM
Please respond to: Open MPI Users <users@open-mpi.org>
To: "Open MPI Users" <us...@open-mpi.org>
cc:
Subject: Re: [OMPI users] Working with a CellBlade cluster

Hi,


If I understand you correctly, the most suitable way to do it is with the
paffinity support that we have in Open MPI 1.3 and the trunk.
However, the OS usually distributes processes evenly between sockets by
itself.

There is still no formal FAQ, for multiple reasons, but you can read how to
use it in the attached draft (a few parameter names have changed, so check
with ompi_info).

Shared memory is used between processes that share the same machine, and
openib is used between different machines (hostnames); no special MCA
params are needed.

Best Regards,
Lenny







On Sun, Oct 19, 2008 at 10:32 AM, Gilbert Grosdidier <gro...@mail.cern.ch>
wrote:
   Working with a CellBlade cluster (QS22), the requirement is to have one
  instance of the executable running on each socket of the blade (there are
  2 sockets). The application is of the 'domain decomposition' type, and
  each instance is required to often send/receive data with both the remote
  blades and the neighbor socket.

   Question is: which specification must be used for the mca btl component
  to force 1) shmem type messages when communicating with this neighbor
  socket, while 2) using openib to communicate with the remote blades?
  Is '-mca btl sm,openib,self' suitable for this?

   Also, which debug flags could be used to crosscheck that the messages
  are _actually_ going thru the right channel for a given channel, please?

   We are currently using OpenMPI 1.2.5 shipped with RHEL5.2 (ppc64).
  Which version do you think is currently the most optimised for these
  processors and problem type? Should we go towards OpenMPI 1.2.8 instead?
  Or even try some OpenMPI 1.3 nightly build?

   Thanks in advance for your help,
   Gilbert.

  _______________________________________________
  users mailing list
  us...@open-mpi.org
  http://www.open-mpi.org/mailman/listinfo.cgi/users
(See attached file: RANKS_FAQ.doc)

Attachment: RANKS_FAQ.doc
Description: MS-Word document
