Sorry for the delay in replying -- you sent this right before many of us left for Europe for a conference and subsequent OMPI engineering meetings. I'm just now getting to much of the list mail that has piled up since then...

What you describe is darn weird.  :-(

I know this is probably not the answer you're hoping for, but: is there any chance you can try upgrading to a more recent version of OMPI? Also, this may be a dumb question, but just to be sure: did you run ompi_info and ensure that you have an openib BTL component installed?
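For example, something along these lines will show which BTL components are installed (the exact output format varies by version):

    ompi_info | grep btl

If the openib component is present, you should see an "MCA btl: openib" line in that output; "ompi_info --param btl openib" will additionally list its parameters.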

FWIW, we do not yet have a "positive ACK" way to confirm which networks you're using (I have an open ticket about it for v1.3...). OMPI will, however, give you a negative ACK if you're *not* using a high-speed network that it was configured for. Specifically, if you have an openib BTL installed and it is not used because it can't find any active HCA ports, then the openib BTL will complain.

You can also force the use of specific networks with the "btl" MCA parameter, such as:

    mpirun --mca btl openib,self ...

Then, if openib cannot be used, the run will likely barf because it won't be able to establish MPI communications.
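If your version supports it, you can also raise the BTL framework's verbosity to see a bit more about which components are selected at startup (treat this as a rough sketch; the parameter and its output vary across versions):

    mpirun --mca btl openib,self --mca btl_base_verbose 30 ...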


On Sep 21, 2007, at 1:20 AM, Troy Telford wrote:

I'm running Intel's IMB benchmark over an InfiniBand cluster, though other benchmarks that Open MPI has handled fine in the past are also performing poorly.

The cluster has DDR IB, and the fabric isn't seeing the kind of symbol errors that indicate a bad fabric; (non-MPI) bandwidth tests over the IB fabric are in the expected range.
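(For that kind of raw-bandwidth sanity check, one option is the OFED perftest tools, assuming they're installed; roughly:

    ib_write_bw            # run first on one node, acting as the server
    ib_write_bw <server>   # then on a second node, pointing at the server

The exact tool and options are a matter of taste.)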

When the number of processes in IMB becomes greater than one node can handle, the bandwidth reported by IMB's 'Sendrecv' and 'Exchange' tests drops from 1.9 GB/sec (4 processes, or one process per core on the first node) to 20 MB/sec with 8 processes (and two nodes).

In other words, when we move from using shared memory and 'self' to an actual network interface, IMB reports _really_ lousy performance, 30x lower than what I've recorded for SDR IB. (For the same test on a different cluster using SDR IB and Open MPI, I've clocked ~650 MB/sec, quite a bit higher than 20 MB/sec.)

On this cluster, however, IMB's reported bandwidth remains the same from 2 to 36 nodes over DDR InfiniBand: ~20 MB/sec.

We've used the OFED 1.1.1 and 1.2 driver releases so far.

The command line is pretty simple:
mpirun -np 128 -machinefile <foo> -mca btl openib,sm,self ./IMB-MPI1

As far as I'm aware, our command line excludes TCP/IP (and hence Ethernet) from being used; yet we're seeing speeds that are far below the abilities of InfiniBand.

I've used Open MPI quite a bit, since before the 1.0 days; I've been dealing with IB for even longer. (And the guy I'm writing on behalf of has used Open MPI on large IB systems as well.)

Even when we specify that only the 'openib' module be used, we are seeing 20
MB/sec.

Oddly enough, the management Ethernet is 10/100, and 20 MB/sec is in the same ballpark as what IMB reports when 10/100 Ethernet is used.

We aren't receiving any error messages from Open MPI (as you normally would when part of the fabric is down).

So we're left a bit stumped: we're getting the speeds you would expect from 100 Mbit Ethernet, but we're specifying the IB interface and not receiving any errors from Open MPI. There isn't an unusual number of symbol errors on the IB fabric (i.e., error counts are low and not increasing), and the SM is up and operational.
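(For completeness: port error counters can be inspected with the infiniband-diags tools, assuming they're installed.)

    perfquery          # port counters for the local HCA port, including symbol errors
    ibcheckerrors      # sweep the fabric for ports with error counters above threshold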

One more tidbit that is probably insignificant, but I'll mention anyway: We are running IBM's GPFS via IPoIB, so there is a little bit of IB traffic from GPFS - which is also a configuration we've used with no problems in the past.

Any ideas on what I can do to verify that Open MPI is in fact using the IB fabric?
--
Troy Telford


--
Jeff Squyres
Cisco Systems
