Unfortunately, I can hardly imagine where the performance problems are coming from. Usually I get more than 97% of the raw TCP performance with Open MPI. There are two parameters that can slightly improve the behavior: btl_tcp_rdma_pipeline_send_length and btl_tcp_min_rdma_pipeline_size. Please use "ompi_info --param btl tcp" to get more info about them. Try pushing them up a bit, and then try setting them to the maximum (UINT_MAX). If there is any device available other than the Chelsio one, then btl_tcp_bandwidth and btl_tcp_latency might prove interesting as well.
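
For reference, a minimal sketch of how those MCA parameters can be inspected and overridden from the command line; the values and the ./pingpong binary below are only illustrative placeholders, not tuned recommendations:

    # List the TCP BTL parameters with their current values and descriptions
    ompi_info --param btl tcp

    # Override the two pipeline parameters for a single run
    # (16 MB is just an example value; UINT_MAX would be 4294967295)
    mpirun -np 2 \
        --mca btl_tcp_rdma_pipeline_send_length 16777216 \
        --mca btl_tcp_min_rdma_pipeline_size 16777216 \
        ./pingpong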

Btw, can you run the Netpipe benchmark on this configuration please? Once compiled with MPI support and once with TCP. This would give us more comparable numbers (same benchmark in both cases).
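
In case it helps, a typical NetPIPE run in both modes looks roughly like this (assuming the standard NPtcp and NPmpi binaries built from the NetPIPE sources; the node names are placeholders):

    # TCP mode: start the receiver on nodeA, then the transmitter on nodeB
    ./NPtcp                    # on nodeA
    ./NPtcp -h nodeA           # on nodeB

    # MPI mode: the same benchmark over Open MPI
    mpirun -np 2 --host nodeA,nodeB ./NPmpi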

  george.

On Aug 18, 2008, at 9:36 PM, Steve Wise wrote:

Andy Georgi wrote:
Steve Wise wrote:
Are you using Chelsio's TOE drivers? Or just a driver from the distro?

We use the Chelsio TOE drivers.


Steve Wise wrote:
Ok.  Did you run their perftune.sh script?

Yes, otherwise we wouldn't get the 1.15 GB/s at the TCP level. We had ~800 MB/s before, primarily because the TCP buffers were too small.
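
For context, the kind of socket buffer limits such a tuning script raises typically looks something like the following; the exact sysctls and values touched by perftune.sh are assumptions here, shown only to illustrate the idea:

    # Illustrative 10 GbE socket buffer limits (not the actual perftune.sh settings)
    sysctl -w net.core.rmem_max=16777216
    sysctl -w net.core.wmem_max=16777216
    sysctl -w net.ipv4.tcp_rmem="4096 87380 16777216"
    sysctl -w net.ipv4.tcp_wmem="4096 65536 16777216"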

The difference of more than 200 MB/s between the 1.15 GB/s we get with iperf and the 930 MB/s we measured with an MPI ping-pong test is too large, I think. Something in Open MPI seems to be slowing it down.

Sounds like the TOE setup is tweaked to get decent sockets performance.

So OMPI experts, what is the overhead you see on other TCP links for OMPI BW tests vs native sockets TCP BW tests?

