Following your advice and that in the FAQ pages,
I have added the file
$(HOME)/.openmpi/mca-params.conf
with:
btl_mvapi_flags=6
mpi_leave_pinned=1
pml_ob1_leave_pinned_pipeline=1
mpool_base_use_mem_hooks=1
The parameter btl_mvapi_eager_limit gives the best results when set
to 8 KB or 16 KB.
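For example, the 16 KB setting corresponds to a line like the following in
the same file (the value is given in bytes):
btl_mvapi_eager_limit=16384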
The ping-pong test results are now:
LOOPS: 1000 BYTES: 4096 SECONDS: 0.085643 MBytes/sec: 95.652825
LOOPS: 1000 BYTES: 8192 SECONDS: 0.050893 MBytes/sec: 321.931400
LOOPS: 1000 BYTES: 16384 SECONDS: 0.106791 MBytes/sec: 306.842281
LOOPS: 1000 BYTES: 32768 SECONDS: 0.154873 MBytes/sec: 423.159259
LOOPS: 1000 BYTES: 65536 SECONDS: 0.250849 MBytes/sec: 522.513526
LOOPS: 1000 BYTES: 131072 SECONDS: 0.443162 MBytes/sec: 591.530910
LOOPS: 1000 BYTES: 262144 SECONDS: 0.827640 MBytes/sec: 633.473448
LOOPS: 1000 BYTES: 524288 SECONDS: 1.596701 MBytes/sec: 656.714101
LOOPS: 1000 BYTES: 1048576 SECONDS: 3.134974 MBytes/sec: 668.953554
LOOPS: 1000 BYTES: 2097152 SECONDS: 6.210786 MBytes/sec: 675.325785
LOOPS: 1000 BYTES: 4194304 SECONDS: 12.384103 MBytes/sec: 677.369053
LOOPS: 1000 BYTES: 8388608 SECONDS: 27.377714 MBytes/sec: 612.805580
which is exactly what we can also get with mvapich on the same network.
Since we do NOT have PCI-X hardware, I believe this is the maximum we can
get out of it.
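(As a side note, the MBytes/sec values above follow the usual ping-pong
accounting of 2 * LOOPS * BYTES / SECONDS / 1e6, i.e. the buffer is counted
in both directions. The exact code of the test is not included here, but a
minimal MPI ping-pong of this kind looks roughly like the sketch below.)

#include <mpi.h>
#include <stdio.h>
#include <stdlib.h>

/* Minimal ping-pong sketch (not the exact benchmark used above).
 * Rank 0 sends `bytes` to rank 1 and waits for the echo, `loops` times;
 * bandwidth is reported as 2 * loops * bytes / seconds / 1e6. */
int main(int argc, char **argv)
{
    int rank, loops = 1000;
    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    for (int bytes = 4096; bytes <= 8388608; bytes *= 2) {
        char *buf = malloc(bytes);
        MPI_Barrier(MPI_COMM_WORLD);
        double t0 = MPI_Wtime();
        for (int i = 0; i < loops; i++) {
            if (rank == 0) {
                MPI_Send(buf, bytes, MPI_BYTE, 1, 0, MPI_COMM_WORLD);
                MPI_Recv(buf, bytes, MPI_BYTE, 1, 0, MPI_COMM_WORLD,
                         MPI_STATUS_IGNORE);
            } else if (rank == 1) {
                MPI_Recv(buf, bytes, MPI_BYTE, 0, 0, MPI_COMM_WORLD,
                         MPI_STATUS_IGNORE);
                MPI_Send(buf, bytes, MPI_BYTE, 0, 0, MPI_COMM_WORLD);
            }
        }
        double secs = MPI_Wtime() - t0;
        if (rank == 0)
            printf("LOOPS: %d BYTES: %d SECONDS: %f MBytes/sec: %f\n",
                   loops, bytes, secs, 2.0 * loops * bytes / secs / 1e6);
        free(buf);
    }
    MPI_Finalize();
    return 0;
}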
Thanks a lot for your explanations on this tuning of Open MPI.
Best Regards,
Jean
George Bosilca wrote:
On Thu, 16 Mar 2006, Jean Latour wrote:
My questions are:
a) Is Open MPI doing TCP/IP over IB in this case? (I guess so)
If the path to the mvapi library is correct, then Open MPI will use mvapi,
not TCP over IB. There is a simple way to check: "ompi_info --param btl
mvapi" will print all the parameters attached to the mvapi driver. If
there is no mvapi in the output, then mvapi was not correctly detected.
But I don't think that is the case, because if I remember correctly we
have a protection at configure time: if you specify one of the drivers and
we are not able to correctly use its libraries, configure will stop.
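From a shell the check would be, for example:
ompi_info --param btl mvapi | grep -i mvapi
(if this prints nothing, the mvapi component was not built into your
installation).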
b) Is it possible to significantly improve these values by changing the
defaults?
By default we use a very conservative approach: we never leave the memory
pinned down, and that decreases the performance of a ping-pong. There are
pros and cons to that, too long to explain here, but in general we see
better performance for real-life applications with our default approach,
and that is our main goal.
Now, if you want to get better performance for the ping-pong test please
read the FAQ at http://www.open-mpi.org/faq/?category=infiniband.
These are the 3 flags that affect the mvapi performance for the ping-pong
case (add them to $(HOME)/.openmpi/mca-params.conf):
btl_mvapi_flags=6
mpi_leave_pinned=1
pml_ob1_leave_pinned_pipeline=1
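You can also pass the same settings on the mpirun command line for a single
run, for example (./pingpong here is just a placeholder for your own
benchmark binary):
mpirun -np 2 --mca btl_mvapi_flags 6 --mca mpi_leave_pinned 1 \
       --mca pml_ob1_leave_pinned_pipeline 1 ./pingpong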
I have used several mca btl parameters but without improving the maximum
bandwidth.
For example : --mca btl mvapi --mca btl_mvapi_max_send_size 8388608
It is difficult to improve the maximum bandwidth without leave_pinned
activated, but you can improve the bandwidth for medium-size messages.
Play with btl_mvapi_eager_limit to set the cutoff between the short and
rendezvous protocols. "ompi_info --param btl mvapi" will give you the full
list of parameters as well as their descriptions.
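For example, to try a 16 KB eager limit (the value is in bytes, and
./pingpong is again just a placeholder):
mpirun -np 2 --mca btl mvapi --mca btl_mvapi_eager_limit 16384 ./pingpong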
c) Is it possible that other IB hardware implementations have better
performance with Open MPI?
The maximum bandwidth depends on several factors. One of the most
important is the maximum bandwidth of your node's bus. To reach 800 MB/s
and more you definitely need a PCI-X 16 ...
d) Is it possible to use specific IB drivers for optimal performance?
(should reach almost 800 MB/sec)
Once the 3 options are set, you should see an improvement in the
bandwidth. Let me know if that does not solve your problem.
george.
"We must accept finite disappointment, but we must never lose infinite
hope."
Martin Luther King