Per the error message, can you try

mpirun --mca btl_openib_if_include cxgb3_0 --mca btl_openib_max_send_size 65536 ...

and see whether that helps?
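A complementary approach, going by the two parameters the warning itself names, would be to leave the send size alone and instead enlarge the largest receive queue so it can hold the 131072-byte maximum send fragment. This is only a sketch — the exact receive_queues value below is an assumption, not a tested setting for the T3; note that iWARP adapters can only use per-peer (P) queues, not shared (S) ones:

```
mpirun --mca btl_openib_if_include cxgb3_0 \
       --mca btl_openib_receive_queues P,128,256,192,128:P,131072,64,32,16 ...
```

Either direction (shrinking btl_openib_max_send_size or growing the largest receive buffer) should satisfy the size check in the warning.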
You can also try various settings for the receive queues; for example,
edit your /.../share/openmpi/mca-btl-openib-device-params.ini and set
the parameters for your specific hardware.
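For reference, that file consists of per-device stanzas keyed on PCI vendor/part IDs. The sketch below only illustrates the shape of such a stanza — the section name, vendor_part_id, and queue values are assumptions; copy the real IDs and defaults from the existing Chelsio entry in your own copy of the file before changing anything:

```
# Illustrative sketch only -- check the existing Chelsio section in your
# mca-btl-openib-device-params.ini; the part ID and sizes here are assumptions.
[Chelsio T3]
vendor_id = 0x1425
vendor_part_id = 0x0030
use_eager_rdma = 1
mtu = 2048
# iWARP has no shared receive queues, so list per-peer (P) queues only,
# and make the largest one at least as big as btl_openib_max_send_size:
receive_queues = P,128,256,192,128:P,65536,64,32,16
```

The receive_queues line set here is the same parameter the warning complains about, just applied per-device instead of on the mpirun command line.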
Cheers,
Gilles
On 3/8/2016 2:55 PM, dpchoudh . wrote:
Hello all
I am asking for help for the following situation:
I have two (mostly identical) nodes. Each of them has (completely
identical):
1. QLogic 4x DDR InfiniBand cards, and
2. Chelsio S310E (T3-chip based) 10GbE iWARP cards.
Both are connected back-to-back, without a switch. The connection is
physically OK and IP traffic can flow on both of them without issues.
The issue is, I can run MPI programs using the openib BTL using the
qlogic card, but not the Chelsio card. Here are the commands:
[durga@smallMPI ~]$ ibv_devices
device node GUID
------ ----------------
cxgb3_0 00074306cd3b0000 <-- Chelsio
qib0 0011750000ff831d <-- Qlogic
The following command works:
mpirun -np 2 --hostfile ~/hostfile -mca btl_openib_if_include qib0
./osu_acc_latency
And the following do not work:
mpirun -np 2 --hostfile ~/hostfile -mca btl_openib_if_include cxgb3_0
./osu_acc_latency
mpirun -np 2 --hostfile ~/hostfile -mca pml ob1 -mca
btl_openib_if_include cxgb3_0 ./osu_acc_latency
mpirun -np 2 --hostfile ~/hostfile -mca pml ^cm -mca
btl_openib_if_include cxgb3_0 ./osu_acc_latency
The error I get is the following (in all of the non-working cases):
WARNING: The largest queue pair buffer size specified in the
btl_openib_receive_queues MCA parameter is smaller than the maximum
send size (i.e., the btl_openib_max_send_size MCA parameter), meaning
that no queue is large enough to receive the largest possible incoming
message fragment. The OpenFabrics (openib) BTL will therefore be
deactivated for this run.
Local host: smallMPI
Largest buffer size: 65536
Maximum send fragment size: 131072
--------------------------------------------------------------------------
--------------------------------------------------------------------------
No OpenFabrics connection schemes reported that they were able to be
used on a specific port. As such, the openib BTL (OpenFabrics
support) will be disabled for this port.
Local host: bigMPI
Local device: cxgb3_0
Local port: 1
CPCs attempted: udcm
--------------------------------------------------------------------------
I have a vague understanding of what the message is trying to say, but
I do not know which file or configuration parameters to change to fix
the situation.
Thanks in advance
Durga
Life is complex. It has real and imaginary parts.
_______________________________________________
users mailing list
us...@open-mpi.org
Subscription: http://www.open-mpi.org/mailman/listinfo.cgi/users
Link to this post:
http://www.open-mpi.org/community/lists/users/2016/03/28657.php