Gus, Gilles, Russell, John:
Thanks very much for the replies and the help.
I got confirmation from the "root" that it is indeed RoCE with 100G.
I'll go over the info in the link Russell provided, but I have a quick
question: if I run mpiexec with "-mca btl tcp,self", do I get the benefit of
RoCE (the fastest speed)?
I'll go over the details of all the replies and post useful feedback.
Thanks very much, all!
Best,
--Boris
On Mon, Jul 17, 2017 at 6:31 AM, Russell Dekema <deke...@umich.edu> wrote:
It looks like you have two dual-port Mellanox VPI cards in this
machine. These cards can be set to run InfiniBand or Ethernet on a
port-by-port basis, and all four of your ports are set to Ethernet
mode. Two of your ports have active 100 gigabit Ethernet links, and
the other two have no link up at all.
With no InfiniBand links on the machine, you will, of course, not be
able to run your OpenMPI job over InfiniBand.
If your machines and network are set up for it, you might be able to
run your job over RoCE (RDMA Over Converged Ethernet) using one or
both of those 100 GbE links. I have never used RoCE myself, but one
starting point for gathering more information on it might be the
following section of the OpenMPI FAQ:
https://www.open-mpi.org/faq/?category=openfabrics#ompi-over-roce
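
For what it's worth, the kind of invocation that FAQ section describes looks
roughly like the following. This is untested here; it assumes an Open MPI
build with the openib BTL, the exact MCA parameters may vary by version, and
the hostfile and program names are simply the ones from your original post:

    mpirun --mca btl openib,sm,self \
           --mca btl_openib_cpc_include rdmacm \
           --hostfile hostfile5 -n 200 DoWork

The btl_openib_cpc_include setting selects the RDMA connection manager,
which is what that FAQ section says RoCE needs.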
Sincerely,
Rusty Dekema
University of Michigan
Advanced Research Computing - Technology Services
On Fri, Jul 14, 2017 at 12:34 PM, Boris M. Vulovic
<boris.m.vulo...@gmail.com> wrote:
> Gus, Gilles and John,
>
> Thanks for the help. Let me first post (below) the output from checkouts
> of the IB network:
> ibdiagnet
> ibhosts
> ibstat (for login node, for now)
>
> What do you think?
> Thanks
> --Boris
>
>
>
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
>
> -bash-4.1$ ibdiagnet
> ----------
> Load Plugins from:
> /usr/share/ibdiagnet2.1.1/plugins/
> (You can specify more paths to be looked in with "IBDIAGNET_PLUGINS_PATH"
> env variable)
>
> Plugin Name                            Result     Comment
> libibdiagnet_cable_diag_plugin-2.1.1   Succeeded  Plugin loaded
> libibdiagnet_phy_diag_plugin-2.1.1     Succeeded  Plugin loaded
>
> ---------------------------------------------
> Discovery
> -E- Failed to initialize
>
> -E- Fabric Discover failed, err=IBDiag initialize wasn't done
> -E- Fabric Discover failed, MAD err=Failed to register SMI class
>
> ---------------------------------------------
> Summary
> -I- Stage Warnings Errors Comment
> -I- Discovery NA
> -I- Lids Check NA
> -I- Links Check NA
> -I- Subnet Manager NA
> -I- Port Counters NA
> -I- Nodes Information NA
> -I- Speed / Width checks NA
> -I- Partition Keys NA
> -I- Alias GUIDs NA
> -I- Temperature Sensing NA
>
> -I- You can find detailed errors/warnings in:
> /var/tmp/ibdiagnet2/ibdiagnet2.log
>
> -E- A fatal error occurred, exiting...
> -bash-4.1$
>
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
>
> -bash-4.1$ ibhosts
> ibwarn: [168221] mad_rpc_open_port: client_register for mgmt 1 failed
> src/ibnetdisc.c:766; can't open MAD port ((null):0)
> /usr/sbin/ibnetdiscover: iberror: failed: discover failed
> -bash-4.1$
>
>
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
> -bash-4.1$ ibstat
> CA 'mlx5_0'
> CA type: MT4115
> Number of ports: 1
> Firmware version: 12.17.2020
> Hardware version: 0
> Node GUID: 0x248a0703005abb1c
> System image GUID: 0x248a0703005abb1c
> Port 1:
> State: Active
> Physical state: LinkUp
> Rate: 100
> Base lid: 0
> LMC: 0
> SM lid: 0
> Capability mask: 0x3c010000
> Port GUID: 0x268a07fffe5abb1c
> Link layer: Ethernet
> CA 'mlx5_1'
> CA type: MT4115
> Number of ports: 1
> Firmware version: 12.17.2020
> Hardware version: 0
> Node GUID: 0x248a0703005abb1d
> System image GUID: 0x248a0703005abb1c
> Port 1:
> State: Active
> Physical state: LinkUp
> Rate: 100
> Base lid: 0
> LMC: 0
> SM lid: 0
> Capability mask: 0x3c010000
> Port GUID: 0x0000000000000000
> Link layer: Ethernet
> CA 'mlx5_2'
> CA type: MT4115
> Number of ports: 1
> Firmware version: 12.17.2020
> Hardware version: 0
> Node GUID: 0x248a0703005abb30
> System image GUID: 0x248a0703005abb30
> Port 1:
> State: Down
> Physical state: Disabled
> Rate: 100
> Base lid: 0
> LMC: 0
> SM lid: 0
> Capability mask: 0x3c010000
> Port GUID: 0x268a07fffe5abb30
> Link layer: Ethernet
> CA 'mlx5_3'
> CA type: MT4115
> Number of ports: 1
> Firmware version: 12.17.2020
> Hardware version: 0
> Node GUID: 0x248a0703005abb31
> System image GUID: 0x248a0703005abb30
> Port 1:
> State: Down
> Physical state: Disabled
> Rate: 100
> Base lid: 0
> LMC: 0
> SM lid: 0
> Capability mask: 0x3c010000
> Port GUID: 0x268a07fffe5abb31
> Link layer: Ethernet
> -bash-4.1$
>
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
>
> On Fri, Jul 14, 2017 at 12:37 AM, John Hearns via users
> <users@lists.open-mpi.org> wrote:
>>
>> Boris, as Gilles says - first do some lower-level checkouts of your
>> InfiniBand network.
>> I suggest running:
>> ibdiagnet
>> ibhosts
>> and then, as Gilles says, 'ibstat' on each node (see the sketch below)
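>>
>> A minimal sketch of running ibstat on every node, assuming passwordless
>> ssh and using the hostnames from the original post:
>>
>>     for h in node01 node02 node03 node04 node05; do
>>         echo "== $h =="
>>         ssh "$h" ibstat
>>     done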
>>
>>
>>
>> On 14 July 2017 at 03:58, Gilles Gouaillardet <gil...@rist.or.jp> wrote:
>>>
>>> Boris,
>>>
>>>
>>> Open MPI should automatically detect the InfiniBand hardware, and use
>>> openib (and not tcp) for inter-node communications,
>>>
>>> and a shared-memory-optimized btl (e.g. sm or vader) for intra-node
>>> communications.
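>>>
>>> (as a sketch, you could simply drop the btl selection and let Open MPI
>>> choose, reusing the hostfile and program name from your post:
>>>
>>>     mpiexec --hostfile hostfile5 -n 200 DoWork
>>> )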
>>>
>>>
>>> note that if you use "-mca btl openib,self", you tell Open MPI to use the
>>> openib btl between any tasks, including tasks running on the same node
>>> (which is less efficient than using sm or vader).
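>>>
>>> (a hedged example that keeps openib between nodes but adds a shared-memory
>>> btl for tasks on the same node - vader could be substituted for sm:
>>>
>>>     mpiexec --mca btl openib,sm,self --hostfile hostfile5 -n 200 DoWork
>>> )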
>>>
>>>
>>> at first, I suggest you make sure InfiniBand is up and running on all
>>> your nodes.
>>>
>>> (just run ibstat; at least one port should be listed, its state should be
>>> Active, and all nodes should have the same SM lid)
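>>>
>>> (for instance, a quick way to pull just those fields out of the ibstat
>>> output - the grep pattern here is only a suggestion:
>>>
>>>     ibstat | grep -E 'State|SM lid|Link layer'
>>> )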
>>>
>>>
>>> then try to run two tasks on two nodes.
>>>
>>>
>>> if this does not work, you can
>>>
>>> mpirun --mca btl_base_verbose 100 ...
>>>
>>> and post the logs so we can investigate from there.
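>>>
>>> (for example, for the two-task test above, with the hostnames taken from
>>> your original command line - substitute your own nodes and options:
>>>
>>>     mpirun --mca btl_base_verbose 100 -np 2 -host node01,node02 DoWork
>>> )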
>>>
>>>
>>> Cheers,
>>>
>>>
>>> Gilles
>>>
>>>
>>>
>>> On 7/14/2017 6:43 AM, Boris M. Vulovic wrote:
>>>>
>>>>
>>>> I would like to know how to invoke InfiniBand hardware on a CentOS 6.x
>>>> cluster with OpenMPI (static libs.) for running my C++ code. This is how
>>>> I compile and run:
>>>>
>>>> /usr/local/open-mpi/1.10.7/bin/mpic++ -L/usr/local/open-mpi/1.10.7/lib
>>>> -Bstatic main.cpp -o DoWork
>>>>
>>>> /usr/local/open-mpi/1.10.7/bin/mpiexec -mca btl tcp,self --hostfile
>>>> hostfile5 -host node01,node02,node03,node04,node05 -n 200 DoWork
>>>>
>>>> Here, "-mca btl tcp,self" means that TCP is used, even though the
>>>> cluster has InfiniBand.
>>>>
>>>> What should be changed in the compile and run commands for InfiniBand to
>>>> be invoked? If I just replace "-mca btl tcp,self" with "-mca btl
>>>> openib,self", I get plenty of errors, with the most relevant one saying:
>>>>
>>>> At least one pair of MPI processes are unable to reach each other for
>>>> MPI communications. This means that no Open MPI device has indicated that
>>>> it can be used to communicate between these processes. This is an error;
>>>> Open MPI requires that all MPI processes be able to reach each other. This
>>>> error can sometimes be the result of forgetting to specify the "self" BTL.
>>>>
>>>> Thanks very much!!!
>>>>
>>>>
>>>> Boris
>>>>
>>>>
>>>>
>>>>
>
>
>
>
> --
>
> Boris M. Vulovic
>
>
>
--
Boris M. Vulovic
_______________________________________________
users mailing list
users@lists.open-mpi.org
https://rfd.newmexicoconsortium.org/mailman/listinfo/users