On 07/17/2017 01:06 PM, Gus Correa wrote:
Hi Boris

The nodes may have standard Gigabit Ethernet interfaces,
besides the Infiniband (RoCE).
You may want to direct OpenMPI to use the Infiniband interfaces,
not Gigabit Ethernet,
by adding something like this to "--mca btl self,vader,tcp":

"--mca btl_tcp_if_include ib0,ib1"

(Where the interface names ib0,ib1 are just my guess for
what your nodes may have. Check with your "root" system administrator!)

That syntax may also use IP addresses or a subnet mask,
whichever is simpler for you.
It is better explained in this FAQ:

https://www.open-mpi.org/faq/?category=all#tcp-selection
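
For example, a minimal sketch putting the two options together (the
interface names ib0,ib1 are still just a guess, the subnet below is made
up, and the hostfile/program names are reused from your earlier command,
so substitute whatever your nodes actually have):

mpiexec --mca btl self,vader,tcp \
        --mca btl_tcp_if_include ib0,ib1 \
        --hostfile hostfile5 -n 200 DoWork

or, selecting by subnet instead of interface name:

mpiexec --mca btl self,vader,tcp \
        --mca btl_tcp_if_include 192.168.1.0/24 \
        --hostfile hostfile5 -n 200 DoWork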

BTW, some of your questions (and others that you may hit later)
are covered in the OpenMPI FAQ:

https://www.open-mpi.org/faq/?category=all

I hope this helps,
Gus Correa


On 07/17/2017 12:43 PM, Boris M. Vulovic wrote:
Gus, Gilles, Russell, John:

Thanks very much for the replies and the help.
I got confirmation from the "root" that it is indeed RoCE with 100G.

I'll go over the info in the link Russell provided, but I have a quick question: if I run *mpiexec* with "*-mca btl tcp,self*", do I get the benefit of *RoCE* (the fastest speed)?

I'll go over the details of all the replies and post useful feedback.

Thanks very much all!

Best,

--Boris




On Mon, Jul 17, 2017 at 6:31 AM, Russell Dekema <deke...@umich.edu> wrote:

    It looks like you have two dual-port Mellanox VPI cards in this
    machine. These cards can be set to run InfiniBand or Ethernet on a
    port-by-port basis, and all four of your ports are set to Ethernet
    mode. Two of your ports have active 100 gigabit Ethernet links, and
    the other two have no link up at all.

    With no InfiniBand links on the machine, you will, of course, not be
    able to run your OpenMPI job over InfiniBand.

    If your machines and network are set up for it, you might be able to
    run your job over RoCE (RDMA Over Converged Ethernet) using one or
    both of those 100 GbE links. I have never used RoCE myself, but one
    starting point for gathering more information on it might be the
    following section of the OpenMPI FAQ:

    https://www.open-mpi.org/faq/?category=openfabrics#ompi-over-roce
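
    For illustration, a rough sketch of the kind of command that FAQ
    section describes (the MCA parameter names and the use of the rdmacm
    connection manager are my reading of the FAQ rather than something I
    have tested, so verify them against the page above; the hostfile and
    program name are reused from the earlier mpiexec command):

    mpiexec --mca btl openib,vader,self \
            --mca btl_openib_cpc_include rdmacm \
            --hostfile hostfile5 -n 200 DoWork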

    Sincerely,
    Rusty Dekema
    University of Michigan
    Advanced Research Computing - Technology Services


    On Fri, Jul 14, 2017 at 12:34 PM, Boris M. Vulovic
    <boris.m.vulo...@gmail.com> wrote:
     > Gus, Gilles and John,
     >
     > Thanks for the help. Let me first post (below) the output from
     > checkouts of the IB network:
     > ibdiagnet
     > ibhosts
     > ibstat  (for login node, for now)
     >
     > What do you think?
     > Thanks
     > --Boris
     >
     >
     >
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
     >
     > -bash-4.1$ ibdiagnet
     > ----------
     > Load Plugins from:
     > /usr/share/ibdiagnet2.1.1/plugins/
     > (You can specify more paths to be looked in with
     > "IBDIAGNET_PLUGINS_PATH" env variable)
     >
     > Plugin Name                                   Result     Comment
     > libibdiagnet_cable_diag_plugin-2.1.1          Succeeded  Plugin loaded
     > libibdiagnet_phy_diag_plugin-2.1.1            Succeeded  Plugin loaded
     >
     > ---------------------------------------------
     > Discovery
     > -E- Failed to initialize
     >
     > -E- Fabric Discover failed, err=IBDiag initialize wasn't done
     > -E- Fabric Discover failed, MAD err=Failed to register SMI class
     >
     > ---------------------------------------------
     > Summary
     > -I- Stage                     Warnings   Errors     Comment
     > -I- Discovery                                       NA
     > -I- Lids Check                                      NA
     > -I- Links Check                                     NA
     > -I- Subnet Manager                                  NA
     > -I- Port Counters                                   NA
     > -I- Nodes Information                               NA
     > -I- Speed / Width checks                            NA
     > -I- Partition Keys                                  NA
     > -I- Alias GUIDs                                     NA
     > -I- Temperature Sensing                             NA
     >
     > -I- You can find detailed errors/warnings in:
     > /var/tmp/ibdiagnet2/ibdiagnet2.log
     >
     > -E- A fatal error occurred, exiting...
     > -bash-4.1$
     >
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
     >
     > -bash-4.1$ ibhosts
     > ibwarn: [168221] mad_rpc_open_port: client_register for mgmt 1 failed
     > src/ibnetdisc.c:766; can't open MAD port ((null):0)
     > /usr/sbin/ibnetdiscover: iberror: failed: discover failed
     > -bash-4.1$
     >
     >
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
     > -bash-4.1$ ibstat
     > CA 'mlx5_0'
     >         CA type: MT4115
     >         Number of ports: 1
     >         Firmware version: 12.17.2020
     >         Hardware version: 0
     >         Node GUID: 0x248a0703005abb1c
     >         System image GUID: 0x248a0703005abb1c
     >         Port 1:
     >                 State: Active
     >                 Physical state: LinkUp
     >                 Rate: 100
     >                 Base lid: 0
     >                 LMC: 0
     >                 SM lid: 0
     >                 Capability mask: 0x3c010000
     >                 Port GUID: 0x268a07fffe5abb1c
     >                 Link layer: Ethernet
     > CA 'mlx5_1'
     >         CA type: MT4115
     >         Number of ports: 1
     >         Firmware version: 12.17.2020
     >         Hardware version: 0
     >         Node GUID: 0x248a0703005abb1d
     >         System image GUID: 0x248a0703005abb1c
     >         Port 1:
     >                 State: Active
     >                 Physical state: LinkUp
     >                 Rate: 100
     >                 Base lid: 0
     >                 LMC: 0
     >                 SM lid: 0
     >                 Capability mask: 0x3c010000
     >                 Port GUID: 0x0000000000000000
     >                 Link layer: Ethernet
     > CA 'mlx5_2'
     >         CA type: MT4115
     >         Number of ports: 1
     >         Firmware version: 12.17.2020
     >         Hardware version: 0
     >         Node GUID: 0x248a0703005abb30
     >         System image GUID: 0x248a0703005abb30
     >         Port 1:
     >                 State: Down
     >                 Physical state: Disabled
     >                 Rate: 100
     >                 Base lid: 0
     >                 LMC: 0
     >                 SM lid: 0
     >                 Capability mask: 0x3c010000
     >                 Port GUID: 0x268a07fffe5abb30
     >                 Link layer: Ethernet
     > CA 'mlx5_3'
     >         CA type: MT4115
     >         Number of ports: 1
     >         Firmware version: 12.17.2020
     >         Hardware version: 0
     >         Node GUID: 0x248a0703005abb31
     >         System image GUID: 0x248a0703005abb30
     >         Port 1:
     >                 State: Down
     >                 Physical state: Disabled
     >                 Rate: 100
     >                 Base lid: 0
     >                 LMC: 0
     >                 SM lid: 0
     >                 Capability mask: 0x3c010000
     >                 Port GUID: 0x268a07fffe5abb31
     >                 Link layer: Ethernet
     > -bash-4.1$
     >
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
     >
     > On Fri, Jul 14, 2017 at 12:37 AM, John Hearns via users
     > <users@lists.open-mpi.org> wrote:
     >>
     >> Boris, as Gilles says - first do some lower level checkouts of your
     >> Infiniband network.
     >> I suggest running:
     >> ibdiagnet
     >> ibhosts
     >> and then as Gilles says 'ibstat' on each node
     >>
     >>
     >>
     >> On 14 July 2017 at 03:58, Gilles Gouaillardet <gil...@rist.or.jp> wrote:
     >>>
     >>> Boris,
     >>>
     >>>
     >>> Open MPI should automatically detect the infiniband hardware, and use
     >>> openib (and *not* tcp) for inter node communications
     >>>
     >>> and a shared memory optimized btl (e.g. sm or vader) for intra node
     >>> communications.
     >>>
     >>>
     >>> note if you "-mca btl openib,self", you tell Open MPI to use the openib
     >>> btl between any tasks,
     >>>
     >>> including tasks running on the same node (which is less efficient than
     >>> using sm or vader)
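
     For example, a minimal sketch along those lines, reusing the hostfile
     and program name from the original command (the exact BTL list here is
     an assumption; vader could also be sm on older setups):

         mpiexec --mca btl openib,vader,self --hostfile hostfile5 -n 200 DoWork

     so intra-node traffic goes over shared memory and only inter-node
     traffic uses openib.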
     >>>
     >>>
     >>> At first, I suggest you make sure infiniband is up and running on all
     >>> your nodes.
     >>>
     >>> (just run ibstat, at least one port should be listed, state should be
     >>> Active, and all nodes should have the same SM lid)
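
     A quick way to run that check everywhere, as a sketch assuming
     passwordless ssh to the node names used in the original mpiexec
     command:

         for h in node01 node02 node03 node04 node05; do
             echo "== $h =="; ssh "$h" ibstat
         done

     then compare the State, Rate, SM lid, and Link layer fields across
     the nodes.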
     >>>
     >>>
     >>> then try to run two tasks on two nodes.
     >>>
     >>>
     >>> if this does not work, you can
     >>>
     >>> mpirun --mca btl_base_verbose 100 ...
     >>>
     >>> and post the logs so we can investigate from there.
     >>>
     >>>
     >>> Cheers,
     >>>
     >>>
     >>> Gilles
     >>>
     >>>
     >>>
     >>> On 7/14/2017 6:43 AM, Boris M. Vulovic wrote:
     >>>>
     >>>>
     >>>> I would like to know how to invoke InfiniBand hardware on CentOS 6x
     >>>> cluster with OpenMPI (static libs.) for running my C++ code. This is
     >>>> how I compile and run:
     >>>>
     >>>> /usr/local/open-mpi/1.10.7/bin/mpic++ -L/usr/local/open-mpi/1.10.7/lib
     >>>> -Bstatic main.cpp -o DoWork
     >>>>
     >>>> /usr/local/open-mpi/1.10.7/bin/mpiexec -mca btl tcp,self --hostfile
     >>>> hostfile5 -host node01,node02,node03,node04,node05 -n 200 DoWork
     >>>>
     >>>> Here, "*-mca btl tcp,self*" reveals that *TCP* is used, and the cluster
     >>>> has InfiniBand.
     >>>>
     >>>> What should be changed in compiling and running commands for InfiniBand
     >>>> to be invoked? If I just replace "*-mca btl tcp,self*" with "*-mca btl
     >>>> openib,self*" then I get plenty of errors, with the relevant one saying:
     >>>>
     >>>> /At least one pair of MPI processes are unable to reach each other for
     >>>> MPI communications. This means that no Open MPI device has indicated
     >>>> that it can be used to communicate between these processes. This is an
     >>>> error; Open MPI requires that all MPI processes be able to reach each
     >>>> other. This error can sometimes be the result of forgetting to specify
     >>>> the "self" BTL./
     >>>>
     >>>> Thanks very much!!!
     >>>>
     >>>>
     >>>> *Boris *
     >>>>
     >>>>
     >>>>
     >>>>
     >
     >
     >
     >
     > --
     >
     > Boris M. Vulovic
     >
     >
     >




--

*Boris M. Vulovic*







_______________________________________________
users mailing list
users@lists.open-mpi.org
https://rfd.newmexicoconsortium.org/mailman/listinfo/users
