BTLs attempted: self

That should only allow a single process to communicate with itself.
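
For two ranks on the same node you also need a shared-memory BTL to be allowed, not just "self". A possible command to try, as a sketch (it assumes the vader shared-memory BTL is built into this Spectrum MPI install; "ompi_info | grep btl" will show which BTLs are actually available):

mpirun --mca pml ^pami --mca btl self,vader -np 2 ./a.out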




On 19 May 2017 at 09:23, Gabriele Fatigati <g.fatig...@cineca.it> wrote:

> Oh no, with two procs I get:
>
>
> findActiveDevices Error
> We found no active IB device ports
> findActiveDevices Error
> We found no active IB device ports
> --------------------------------------------------------------------------
> At least one pair of MPI processes are unable to reach each other for
> MPI communications.  This means that no Open MPI device has indicated
> that it can be used to communicate between these processes.  This is
> an error; Open MPI requires that all MPI processes be able to reach
> each other.  This error can sometimes be the result of forgetting to
> specify the "self" BTL.
>
>   Process 1 ([[12380,1],0]) is on host: openpower
>   Process 2 ([[12380,1],1]) is on host: openpower
>   BTLs attempted: self
>
> Your MPI job is now going to abort; sorry.
> --------------------------------------------------------------------------
> *** An error occurred in MPI_Init
> *** on a NULL communicator
> *** MPI_ERRORS_ARE_FATAL (processes in this communicator will now abort,
> ***    and potentially your MPI job)
> *** An error occurred in MPI_Init
> *** on a NULL communicator
> *** MPI_ERRORS_ARE_FATAL (processes in this communicator will now abort,
> ***    and potentially your MPI job)
> --------------------------------------------------------------------------
> MPI_INIT has failed because at least one MPI process is unreachable
> from another.  This *usually* means that an underlying communication
> plugin -- such as a BTL or an MTL -- has either not loaded or not
> allowed itself to be used.  Your MPI job will now abort.
>
> You may wish to try to narrow down the problem;
>  * Check the output of ompi_info to see which BTL/MTL plugins are
>    available.
>  * Run your application with MPI_THREAD_SINGLE.
>  * Set the MCA parameter btl_base_verbose to 100 (or mtl_base_verbose,
>    if using MTL-based communications) to see exactly which
>    communication plugins were considered and/or discarded.
> --------------------------------------------------------------------------
> [openpower:88867] 1 more process has sent help message help-mca-bml-r2.txt / unreachable proc
> [openpower:88867] Set MCA parameter "orte_base_help_aggregate" to 0 to see all help / error messages
> [openpower:88867] 1 more process has sent help message help-mpi-runtime.txt / mpi_init:startup:pml-add-procs-fail
>
>
>
>
>
> 2017-05-19 9:22 GMT+02:00 Gabriele Fatigati <g.fatig...@cineca.it>:
>
>> Hi Gilles,
>>
>> using your command with one MPI proc I get:
>>
>> findActiveDevices Error
>> We found no active IB device ports
>> Hello world from rank 0  out of 1 processors
>>
>> So it seems to work, apart from the error message.
>>
>>
>> 2017-05-19 9:10 GMT+02:00 Gilles Gouaillardet <gil...@rist.or.jp>:
>>
>>> Gabriele,
>>>
>>>
>>> so it seems pml/pami assumes there is an infiniband card available (!)
>>>
>>> i guess IBM folks will comment on that shortly.
>>>
>>>
>>> meanwhile, you do not need pami since you are running on a single node
>>>
>>> mpirun --mca pml ^pami ...
>>>
>>> should do the trick
>>>
>>> (if it does not work, can you run the following and post the logs)
>>>
>>> mpirun --mca pml ^pami --mca pml_base_verbose 100 ...
>>>
>>>
>>> Cheers,
>>>
>>>
>>> Gilles
>>>
>>>
>>> On 5/19/2017 4:01 PM, Gabriele Fatigati wrote:
>>>
>>>> Hi John,
>>>> Infiniband is not used, there is a single node on this machine.
>>>>
>>>> 2017-05-19 8:50 GMT+02:00 John Hearns via users <users@lists.open-mpi.org>:
>>>>
>>>>     Gabriele, please run 'ibv_devinfo'
>>>>     It looks to me like you may have the physical interface cards in
>>>>     these systems, but you do not have the correct drivers or
>>>>     libraries loaded.
>>>>
>>>>     I have had similar messages when using Infiniband on x86 systems -
>>>>     which did not have libibverbs installed.
>>>>
>>>>
>>>>     On 19 May 2017 at 08:41, Gabriele Fatigati <g.fatig...@cineca.it> wrote:
>>>>
>>>>         Hi Gilles, using your command:
>>>>
>>>>         [openpower:88536] mca: base: components_register: registering
>>>>         framework pml components
>>>>         [openpower:88536] mca: base: components_register: found loaded
>>>>         component pami
>>>>         [openpower:88536] mca: base: components_register: component
>>>>         pami register function successful
>>>>         [openpower:88536] mca: base: components_open: opening pml
>>>>         components
>>>>         [openpower:88536] mca: base: components_open: found loaded
>>>>         component pami
>>>>         [openpower:88536] mca: base: components_open: component pami
>>>>         open function successful
>>>>         [openpower:88536] select: initializing pml component pami
>>>>         findActiveDevices Error
>>>>         We found no active IB device ports
>>>>         [openpower:88536] select: init returned failure for component pami
>>>>         [openpower:88536] PML pami cannot be selected
>>>>         --------------------------------------------------------------------------
>>>>         No components were able to be opened in the pml framework.
>>>>
>>>>         This typically means that either no components of this type were
>>>>         installed, or none of the installed componnets can be loaded.
>>>>         Sometimes this means that shared libraries required by these
>>>>         components are unable to be found/loaded.
>>>>
>>>>           Host:      openpower
>>>>           Framework: pml
>>>>         --------------------------------------------------------------------------
>>>>
>>>>
>>>>         2017-05-19 7:03 GMT+02:00 Gilles Gouaillardet <gil...@rist.or.jp>:
>>>>
>>>>             Gabriele,
>>>>
>>>>
>>>>             pml/pami is here, at least according to ompi_info
>>>>
>>>>
>>>>             can you update your mpirun command like this
>>>>
>>>>             mpirun --mca pml_base_verbose 100 ..
>>>>
>>>>
>>>>             and post the output ?
>>>>
>>>>
>>>>             Cheers,
>>>>
>>>>             Gilles
>>>>
>>>>             On 5/18/2017 10:41 PM, Gabriele Fatigati wrote:
>>>>
>>>>                 Hi Gilles, attached the requested info
>>>>
>>>>                 2017-05-18 15:04 GMT+02:00 Gilles Gouaillardet
>>>>                 <gilles.gouaillar...@gmail.com>:
>>>>
>>>>                     Gabriele,
>>>>
>>>>                     can you
>>>>                     ompi_info --all | grep pml
>>>>
>>>>                     also, make sure there is nothing in your
>>>>                 environment pointing to
>>>>                     an other Open MPI install
>>>>                     for example
>>>>                     ldd a.out
>>>>                     should only point to IBM libraries
>>>>
>>>>                     Cheers,
>>>>
>>>>                     Gilles
>>>>
>>>>
>>>>                     On Thursday, May 18, 2017, Gabriele Fatigati
>>>>                 <g.fatig...@cineca.it> wrote:
>>>>
>>>>                         Dear OpenMPI users and developers, I'm using
>>>>                 IBM Spectrum MPI
>>>>                         10.1.0 based on OpenMPI, so I hope some MPI
>>>>                 expert can help me solve the problem.
>>>>
>>>>                         When I run a simple Hello World MPI program, I
>>>>                 get the follow
>>>>                         error message:
>>>>
>>>>
>>>>                         A requested component was not found, or was
>>>>                 unable to be
>>>>                         opened.  This
>>>>                         means that this component is either not
>>>>                 installed or is unable
>>>>                         to be
>>>>                         used on your system (e.g., sometimes this
>>>>                 means that shared
>>>>                         libraries
>>>>                         that the component requires are unable to be
>>>>                 found/loaded).  Note that
>>>>                         Open MPI stopped checking at the first
>>>>                 component that it did
>>>>                         not find.
>>>>
>>>>                         Host:      openpower
>>>>                         Framework: pml
>>>>                         Component: pami
>>>>                 --------------------------------------------------------------------------
>>>>                 --------------------------------------------------------------------------
>>>>                         It looks like MPI_INIT failed for some reason;
>>>>                 your parallel
>>>>                         process is
>>>>                         likely to abort. There are many reasons that a
>>>>                 parallel
>>>>                         process can
>>>>                         fail during MPI_INIT; some of which are due to
>>>>                 configuration
>>>>                         or environment
>>>>                         problems.  This failure appears to be an
>>>>                 internal failure;
>>>>                         here's some
>>>>                         additional information (which may only be
>>>>                 relevant to an Open MPI
>>>>                         developer):
>>>>
>>>>                         mca_pml_base_open() failed
>>>>                           --> Returned "Not found" (-13) instead of
>>>>                 "Success" (0)
>>>>                 --------------------------------------------------------------------------
>>>>                         *** An error occurred in MPI_Init
>>>>                         *** on a NULL communicator
>>>>                         *** MPI_ERRORS_ARE_FATAL (processes in this
>>>>                 communicator will
>>>>                         now abort,
>>>>                         ***    and potentially your MPI job)
>>>>
>>>>                         My sysadmin used official IBM Spectrum
>>>>                 packages to install
>>>>                         MPI, so it's quite strange that there are some
>>>>                 components
>>>>                         missing (pami). Any help? Thanks
>>>>
>>>>
>>>>                         --
>>>>                         Ing. Gabriele Fatigati
>>>>
>>>>                         HPC specialist
>>>>
>>>>                         SuperComputing Applications and Innovation Department
>>>>
>>>>                         Via Magnanelli 6/3, Casalecchio di Reno (BO) Italy
>>>>
>>>>                         www.cineca.it    Tel: +39 051 6171722
>>>>
>>>>                         g.fatigati [AT] cineca.it
>>>>
>>>>
>>>>                     _______________________________________________
>>>>                     users mailing list
>>>>                 users@lists.open-mpi.org
>>>>                 https://rfd.newmexicoconsortium.org/mailman/listinfo/users
>>>>
>>>>
>>>>
>>>>
>>>>                 --
>>>>                 Ing. Gabriele Fatigati
>>>>
>>>>                 HPC specialist
>>>>
>>>>                 SuperComputing Applications and Innovation Department
>>>>
>>>>                 Via Magnanelli 6/3, Casalecchio di Reno (BO) Italy
>>>>
>>>>                 www.cineca.it    Tel: +39 051 6171722
>>>>
>>>>                 g.fatigati [AT] cineca.it
>>>>
>>>>
>>>>                 _______________________________________________
>>>>                 users mailing list
>>>>                 users@lists.open-mpi.org
>>>>                 https://rfd.newmexicoconsortium.org/mailman/listinfo/users
>>>>
>>>>
>>>>
>>>>
>>>>
>>>>         --
>>>>         Ing. Gabriele Fatigati
>>>>
>>>>         HPC specialist
>>>>
>>>>         SuperComputing Applications and Innovation Department
>>>>
>>>>         Via Magnanelli 6/3, Casalecchio di Reno (BO) Italy
>>>>
>>>>         www.cineca.it    Tel: +39 051 6171722
>>>>
>>>>         g.fatigati [AT] cineca.it
>>>>
>>>>         _______________________________________________
>>>>         users mailing list
>>>>         users@lists.open-mpi.org
>>>>         https://rfd.newmexicoconsortium.org/mailman/listinfo/users
>>>>
>>>>
>>>>
>>>>     _______________________________________________
>>>>     users mailing list
>>>>     users@lists.open-mpi.org
>>>>     https://rfd.newmexicoconsortium.org/mailman/listinfo/users
>>>>
>>>>
>>>>
>>>>
>>>> --
>>>> Ing. Gabriele Fatigati
>>>>
>>>> HPC specialist
>>>>
>>>> SuperComputing Applications and Innovation Department
>>>>
>>>> Via Magnanelli 6/3, Casalecchio di Reno (BO) Italy
>>>>
>>>> www.cineca.it    Tel: +39 051 6171722
>>>>
>>>> g.fatigati [AT] cineca.it
>>>>
>>>>
>>>> _______________________________________________
>>>> users mailing list
>>>> users@lists.open-mpi.org
>>>> https://rfd.newmexicoconsortium.org/mailman/listinfo/users
>>>>
>>>
>>> _______________________________________________
>>> users mailing list
>>> users@lists.open-mpi.org
>>> https://rfd.newmexicoconsortium.org/mailman/listinfo/users
>>>
>>
>>
>>
>> --
>> Ing. Gabriele Fatigati
>>
>> HPC specialist
>>
>> SuperComputing Applications and Innovation Department
>>
>> Via Magnanelli 6/3, Casalecchio di Reno (BO) Italy
>>
>> www.cineca.it                    Tel: +39 051 6171722
>>
>> g.fatigati [AT] cineca.it
>>
>
>
>
> --
> Ing. Gabriele Fatigati
>
> HPC specialist
>
> SuperComputing Applications and Innovation Department
>
> Via Magnanelli 6/3, Casalecchio di Reno (BO) Italy
>
> www.cineca.it                    Tel: +39 051 6171722
>
> g.fatigati [AT] cineca.it
>
> _______________________________________________
> users mailing list
> users@lists.open-mpi.org
> https://rfd.newmexicoconsortium.org/mailman/listinfo/users
>
_______________________________________________
users mailing list
users@lists.open-mpi.org
https://rfd.newmexicoconsortium.org/mailman/listinfo/users
