Add --mca btl self,vader

-Nathan
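For reference, a minimal sketch of the full command the thread converges on, assuming a single-node, two-process run; ./hello_world is a hypothetical stand-in for the actual test binary, which is not named in the thread:

    # Skip pml/pami (which expects an active InfiniBand port) and restrict the
    # BTLs to self plus vader (shared memory) so on-node ranks can reach each other.
    mpirun -np 2 --mca pml ^pami --mca btl self,vader ./hello_world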
On May 19, 2017, at 1:23 AM, Gabriele Fatigati <g.fatig...@cineca.it> wrote:

Oh no, by using two procs:

findActiveDevices Error
We found no active IB device ports
findActiveDevices Error
We found no active IB device ports
--------------------------------------------------------------------------
At least one pair of MPI processes are unable to reach each other for
MPI communications.  This means that no Open MPI device has indicated
that it can be used to communicate between these processes.  This is
an error; Open MPI requires that all MPI processes be able to reach
each other.  This error can sometimes be the result of forgetting to
specify the "self" BTL.

  Process 1 ([[12380,1],0]) is on host: openpower
  Process 2 ([[12380,1],1]) is on host: openpower
  BTLs attempted: self

Your MPI job is now going to abort; sorry.
--------------------------------------------------------------------------
*** An error occurred in MPI_Init
*** on a NULL communicator
*** MPI_ERRORS_ARE_FATAL (processes in this communicator will now abort,
***    and potentially your MPI job)
*** An error occurred in MPI_Init
*** on a NULL communicator
*** MPI_ERRORS_ARE_FATAL (processes in this communicator will now abort,
***    and potentially your MPI job)
--------------------------------------------------------------------------
MPI_INIT has failed because at least one MPI process is unreachable
from another.  This *usually* means that an underlying communication
plugin -- such as a BTL or an MTL -- has either not loaded or not
allowed itself to be used.  Your MPI job will now abort.

You may wish to try to narrow down the problem;

* Check the output of ompi_info to see which BTL/MTL plugins are
  available.
* Run your application with MPI_THREAD_SINGLE.
* Set the MCA parameter btl_base_verbose to 100 (or mtl_base_verbose,
  if using MTL-based communications) to see exactly which
  communication plugins were considered and/or discarded.
--------------------------------------------------------------------------
[openpower:88867] 1 more process has sent help message help-mca-bml-r2.txt / unreachable proc
[openpower:88867] Set MCA parameter "orte_base_help_aggregate" to 0 to see all help / error messages
[openpower:88867] 1 more process has sent help message help-mpi-runtime.txt / mpi_init:startup:pml-add-procs-fail


2017-05-19 9:22 GMT+02:00 Gabriele Fatigati <g.fatig...@cineca.it>:

Hi Gilles,

using your command with one MPI process I get:

findActiveDevices Error
We found no active IB device ports
Hello world from rank 0 out of 1 processors

So it seems to work, apart from the error message.


2017-05-19 9:10 GMT+02:00 Gilles Gouaillardet <gil...@rist.or.jp>:

Gabriele,

so it seems pml/pami assumes there is an InfiniBand card available (!)
I guess the IBM folks will comment on that shortly.

Meanwhile, you do not need pami since you are running on a single node, so

    mpirun --mca pml ^pami ...

should do the trick.

If it does not work, can you run and post the logs of

    mpirun --mca pml ^pami --mca pml_base_verbose 100 ...

Cheers,

Gilles


On 5/19/2017 4:01 PM, Gabriele Fatigati wrote:

Hi John,

InfiniBand is not used; there is a single node on this machine.
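A quick way to check that a shared-memory path is available for a single-node run (a hedged sketch; the exact component list depends on how Spectrum MPI was built):

    # For on-node traffic the build should provide "self" plus a
    # shared-memory BTL such as "vader" (or "sm").
    ompi_info | grep "MCA btl"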
2017-05-19 8:50 GMT+02:00 John Hearns via users <users@lists.open-mpi.org>:

Gabriele, please run 'ibv_devinfo'.

It looks to me like you may have the physical interface cards in these systems,
but you do not have the correct drivers or libraries loaded.

I have had similar messages when using InfiniBand on x86 systems which did not
have libibverbs installed.


On 19 May 2017 at 08:41, Gabriele Fatigati <g.fatig...@cineca.it> wrote:

Hi Gilles, using your command:

[openpower:88536] mca: base: components_register: registering framework pml components
[openpower:88536] mca: base: components_register: found loaded component pami
[openpower:88536] mca: base: components_register: component pami register function successful
[openpower:88536] mca: base: components_open: opening pml components
[openpower:88536] mca: base: components_open: found loaded component pami
[openpower:88536] mca: base: components_open: component pami open function successful
[openpower:88536] select: initializing pml component pami
findActiveDevices Error
We found no active IB device ports
[openpower:88536] select: init returned failure for component pami
[openpower:88536] PML pami cannot be selected
--------------------------------------------------------------------------
No components were able to be opened in the pml framework.

This typically means that either no components of this type were
installed, or none of the installed componnets can be loaded.
Sometimes this means that shared libraries required by these
components are unable to be found/loaded.

  Host:      openpower
  Framework: pml
--------------------------------------------------------------------------


2017-05-19 7:03 GMT+02:00 Gilles Gouaillardet <gil...@rist.or.jp>:

Gabriele,

pml/pami is here, at least according to ompi_info.

Can you update your mpirun command like this

    mpirun --mca pml_base_verbose 100 ..

and post the output?

Cheers,

Gilles


On 5/18/2017 10:41 PM, Gabriele Fatigati wrote:

Hi Gilles, attached the requested info.


2017-05-18 15:04 GMT+02:00 Gilles Gouaillardet <gilles.gouaillar...@gmail.com>:

Gabriele,

can you run

    ompi_info --all | grep pml

Also, make sure there is nothing in your environment pointing to another
Open MPI install; for example

    ldd a.out

should only point to IBM libraries.

Cheers,

Gilles
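A hedged sketch of the checks Gilles asks for, assuming the test program was built as ./a.out (the actual binary name is not stated in the thread):

    # Verify that mpirun and the linked MPI libraries both come from the
    # IBM Spectrum MPI install rather than another Open MPI found in the PATH.
    which mpirun
    ompi_info --all | grep pml
    ldd ./a.out     # should resolve only to the IBM Spectrum MPI libraries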
On Thursday, May 18, 2017, Gabriele Fatigati <g.fatig...@cineca.it> wrote:

Dear Open MPI users and developers, I'm using IBM Spectrum MPI 10.1.0, which is
based on Open MPI, so I hope some MPI experts here can help me solve the problem.

When I run a simple Hello World MPI program, I get the following error message:

A requested component was not found, or was unable to be opened.  This
means that this component is either not installed or is unable to be
used on your system (e.g., sometimes this means that shared libraries
that the component requires are unable to be found/loaded).  Note that
Open MPI stopped checking at the first component that it did not find.

  Host:      openpower
  Framework: pml
  Component: pami
--------------------------------------------------------------------------
--------------------------------------------------------------------------
It looks like MPI_INIT failed for some reason; your parallel process is
likely to abort.  There are many reasons that a parallel process can
fail during MPI_INIT; some of which are due to configuration or environment
problems.  This failure appears to be an internal failure; here's some
additional information (which may only be relevant to an Open MPI
developer):

  mca_pml_base_open() failed
  --> Returned "Not found" (-13) instead of "Success" (0)
--------------------------------------------------------------------------
*** An error occurred in MPI_Init
*** on a NULL communicator
*** MPI_ERRORS_ARE_FATAL (processes in this communicator will now abort,
***    and potentially your MPI job)

My sysadmin used the official IBM Spectrum packages to install MPI, so it's
quite strange that some components (pami) are missing. Any help? Thanks


--
Ing. Gabriele Fatigati

HPC specialist

SuperComputing Applications and Innovation Department

Via Magnanelli 6/3, Casalecchio di Reno (BO) Italy

www.cineca.it    Tel: +39 051 6171722

g.fatigati [AT] cineca.it
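A hedged sketch of how one might confirm whether the pami component is actually installed on this node; the directory layout is an assumption, since Spectrum MPI's install tree may differ from a stock Open MPI prefix:

    # Is a pami pml component registered at all?
    ompi_info | grep -i pami
    # Look for the component shared object near the mpirun being used.
    find "$(dirname "$(which mpirun)")/.." -name 'mca_pml_pami*' 2>/dev/null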
_______________________________________________
users mailing list
users@lists.open-mpi.org
https://rfd.newmexicoconsortium.org/mailman/listinfo/users