Yeah, this was built with a bunch of stuff you don't want. Are you trying to just run with TCP and not InfiniBand? If so, then you want

mpirun -mca btl tcp,sm,self ....

and the problem should go away.
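For example, assuming a hypothetical job launched as 16 ranks from a hostfile (the application name, host list, and rank count below are just placeholders), any of the following should keep the openib and udapl BTLs from being opened at all:

    # restrict the BTL list to TCP, shared memory, and the loopback transport
    mpirun --mca btl tcp,sm,self -np 16 -hostfile myhosts ./my_app

    # or exclude the unwanted transports instead; note that the "^" prefix
    # negates the entire comma-separated list, which is why "^openib" by itself
    # still left the udapl BTL free to probe the uDAPL registry
    mpirun --mca btl ^openib,udapl -np 16 -hostfile myhosts ./my_app

    # or set the parameter once, via the environment or a per-user config file,
    # instead of on every command line
    export OMPI_MCA_btl=tcp,sm,self
    echo "btl = tcp,sm,self" >> $HOME/.openmpi/mca-params.conf

Any of these tells the BTL framework which point-to-point transports it may select, so the udapl component should never try to open the DAT/uDAPL interfaces and the librdmacm warnings should disappear.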
On Jun 30, 2014, at 11:06 AM, Jeffrey A Cummings <jeffrey.a.cummi...@aero.org> wrote:

> Output from ompi_info:
>
>
> Package: Open MPI root@centos-6-5-x86-64-1.localdomain Distribution
> Open MPI: 1.6.2
> Open MPI SVN revision: r27344
> Open MPI release date: Sep 18, 2012
> Open RTE: 1.6.2
> Open RTE SVN revision: r27344
> Open RTE release date: Sep 18, 2012
> OPAL: 1.6.2
> OPAL SVN revision: r27344
> OPAL release date: Sep 18, 2012
> MPI API: 2.1
> Ident string: 1.6.2
> Prefix: /opt/openmpi
> Configured architecture: x86_64-unknown-linux-gnu
> Configure host: centos-6-5-x86-64-1.localdomain
> Configured by: root
> Configured on: Mon Apr 14 02:23:27 PDT 2014
> Configure host: centos-6-5-x86-64-1.localdomain
> Built by: root
> Built on: Mon Apr 14 02:33:37 PDT 2014
> Built host: centos-6-5-x86-64-1.localdomain
> C bindings: yes
> C++ bindings: yes
> Fortran77 bindings: yes (all)
> Fortran90 bindings: yes
> Fortran90 bindings size: small
> C compiler: gcc
> C compiler absolute: /usr/bin/gcc
> C compiler family name: GNU
> C compiler version: 4.4.7
> C++ compiler: g++
> C++ compiler absolute: /usr/bin/g++
> Fortran77 compiler: gfortran
> Fortran77 compiler abs: /usr/bin/gfortran
> Fortran90 compiler: gfortran
> Fortran90 compiler abs: /usr/bin/gfortran
> C profiling: yes
> C++ profiling: yes
> Fortran77 profiling: yes
> Fortran90 profiling: yes
> C++ exceptions: no
> Thread support: posix (MPI_THREAD_MULTIPLE: no, progress: no)
> Sparse Groups: no
> Internal debug support: no
> MPI interface warnings: no
> MPI parameter check: runtime
> Memory profiling support: no
> Memory debugging support: no
> libltdl support: yes
> Heterogeneous support: no
> mpirun default --prefix: no
> MPI I/O support: yes
> MPI_WTIME support: gettimeofday
> Symbol vis. support: yes
> Host topology support: yes
> MPI extensions: affinity example
> FT Checkpoint support: no (checkpoint thread: no)
> VampirTrace support: yes
> MPI_MAX_PROCESSOR_NAME: 256
> MPI_MAX_ERROR_STRING: 256
> MPI_MAX_OBJECT_NAME: 64
> MPI_MAX_INFO_KEY: 36
> MPI_MAX_INFO_VAL: 256
> MPI_MAX_PORT_NAME: 1024
> MPI_MAX_DATAREP_STRING: 128
> MCA backtrace: execinfo (MCA v2.0, API v2.0, Component v1.6.2)
> MCA memory: linux (MCA v2.0, API v2.0, Component v1.6.2)
> MCA paffinity: hwloc (MCA v2.0, API v2.0, Component v1.6.2)
> MCA carto: auto_detect (MCA v2.0, API v2.0, Component v1.6.2)
> MCA carto: file (MCA v2.0, API v2.0, Component v1.6.2)
> MCA shmem: mmap (MCA v2.0, API v2.0, Component v1.6.2)
> MCA shmem: posix (MCA v2.0, API v2.0, Component v1.6.2)
> MCA shmem: sysv (MCA v2.0, API v2.0, Component v1.6.2)
> MCA maffinity: first_use (MCA v2.0, API v2.0, Component v1.6.2)
> MCA maffinity: hwloc (MCA v2.0, API v2.0, Component v1.6.2)
> MCA timer: linux (MCA v2.0, API v2.0, Component v1.6.2)
> MCA installdirs: env (MCA v2.0, API v2.0, Component v1.6.2)
> MCA installdirs: config (MCA v2.0, API v2.0, Component v1.6.2)
> MCA sysinfo: linux (MCA v2.0, API v2.0, Component v1.6.2)
> MCA hwloc: hwloc132 (MCA v2.0, API v2.0, Component v1.6.2)
> MCA dpm: orte (MCA v2.0, API v2.0, Component v1.6.2)
> MCA pubsub: orte (MCA v2.0, API v2.0, Component v1.6.2)
> MCA allocator: basic (MCA v2.0, API v2.0, Component v1.6.2)
> MCA allocator: bucket (MCA v2.0, API v2.0, Component v1.6.2)
> MCA coll: basic (MCA v2.0, API v2.0, Component v1.6.2)
> MCA coll: hierarch (MCA v2.0, API v2.0, Component v1.6.2)
> MCA coll: inter (MCA v2.0, API v2.0, Component v1.6.2)
> MCA coll: self (MCA v2.0, API v2.0, Component v1.6.2)
> MCA coll: sm (MCA v2.0, API v2.0, Component v1.6.2)
> MCA coll: sync (MCA v2.0, API v2.0, Component v1.6.2)
> MCA coll: tuned (MCA v2.0, API v2.0, Component v1.6.2)
> MCA io: romio (MCA v2.0, API v2.0, Component v1.6.2)
> MCA mpool: fake (MCA v2.0, API v2.0, Component v1.6.2)
> MCA mpool: rdma (MCA v2.0, API v2.0, Component v1.6.2)
> MCA mpool: sm (MCA v2.0, API v2.0, Component v1.6.2)
> MCA pml: bfo (MCA v2.0, API v2.0, Component v1.6.2)
> MCA pml: csum (MCA v2.0, API v2.0, Component v1.6.2)
> MCA pml: ob1 (MCA v2.0, API v2.0, Component v1.6.2)
> MCA pml: v (MCA v2.0, API v2.0, Component v1.6.2)
> MCA bml: r2 (MCA v2.0, API v2.0, Component v1.6.2)
> MCA rcache: vma (MCA v2.0, API v2.0, Component v1.6.2)
> MCA btl: self (MCA v2.0, API v2.0, Component v1.6.2)
> MCA btl: ofud (MCA v2.0, API v2.0, Component v1.6.2)
> MCA btl: openib (MCA v2.0, API v2.0, Component v1.6.2)
> MCA btl: sm (MCA v2.0, API v2.0, Component v1.6.2)
> MCA btl: tcp (MCA v2.0, API v2.0, Component v1.6.2)
> MCA btl: udapl (MCA v2.0, API v2.0, Component v1.6.2)
> MCA topo: unity (MCA v2.0, API v2.0, Component v1.6.2)
> MCA osc: pt2pt (MCA v2.0, API v2.0, Component v1.6.2)
> MCA osc: rdma (MCA v2.0, API v2.0, Component v1.6.2)
> MCA iof: hnp (MCA v2.0, API v2.0, Component v1.6.2)
> MCA iof: orted (MCA v2.0, API v2.0, Component v1.6.2)
> MCA iof: tool (MCA v2.0, API v2.0, Component v1.6.2)
> MCA oob: tcp (MCA v2.0, API v2.0, Component v1.6.2)
> MCA odls: default (MCA v2.0, API v2.0, Component v1.6.2)
> MCA ras: cm (MCA v2.0, API v2.0, Component v1.6.2)
> MCA ras: gridengine (MCA v2.0, API v2.0, Component v1.6.2)
> MCA ras: loadleveler (MCA v2.0, API v2.0, Component v1.6.2)
> MCA ras: slurm (MCA v2.0, API v2.0, Component v1.6.2)
> MCA rmaps: load_balance (MCA v2.0, API v2.0, Component v1.6.2)
> MCA rmaps: rank_file (MCA v2.0, API v2.0, Component v1.6.2)
> MCA rmaps: resilient (MCA v2.0, API v2.0, Component v1.6.2)
> MCA rmaps: round_robin (MCA v2.0, API v2.0, Component v1.6.2)
> MCA rmaps: seq (MCA v2.0, API v2.0, Component v1.6.2)
> MCA rmaps: topo (MCA v2.0, API v2.0, Component v1.6.2)
> MCA rml: oob (MCA v2.0, API v2.0, Component v1.6.2)
> MCA routed: binomial (MCA v2.0, API v2.0, Component v1.6.2)
> MCA routed: cm (MCA v2.0, API v2.0, Component v1.6.2)
> MCA routed: direct (MCA v2.0, API v2.0, Component v1.6.2)
> MCA routed: linear (MCA v2.0, API v2.0, Component v1.6.2)
> MCA routed: radix (MCA v2.0, API v2.0, Component v1.6.2)
> MCA routed: slave (MCA v2.0, API v2.0, Component v1.6.2)
> MCA plm: rsh (MCA v2.0, API v2.0, Component v1.6.2)
> MCA plm: slurm (MCA v2.0, API v2.0, Component v1.6.2)
> MCA filem: rsh (MCA v2.0, API v2.0, Component v1.6.2)
> MCA errmgr: default (MCA v2.0, API v2.0, Component v1.6.2)
> MCA ess: env (MCA v2.0, API v2.0, Component v1.6.2)
> MCA ess: hnp (MCA v2.0, API v2.0, Component v1.6.2)
> MCA ess: singleton (MCA v2.0, API v2.0, Component v1.6.2)
> MCA ess: slave (MCA v2.0, API v2.0, Component v1.6.2)
> MCA ess: slurm (MCA v2.0, API v2.0, Component v1.6.2)
> MCA ess: slurmd (MCA v2.0, API v2.0, Component v1.6.2)
> MCA ess: tool (MCA v2.0, API v2.0, Component v1.6.2)
> MCA grpcomm: bad (MCA v2.0, API v2.0, Component v1.6.2)
> MCA grpcomm: basic (MCA v2.0, API v2.0, Component v1.6.2)
> MCA notifier: command (MCA v2.0, API v1.0, Component v1.6.2)
> MCA notifier: syslog (MCA v2.0, API v1.0, Component v1.6.2)
>
>
> Jeffrey A. Cummings
> Engineering Specialist
> Performance Modeling and Analysis Department
> Systems Analysis and Simulation Subdivision
> Systems Engineering Division
> Engineering and Technology Group
> The Aerospace Corporation
> 571-307-4220
> jeffrey.a.cummi...@aero.org
>
>
>
> From: Ralph Castain <r...@open-mpi.org>
> To: Open MPI Users <us...@open-mpi.org>,
> Date: 06/27/2014 03:42 PM
> Subject: Re: [OMPI users] Problem moving from 1.4 to 1.6
> Sent by: "users" <users-boun...@open-mpi.org>
>
>
>
> Let me steer you on a different course. Can you run "ompi_info" and paste the output here? It looks to me like someone installed a version that includes uDAPL support, so you may have to disable some additional things to get it to run.
>
>
> On Jun 27, 2014, at 9:53 AM, Jeffrey A Cummings <jeffrey.a.cummi...@aero.org> wrote:
>
> We have recently upgraded our cluster to a version of Linux which comes with openMPI version 1.6.2.
>
> An application which ran previously (using some version of 1.4) now errors out with the following messages:
>
> librdmacm: Fatal: no RDMA devices found
> librdmacm: Fatal: no RDMA devices found
> librdmacm: Fatal: no RDMA devices found
>
> --------------------------------------------------------------------------
> WARNING: Failed to open "OpenIB-cma" [DAT_INTERNAL_ERROR:].
> This may be a real error or it may be an invalid entry in the uDAPL
> Registry which is contained in the dat.conf file. Contact your local
> System Administrator to confirm the availability of the interfaces in
> the dat.conf file.
> --------------------------------------------------------------------------
>
> [tupile:25363] 2 more processes have sent help message help-mpi-btl-udapl.txt / dat_ia_open fail
> [tupile:25363] Set MCA parameter "orte_base_help_aggregate" to 0 to see all help / error messages
>
> The mpirun command line contains the argument '--mca btl ^openib', which I thought told mpi to not look for the ib interface.
>
> Can anyone suggest what the problem might be? Did the relevant syntax change between versions 1.4 and 1.6?
>
>
> Jeffrey A. Cummings
> Engineering Specialist
> Performance Modeling and Analysis Department
> Systems Analysis and Simulation Subdivision
> Systems Engineering Division
> Engineering and Technology Group
> The Aerospace Corporation
> 571-307-4220
> jeffrey.a.cummi...@aero.org