Try configuring --without-psm. That should solve the problem. We are probably picking up that you have PSM libraries on the machine, but it looks like you aren't actually using them.
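Something along these lines, assuming you rebuild from a source tarball (the prefix below just mirrors the install path in your mail, and the usual Open MPI "make all install" step is assumed):

  ./configure --prefix=/usr/mpi/intel/openmpi-1.4.3 --without-psm
  make all install

With PSM compiled out, the psm MTL shouldn't show up in ompi_info at all, so there is nothing left to fail at startup.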
And yes - it should gracefully disable itself. You might check the 1.6 series to see if it behaves better - if not, we should fix it.

On Nov 2, 2012, at 8:49 AM, "Blosch, Edwin L" <edwin.l.blo...@lmco.com> wrote:

> I am getting a problem where something called "PSM" is failing to start and
> that in turn is preventing my job from running. Command and output are
> below. I would like to understand what's going on. Apparently this version
> of OpenMPI decided to build itself with support for PSM, but if it's not
> available, why fail if other transports are available? Also, in my command I
> think I've told OpenMPI not to use anything but self and sm, so why would it
> try to use PSM?
>
> Thanks in advance for any help...
>
> user@machinename:~> /usr/mpi/intel/openmpi-1.4.3/bin/ompi_info -all | grep psm
>     MCA mtl: psm (MCA v2.0, API v2.0, Component v1.4.3)
>     MCA mtl: parameter "mtl_psm_connect_timeout" (current value: "180", data source: default value)
>     MCA mtl: parameter "mtl_psm_debug" (current value: "1", data source: default value)
>     MCA mtl: parameter "mtl_psm_ib_unit" (current value: "-1", data source: default value)
>     MCA mtl: parameter "mtl_psm_ib_port" (current value: "0", data source: default value)
>     MCA mtl: parameter "mtl_psm_ib_service_level" (current value: "0", data source: default value)
>     MCA mtl: parameter "mtl_psm_ib_pkey" (current value: "32767", data source: default value)
>     MCA mtl: parameter "mtl_psm_priority" (current value: "0", data source: default value)
>
> Here is my command:
>
> /usr/mpi/intel/openmpi-1.4.3/bin/mpirun -n 1 --mca btl_base_verbose 30 --mca btl self,sm /release/cfd/simgrid/P_OPT.LINUX64
>
> and here is the output:
>
> [machinename:01124] mca: base: components_open: Looking for btl components
> [machinename:01124] mca: base: components_open: opening btl components
> [machinename:01124] mca: base: components_open: found loaded component self
> [machinename:01124] mca: base: components_open: component self has no register function
> [machinename:01124] mca: base: components_open: component self open function successful
> [machinename:01124] mca: base: components_open: found loaded component sm
> [machinename:01124] mca: base: components_open: component sm has no register function
> [machinename:01124] mca: base: components_open: component sm open function successful
> machinename.1124ipath_userinit: assign_context command failed: Network is down
> machinename.1124can't open /dev/ipath, network down (err=26)
> --------------------------------------------------------------------------
> PSM was unable to open an endpoint. Please make sure that the network link is
> active on the node and the hardware is functioning.
>
> Error: Could not detect network connectivity
> --------------------------------------------------------------------------
> [machinename:01124] mca: base: close: component self closed
> [machinename:01124] mca: base: close: unloading component self
> [machinename:01124] mca: base: close: component sm closed
> [machinename:01124] mca: base: close: unloading component sm
> --------------------------------------------------------------------------
> It looks like MPI_INIT failed for some reason; your parallel process is
> likely to abort. There are many reasons that a parallel process can
> fail during MPI_INIT; some of which are due to configuration or environment
> problems.
> This failure appears to be an internal failure; here's some
> additional information (which may only be relevant to an Open MPI
> developer):
>
>   PML add procs failed
>   --> Returned "Error" (-1) instead of "Success" (0)
> --------------------------------------------------------------------------
> *** The MPI_Init() function was called before MPI_INIT was invoked.
> *** This is disallowed by the MPI standard.
> *** Your MPI job will now abort.
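One note on the runtime side of your question: --mca btl self,sm only restricts the BTLs, and psm is an MTL (pulled in by the cm PML), so that setting never touches it. Until you can rebuild, excluding the component with the usual MCA "^" exclusion syntax should keep PSM out of the picture - something like the sketch below, using the same command from your mail (I haven't verified on 1.4.3 that this avoids the endpoint open):

  /usr/mpi/intel/openmpi-1.4.3/bin/mpirun -n 1 --mca mtl ^psm --mca btl self,sm /release/cfd/simgrid/P_OPT.LINUX64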