Hello,

I would like to run an Open MPI application on a single node. Since I expect it to perform better, I want it to use shared memory rather than TCP for communication. Is it possible to use shared memory not only for MPI communication but also for control messages and other internal MPI-related communication, so that no TCP communication is used at all?

I came up with the following parameters, but I receive an error when I use them:
mpirun --host localhost --mca btl sm,self --mca oob ^tcp -n 2 hello

It runs a simple hello world application. I know the --host parameter is not needed, since mpirun runs on localhost by default, but I included it to be on the safe side. I ask btl to use sm and self (I guess "self" is compulsory) and instruct oob not to use tcp (per the last lines of http://www.open-mpi.org/faq/?category=tcp#tcp-selection ). Isn't this correct?
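
For reference, the same selections can also be written as environment variables instead of --mca flags, since Open MPI reads any variable named OMPI_MCA_<param> as the MCA parameter <param>. This is only a minimal sketch, assuming a POSIX shell. It is an equivalent way to pass the same settings, so I would expect it to hit the same error, and the mpirun line is left commented out:

```shell
# Sketch: same MCA selections as the command line above, set through the environment.
export OMPI_MCA_btl="sm,self"   # equivalent to: --mca btl sm,self
export OMPI_MCA_oob="^tcp"      # equivalent to: --mca oob ^tcp

# Print the active OMPI_MCA_* settings so they can be checked before launching.
env | grep '^OMPI_MCA_' | sort

# Launch without the --mca flags (commented out; "hello" is the test binary from this post):
# mpirun --host localhost -n 2 hello
```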

Here's the exact error:

# mpirun --host localhost --mca btl sm,self --mca oob ^tcp -n 2 hello
[myhost:08491] [NO-NAME] ORTE_ERROR_LOG: Not found in file runtime/orte_init_stage1.c at line 182
--------------------------------------------------------------------------
It looks like orte_init failed for some reason; your parallel process is
likely to abort.  There are many reasons that a parallel process can
fail during orte_init; some of which are due to configuration or
environment problems.  This failure appears to be an internal failure;
here's some additional information (which may only be relevant to an
Open MPI developer):

 orte_rml_base_select failed
 --> Returned value -13 instead of ORTE_SUCCESS

--------------------------------------------------------------------------
[peanutbutter:08491] [NO-NAME] ORTE_ERROR_LOG: Not found in file runtime/orte_system_init.c at line 42
[peanutbutter:08491] [NO-NAME] ORTE_ERROR_LOG: Not found in file runtime/orte_init.c at line 52
--------------------------------------------------------------------------
Open RTE was unable to initialize properly.  The error occured while
attempting to orte_init().  Returned value -13 instead of ORTE_SUCCESS.
--------------------------------------------------------------------------
