Hi Gilles

Thank you for your prompt response.

Here is some information about the system:

Ubuntu 16.04 server
Linux-4.4.0-75-generic-x86_64-with-Ubuntu-16.04-xenial

This runs on an HP ProLiant DL320R05 Generation 5 with 4 GB RAM, 4x120 GB
RAID-1 HDDs and two 10/100/1000 Ethernet ports, plus an HP StorageWorks 70
Modular Smart Array with 14x120 GB HDDs (RAID-5).

There are 44 HP ProLiant BL465c server blades, each with two AMD Opteron
Model 2218 processors (2.6 GHz, 2 MB, 95 W), 4 GB RAM, two NC370i
Multifunction Gigabit Server Adapters and a 120 GB disk.

The users' area is shared with the nodes.

The ssh and Torque 6.0.2 services work fine.

Torque and Open MPI 2.1.0 were installed from tarballs. Open MPI 2.1.0 was
configured with --prefix=/storage/exp_soft/tuc, so after make and make
install its binaries, libraries and include files are located under
/storage/exp_soft/tuc.

/storage is a shared file system for all the nodes of the cluster
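
For reference, the Open MPI build itself was just the stock sequence, run
from the unpacked openmpi-2.1.0 source tree (a sketch from memory; the
prefix is the one mentioned above):

        # inside the unpacked openmpi-2.1.0 directory
        ./configure --prefix=/storage/exp_soft/tuc
        make
        make install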

$PATH:
        /storage/exp_soft/tuc/bin
        /storage/exp_soft/tuc/sbin
        /storage/exp_soft/tuc/torque/bin
        /storage/exp_soft/tuc/torque/sbin
        /usr/local/sbin
        /usr/local/bin
        /usr/sbin
        /usr/bin
        /sbin
        /bin
        /snap/bin


LD_LIBRARY_PATH=/storage/exp_soft/tuc/lib

C_INCLUDE_PATH=/storage/exp_soft/tuc/include
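
On our setup these are set for the users roughly via shell-profile lines
like the following (a sketch; exactly which startup file they live in is an
assumption on my part, the values are as listed above):

        # prepend the shared Torque and Open MPI trees to PATH
        export PATH=/storage/exp_soft/tuc/torque/bin:/storage/exp_soft/tuc/torque/sbin:$PATH
        export PATH=/storage/exp_soft/tuc/bin:/storage/exp_soft/tuc/sbin:$PATH
        export LD_LIBRARY_PATH=/storage/exp_soft/tuc/lib
        export C_INCLUDE_PATH=/storage/exp_soft/tuc/include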

I also use JupyterHub (with the cluster tab enabled) as a user interface to
the cluster. While installing Python and some of its dependencies (I am not
sure exactly which), MPICH and Open MPI were also installed into the system
directories.
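
So at the moment there is more than one MPI stack visible on the machines.
To double-check which one a job actually picks up, I can run a quick sanity
check along these lines (nothing exotic, just standard tools):

        which mpirun mpicc
        mpirun --version
        ompi_info | grep -E 'Open MPI:|Prefix'
        ldd $(which mpirun)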

--------------------------------------------------------------------------
mpirun --allow-run-as-root --mca shmem_base_verbose 100 ...

[se01.grid.tuc.gr:19607] mca: base: components_register: registering
framework shmem components
[se01.grid.tuc.gr:19607] mca: base: components_register: found loaded
component sysv
[se01.grid.tuc.gr:19607] mca: base: components_register: component sysv
register function successful
[se01.grid.tuc.gr:19607] mca: base: components_register: found loaded
component posix
[se01.grid.tuc.gr:19607] mca: base: components_register: component posix
register function successful
[se01.grid.tuc.gr:19607] mca: base: components_register: found loaded
component mmap
[se01.grid.tuc.gr:19607] mca: base: components_register: component mmap
register function successful
[se01.grid.tuc.gr:19607] mca: base: components_open: opening shmem
components
[se01.grid.tuc.gr:19607] mca: base: components_open: found loaded component
sysv
[se01.grid.tuc.gr:19607] mca: base: components_open: component sysv open
function successful
[se01.grid.tuc.gr:19607] mca: base: components_open: found loaded component
posix
[se01.grid.tuc.gr:19607] mca: base: components_open: component posix open
function successful
[se01.grid.tuc.gr:19607] mca: base: components_open: found loaded component
mmap
[se01.grid.tuc.gr:19607] mca: base: components_open: component mmap open
function successful
[se01.grid.tuc.gr:19607] shmem: base: runtime_query: Auto-selecting shmem
components
[se01.grid.tuc.gr:19607] shmem: base: runtime_query: (shmem) Querying
component (run-time) [sysv]
[se01.grid.tuc.gr:19607] shmem: base: runtime_query: (shmem) Query of
component [sysv] set priority to 30
[se01.grid.tuc.gr:19607] shmem: base: runtime_query: (shmem) Querying
component (run-time) [posix]
[se01.grid.tuc.gr:19607] shmem: base: runtime_query: (shmem) Query of
component [posix] set priority to 40
[se01.grid.tuc.gr:19607] shmem: base: runtime_query: (shmem) Querying
component (run-time) [mmap]
[se01.grid.tuc.gr:19607] shmem: base: runtime_query: (shmem) Query of
component [mmap] set priority to 50
[se01.grid.tuc.gr:19607] shmem: base: runtime_query: (shmem) Selected
component [mmap]
[se01.grid.tuc.gr:19607] mca: base: close: unloading component sysv
[se01.grid.tuc.gr:19607] mca: base: close: unloading component posix
[se01.grid.tuc.gr:19607] shmem: base: best_runnable_component_name:
Searching for best runnable component.
[se01.grid.tuc.gr:19607] shmem: base: best_runnable_component_name: Found
best runnable component: (mmap).
--------------------------------------------------------------------------
mpirun was unable to find the specified executable file, and therefore
did not launch the job.  This error was first reported for process
rank 0; it may have occurred for other processes as well.

NOTE: A common cause for this error is misspelling a mpirun command
      line parameter option (remember that mpirun interprets the first
      unrecognized command line token as the executable).

Node:       se01
Executable: ...
--------------------------------------------------------------------------
2 total processes failed to start
[se01.grid.tuc.gr:19607] mca: base: close: component mmap closed
[se01.grid.tuc.gr:19607] mca: base: close: unloading component mmap


jb


-----Original Message-----
From: users [mailto:users-boun...@lists.open-mpi.org] On Behalf Of
gil...@rist.or.jp
Sent: Monday, May 15, 2017 1:47 PM
To: Open MPI Users <users@lists.open-mpi.org>
Subject: Re: [OMPI users] (no subject)

Ioannis,

### What version of Open MPI are you using? (e.g., v1.10.3, v2.1.0, git
branch name and hash, etc.)



### Describe how Open MPI was installed (e.g., from a source/
distribution tarball, from a git clone, from an operating system 
distribution package, etc.)



### Please describe the system on which you are running

* Operating system/version: 
* Computer hardware: 
* Network type: 

also, what if you

mpirun --mca shmem_base_verbose 100 ...


Cheers,

Gilles
----- Original Message -----
> Hi
> 
> I am trying to run the following simple demo to a cluster of two nodes
> 
> ----------------------------------------------------------------------
------------------------------------
> #include <mpi.h>
> #include <stdio.h>
> 
> int main(int argc, char** argv) {
>      MPI_Init(NULL, NULL);
> 
>      int world_size;
>      MPI_Comm_size(MPI_COMM_WORLD, &world_size);
> 
>      int world_rank;
>      MPI_Comm_rank(MPI_COMM_WORLD, &world_rank);
> 
>      char processor_name[MPI_MAX_PROCESSOR_NAME];
>      int name_len;
>      MPI_Get_processor_name(processor_name, &name_len);
> 
>      printf("Hello world from processor %s, rank %d"   " out of %d 
> processors\n",  processor_name, world_rank, world_size);
> 
>      MPI_Finalize();
> }
> ----------------------------------------------------------------------
---------------------------
> 
> i get always the message
> 
> ----------------------------------------------------------------------
--------------------------
> It looks like opal_init failed for some reason; your parallel process 
is
> likely to abort.  There are many reasons that a parallel process can
> fail during opal_init; some of which are due to configuration or
> environment problems.  This failure appears to be an internal failure;
> here's some additional information (which may only be relevant to an
> Open MPI developer):
> 
>    opal_shmem_base_select failed
>    --> Returned value -1 instead of OPAL_SUCCESS
> ----------------------------------------------------------------------
----------------------------
> 
> any hint?
> 
> Ioannis Botsis
> 
> 
> 
> _______________________________________________
> users mailing list
> users@lists.open-mpi.org
> https://rfd.newmexicoconsortium.org/mailman/listinfo/users
> 


_______________________________________________
users mailing list
users@lists.open-mpi.org
https://rfd.newmexicoconsortium.org/mailman/listinfo/users
