Hi Syed Ahsan Ali

To avoid any leftovers and further confusion,
I suggest that you completely delete the old installation directory.
Then start fresh from the configure step, with the prefix pointing to
--prefix=/share/apps/openmpi-1.8.4_gcc-4.9.2
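
If it helps, the full sequence might look roughly like the sketch below.
The source directory, compiler names, and -j value are just guesses for
your setup, so please double-check the paths (especially before the rm):

rm -rf /export/apps/openmpi-1.8.4_gcc-4.9.2    # remove the old install made with the old prefix
cd ~/openmpi-1.8.4                             # your Open MPI 1.8.4 source tree
make distclean                                 # only needed if this tree was configured before
./configure --prefix=/share/apps/openmpi-1.8.4_gcc-4.9.2 CC=gcc CXX=g++ FC=gfortran
make -j4 all
make install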

I hope this helps,
Gus Correa

On 02/27/2015 12:11 PM, Syed Ahsan Ali wrote:
Hi Gus

Thanks for the prompt response. Well judged; I compiled with the /export/apps
prefix, so that is most probably the reason. I'll check and update you.

Best wishes
Ahsan

On Fri, Feb 27, 2015 at 10:07 PM, Gus Correa <g...@ldeo.columbia.edu> wrote:
Hi Syed

This really sounds like a problem specific to Rocks Clusters,
not an issue with Open MPI:
a mix-up related to the mount points and soft links used by Rocks.

I haven't used Rocks Clusters in a while,
and I don't remember the details anymore, so please take my
suggestions with a grain of salt, and check them out
before committing to them.

Which --prefix did you use when you configured Open MPI?
My suggestion is that you don't use "/export/apps" as a prefix
(and this goes for any application that you install),
but instead use a /share/apps subdirectory, something like:

--prefix=/share/apps/openmpi-1.8.4_gcc-4.9.2

This is because /export/apps is just a mount point on the
frontend/head node, whereas /share/apps is a mount point
across all nodes in the cluster (and, IIRR, a soft link on the
head node).
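
A quick way to see that difference (plain ls over ssh, nothing
Rocks-specific; the compute node name below is just the one from your
error message) could be:

ls -ld /export/apps /share/apps                    # on the head node
ssh compute-0-0 'ls -ld /share/apps /export/apps'  # on a compute node, /export/apps may well be missing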

My recollection is that the Rocks documentation was obscure
about this, not making clear the difference between
/export/apps and /share/apps.

Issuing the Rocks commands:
"tentakel 'ls -d /export/apps'"
"tentakel 'ls -d /share/apps'"
may show something useful.
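
Another thing worth checking is the prefix that was actually compiled
into your current build.  Assuming ompi_info comes from the same
installation as your mpirun, something like this should show it:

ompi_info | grep -i prefix
ssh compute-0-0 /share/apps/openmpi-1.8.4_gcc-4.9.2/bin/ompi_info | grep -i prefix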

I hope this helps,
Gus Correa


On 02/27/2015 11:47 AM, Syed Ahsan Ali wrote:

I am trying to run an Open MPI application on my cluster, but mpirun
fails; even a simple hostname command gives this error:

[pmdtest@hpc bin]$ mpirun --host compute-0-0 hostname
--------------------------------------------------------------------------
Sorry!  You were supposed to get help about:
      opal_init:startup:internal-failure
But I couldn't open the help file:

/export/apps/openmpi-1.8.4_gcc-4.9.2/share/openmpi/help-opal-runtime.txt:
No such file or directory.  Sorry!
--------------------------------------------------------------------------
--------------------------------------------------------------------------
Sorry!  You were supposed to get help about:
      orte_init:startup:internal-failure
But I couldn't open the help file:
      /export/apps/openmpi-1.8.4_gcc-4.9.2/share/openmpi/help-orte-runtime:
No such file or directory.  Sorry!
--------------------------------------------------------------------------
[compute-0-0.local:03410] [[INVALID],INVALID] ORTE_ERROR_LOG: Error in
file orted/orted_main.c at line 369
--------------------------------------------------------------------------
ORTE was unable to reliably start one or more daemons.

I am using Environment Modules to load Open MPI 1.8.4, and PATH and
LD_LIBRARY_PATH point to the same Open MPI installation on the nodes:

[pmdtest@hpc bin]$ which mpirun
/share/apps/openmpi-1.8.4_gcc-4.9.2/bin/mpirun
[pmdtest@hpc bin]$ ssh compute-0-0
Last login: Sat Feb 28 02:15:50 2015 from hpc.local
Rocks Compute Node
Rocks 6.1.1 (Sand Boa)
Profile built 01:53 28-Feb-2015
Kickstarted 01:59 28-Feb-2015
[pmdtest@compute-0-0 ~]$ which mpirun
/share/apps/openmpi-1.8.4_gcc-4.9.2/bin/mpirun
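
Since Environment Modules are not always loaded in non-login shells, it
may also be worth checking what a plain non-interactive ssh sees, e.g.:

ssh compute-0-0 env | grep -E '^(PATH|LD_LIBRARY_PATH)='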

The only thing I notice that seems important is that the error refers to

/export/apps/openmpi-1.8.4_gcc-4.9.2/share/openmpi/help-opal-runtime.txt

while it should have shown

/share/apps/openmpi-1.8.4_gcc-4.9.2/share/openmpi/help-opal-runtime.txt

which is the path the compute nodes see.

Please help!
Ahsan