Hi Syed,

This really sounds like a problem specific to Rocks Clusters, not an issue with Open MPI: a confusion related to mount points and the soft links used by Rocks.
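A quick way to see how the two paths look on a given node is sketched below. Treat it as a rough check rather than a definitive diagnosis: describe_path is just a helper name made up for illustration, the two paths are the ones from your message, and readlink -f assumes GNU coreutils, which I believe is what a Rocks node provides.

```shell
# Report how a path looks on the current node:
# a symlink (with its resolved target), a directory, some other file, or missing.
describe_path() {
    p="$1"
    if [ -L "$p" ]; then
        # Check for a symlink first, because -d would follow the link.
        printf '%s: symlink -> %s\n' "$p" "$(readlink -f "$p")"
    elif [ -d "$p" ]; then
        printf '%s: directory\n' "$p"
    elif [ -e "$p" ]; then
        printf '%s: exists (not a directory)\n' "$p"
    else
        printf '%s: missing\n' "$p"
    fi
}

# The two locations in question; run this on the head node and on a compute node.
describe_path /export/apps
describe_path /share/apps
```

If the two nodes report different results for the same path, any prefix built on that path will not resolve consistently across the cluster.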
I haven't used Rocks Clusters in a while and I don't remember the details anymore, so please take my suggestions with a grain of salt and check them before committing to them.

Which --prefix did you use when you configured Open MPI?

My suggestion is that you don't use "/export/apps" as a prefix (and this goes for any application that you install), but instead use a /share/apps subdirectory, something like:

--prefix=/share/apps/openmpi-1.8.4_gcc-4.9.2

This is because /export/apps is just a mount point on the frontend/head node, whereas /share/apps is a mount point across all nodes in the cluster (and, IIRR, a soft link on the head node).

My recollection is that the Rocks documentation was obscure about this, not making clear the difference between /export/apps and /share/apps.

Issuing the Rocks commands:

"tentakel 'ls -d /export/apps'"
"tentakel 'ls -d /share/apps'"

may show something useful.

I hope this helps,
Gus Correa

On 02/27/2015 11:47 AM, Syed Ahsan Ali wrote:
I am trying to run an Open MPI application on my cluster, but mpirun fails. Even a simple hostname command gives this error:

[pmdtest@hpc bin]$ mpirun --host compute-0-0 hostname
--------------------------------------------------------------------------
Sorry! You were supposed to get help about:
opal_init:startup:internal-failure
But I couldn't open the help file:
/export/apps/openmpi-1.8.4_gcc-4.9.2/share/openmpi/help-opal-runtime.txt: No such file or directory. Sorry!
--------------------------------------------------------------------------
--------------------------------------------------------------------------
Sorry! You were supposed to get help about:
orte_init:startup:internal-failure
But I couldn't open the help file:
/export/apps/openmpi-1.8.4_gcc-4.9.2/share/openmpi/help-orte-runtime: No such file or directory. Sorry!
--------------------------------------------------------------------------
[compute-0-0.local:03410] [[INVALID],INVALID] ORTE_ERROR_LOG: Error in file orted/orted_main.c at line 369
--------------------------------------------------------------------------
ORTE was unable to reliably start one or more daemons.

I am using Environment Modules to load Open MPI 1.8.4, and PATH and LD_LIBRARY_PATH point to the same Open MPI installation on the nodes:

[pmdtest@hpc bin]$ which mpirun
/share/apps/openmpi-1.8.4_gcc-4.9.2/bin/mpirun
[pmdtest@hpc bin]$ ssh compute-0-0
Last login: Sat Feb 28 02:15:50 2015 from hpc.local
Rocks Compute Node
Rocks 6.1.1 (Sand Boa)
Profile built 01:53 28-Feb-2015
Kickstarted 01:59 28-Feb-2015
[pmdtest@compute-0-0 ~]$ which mpirun
/share/apps/openmpi-1.8.4_gcc-4.9.2/bin/mpirun

The only thing I notice that seems important is that the error refers to

/export/apps/openmpi-1.8.4_gcc-4.9.2/share/openmpi/help-opal-runtime.txt

while it should have shown

/share/apps/openmpi-1.8.4_gcc-4.9.2/share/openmpi/help-opal-runtime.txt

which is the path the compute nodes see.

Please help!
Ahsan