Hi All,

I'm having problems with Open MPI 1.4.1 and receive the following error message when I try to run a test job:

[root@hydra ~]# mpirun -n 2 --prefix `dirname $MPILIBDIR` -v -show-progress -machinefile ./nodes.to.use -pernode ./dml_test
--------------------------------------------------------------------------
It looks like opal_init failed for some reason; your parallel process is
likely to abort.  There are many reasons that a parallel process can
fail during opal_init; some of which are due to configuration or
environment problems.  This failure appears to be an internal failure;
here's some additional information (which may only be relevant to an
Open MPI developer):

  opal_paffinity_base_select failed
  --> Returned value -13 instead of OPAL_SUCCESS
--------------------------------------------------------------------------
[hydra:10645] [[INVALID],INVALID] ORTE_ERROR_LOG: Not found in file runtime/orte_init.c at line 77
[hydra:10645] [[INVALID],INVALID] ORTE_ERROR_LOG: Not found in file orterun.c at line 541

I built Open MPI with the following configure options:

./configure --with-gm=/usr/local/gm --prefix=/opt/apps/system/openmpi/1.4.1/intel

The build appears to complete correctly and finds the right libraries without any obvious problems.

This was built on:

Linux hydra 2.6.18-164.el5 #1 SMP Thu Sep 3 03:33:56 EDT 2009 i686 i686 i386 GNU/Linux

After reading the docs and trawling the archives, I can't find much that resembles the errors above.

Does anybody have any ideas or pointers on where to look or what to debug?

Thanks and regards
David

--

David Logan
eResearch SA, ARCS Grid Administrator
Level 1, School of Physics and Chemistry
North Terrace, Adelaide, 5005

(W) 08 8303 7301
(M) 0458 631 117
