Can you please rebuild OMPI with --enable-debug in the configure command? It will
let us see more error output.
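
For reference, here is a rough sketch of what that rebuild and re-run might look like. The source directory, the install prefix, and the small `run` dry-run helper are assumptions for illustration only, not taken from your setup, so keep whatever other configure options you normally pass. By default the script only prints the commands; set DRY_RUN=0 to actually execute them.

```shell
#!/bin/sh
# Sketch only: SRC_DIR and PREFIX are hypothetical placeholders, adjust to your site.
SRC_DIR=${SRC_DIR:-../openmpi-v2.x-dev-1280-gc110ae8}
PREFIX=${PREFIX:-$HOME/openmpi-debug}

# Hypothetical helper: print the command in dry-run mode (the default),
# otherwise execute it.
run() {
    if [ "${DRY_RUN:-1}" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

# Reconfigure with debugging enabled, then rebuild and reinstall.
rebuild_debug() {
    run "$SRC_DIR/configure" --prefix="$PREFIX" --enable-debug
    run make all install
}

# Re-run the failing test with component load errors shown
# (same mpiexec arguments as in the quoted session).
rerun_test() {
    run "$PREFIX/bin/mpiexec" -np 4 --host tyr,sunpc1,linpc1,ruester \
        -mca mca_base_component_show_load_errors 1 hello_1_mpi
}

rebuild_debug
rerun_test
```

Run it from an empty build directory so the debug build does not mix with objects from the earlier build.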


> On Apr 21, 2016, at 8:52 AM, Siegmar Gross 
> <siegmar.gr...@informatik.hs-fulda.de> wrote:
> 
> Hi Ralph,
> 
> I don't see any additional information.
> 
> tyr hello_1 108 mpiexec -np 4 --host tyr,sunpc1,linpc1,ruester -mca 
> mca_base_component_show_load_errors 1 hello_1_mpi
> [tyr.informatik.hs-fulda.de:06211] [[48741,0],0] ORTE_ERROR_LOG: Not found in file ../../../../../openmpi-v2.x-dev-1280-gc110ae8/orte/mca/ess/hnp/ess_hnp_module.c at line 638
> --------------------------------------------------------------------------
> It looks like orte_init failed for some reason; your parallel process is
> likely to abort.  There are many reasons that a parallel process can
> fail during orte_init; some of which are due to configuration or
> environment problems.  This failure appears to be an internal failure;
> here's some additional information (which may only be relevant to an
> Open MPI developer):
> 
>  opal_pmix_base_select failed
>  --> Returned value Not found (-13) instead of ORTE_SUCCESS
> --------------------------------------------------------------------------
> 
> 
> tyr hello_1 109 mpiexec -np 4 --host tyr,sunpc1,linpc1,ruester -mca 
> mca_base_component_show_load_errors 1 -mca pmix_base_verbose 10 -mca 
> pmix_server_verbose 5 hello_1_mpi
> [tyr.informatik.hs-fulda.de:06212] mca: base: components_register: registering framework pmix components
> [tyr.informatik.hs-fulda.de:06212] mca: base: components_open: opening pmix components
> [tyr.informatik.hs-fulda.de:06212] mca:base:select: Auto-selecting pmix components
> [tyr.informatik.hs-fulda.de:06212] mca:base:select:( pmix) No component selected!
> [tyr.informatik.hs-fulda.de:06212] [[48738,0],0] ORTE_ERROR_LOG: Not found in file ../../../../../openmpi-v2.x-dev-1280-gc110ae8/orte/mca/ess/hnp/ess_hnp_module.c at line 638
> --------------------------------------------------------------------------
> It looks like orte_init failed for some reason; your parallel process is
> likely to abort.  There are many reasons that a parallel process can
> fail during orte_init; some of which are due to configuration or
> environment problems.  This failure appears to be an internal failure;
> here's some additional information (which may only be relevant to an
> Open MPI developer):
> 
>  opal_pmix_base_select failed
>  --> Returned value Not found (-13) instead of ORTE_SUCCESS
> --------------------------------------------------------------------------
> tyr hello_1 110
> 
> 
> Kind regards
> 
> Siegmar
> 
> 
> On 21.04.2016 at 17:24, Ralph Castain wrote:
>> Hmmm, it looks like you built the right components, but they are not being
>> picked up. Can you run your mpiexec command again, adding "-mca
>> mca_base_component_show_load_errors 1" to the command line?
>> 
>> 
>>> On Apr 21, 2016, at 8:16 AM, Siegmar Gross 
>>> <siegmar.gr...@informatik.hs-fulda.de> wrote:
>>> 
>>> Hi Ralph,
>>> 
>>> I have attached ompi_info output for both compilers from my
>>> sparc machine and the listings for both compilers from the
>>> <prefix>/lib/openmpi directories. Hopefully that helps to
>>> find the problem.
>>> 
>>> hermes tmp 3 tar zvft openmpi-2.x_info.tar.gz
>>> -rw-r--r-- root/root     10969 2016-04-21 17:06 ompi_info_SunOS_sparc_cc.txt
>>> -rw-r--r-- root/root     11044 2016-04-21 17:06 ompi_info_SunOS_sparc_gcc.txt
>>> -rw-r--r-- root/root     71252 2016-04-21 17:02 lib64_openmpi.txt
>>> hermes tmp 4
>>> 
>>> 
>>> Kind regards and thank you very much once more for your help
>>> 
>>> Siegmar
>>> 
>>> 
>>> On 21.04.2016 at 15:54, Ralph Castain wrote:
>>>> Odd - it would appear that none of the pmix components built? Can you send
>>>> along the output from ompi_info? Or just send a listing of the files in the
>>>> <prefix>/lib/openmpi directory?
>>>> 
>>>> 
>>>>> On Apr 21, 2016, at 1:27 AM, Siegmar Gross
>>>>> <siegmar.gr...@informatik.hs-fulda.de> wrote:
>>>>> 
>>>>> Hi Ralph,
>>>>> 
>>>>> On 21.04.2016 at 00:18, Ralph Castain wrote:
>>>>>> Could you please rerun these tests and add "-mca pmix_base_verbose 10
>>>>>> -mca pmix_server_verbose 5" to your command line? I need to see why the
>>>>>> pmix components failed.
>>>>> 
>>>>> 
>>>>> tyr spawn 111 mpiexec -np 1 --host tyr,sunpc1,linpc1,ruester -mca
>>>>> pmix_base_verbose 10 -mca pmix_server_verbose 5 spawn_multiple_master
>>>>> [tyr.informatik.hs-fulda.de:26652] mca: base: components_register: registering framework pmix components
>>>>> [tyr.informatik.hs-fulda.de:26652] mca: base: components_open: opening pmix components
>>>>> [tyr.informatik.hs-fulda.de:26652] mca:base:select: Auto-selecting pmix components
>>>>> [tyr.informatik.hs-fulda.de:26652] mca:base:select:( pmix) No component selected!
>>>>> [tyr.informatik.hs-fulda.de:26652] [[52794,0],0] ORTE_ERROR_LOG: Not found in file ../../../../../openmpi-v2.x-dev-1280-gc110ae8/orte/mca/ess/hnp/ess_hnp_module.c at line 638
>>>>> --------------------------------------------------------------------------
>>>>> It looks like orte_init failed for some reason; your parallel process is
>>>>> likely to abort.  There are many reasons that a parallel process can
>>>>> fail during orte_init; some of which are due to configuration or
>>>>> environment problems.  This failure appears to be an internal failure;
>>>>> here's some additional information (which may only be relevant to an
>>>>> Open MPI developer):
>>>>> 
>>>>> opal_pmix_base_select failed
>>>>> --> Returned value Not found (-13) instead of ORTE_SUCCESS
>>>>> --------------------------------------------------------------------------
>>>>> tyr spawn 112
>>>>> 
>>>>> 
>>>>> 
>>>>> 
>>>>> tyr hello_1 116 mpiexec -np 1 --host tyr,sunpc1,linpc1,ruester -mca
>>>>> pmix_base_verbose 10 -mca pmix_server_verbose 5 hello_1_mpi
>>>>> [tyr.informatik.hs-fulda.de:27261] mca: base: components_register: registering framework pmix components
>>>>> [tyr.informatik.hs-fulda.de:27261] mca: base: components_open: opening pmix components
>>>>> [tyr.informatik.hs-fulda.de:27261] mca:base:select: Auto-selecting pmix components
>>>>> [tyr.informatik.hs-fulda.de:27261] mca:base:select:( pmix) No component selected!
>>>>> [tyr.informatik.hs-fulda.de:27261] [[52315,0],0] ORTE_ERROR_LOG: Not found in file ../../../../../openmpi-v2.x-dev-1280-gc110ae8/orte/mca/ess/hnp/ess_hnp_module.c at line 638
>>>>> --------------------------------------------------------------------------
>>>>> It looks like orte_init failed for some reason; your parallel process is
>>>>> likely to abort.  There are many reasons that a parallel process can
>>>>> fail during orte_init; some of which are due to configuration or
>>>>> environment problems.  This failure appears to be an internal failure;
>>>>> here's some additional information (which may only be relevant to an
>>>>> Open MPI developer):
>>>>> 
>>>>> opal_pmix_base_select failed
>>>>> --> Returned value Not found (-13) instead of ORTE_SUCCESS
>>>>> --------------------------------------------------------------------------
>>>>> tyr hello_1 117
>>>>> 
>>>>> 
>>>>> 
>>>>> Thank you very much for your help.
>>>>> 
>>>>> 
>>>>> Kind regards
>>>>> 
>>>>> Siegmar
>>>>> 
>>>>> 
>>>>> 
>>>>>> 
>>>>>> Thanks
>>>>>> Ralph
>>>>>> 
>>>>>>> On Apr 20, 2016, at 10:12 AM, Siegmar Gross
>>>>>>> <siegmar.gr...@informatik.hs-fulda.de> wrote:
>>>>>>> 
>>>>>>> Hi,
>>>>>>> 
>>>>>>> I have built openmpi-v2.x-dev-1280-gc110ae8 on my machines
>>>>>>> (Solaris 10 Sparc, Solaris 10 x86_64, and openSUSE Linux
>>>>>>> 12.1 x86_64) with gcc-5.1.0 and Sun C 5.13. Unfortunately I get
>>>>>>> runtime errors for some programs.
>>>>>>> 
>>>>>>> 
>>>>>>> Sun C 5.13:
>>>>>>> ===========
>>>>>>> 
>>>>>>> For all my test programs I get the same error on Solaris Sparc and
>>>>>>> Solaris x86_64, while the programs work fine on Linux.
>>>>>>> 
>>>>>>> tyr hello_1 115 mpiexec -np 2 hello_1_mpi
>>>>>>> [tyr.informatik.hs-fulda.de:22373] [[61763,0],0] ORTE_ERROR_LOG: Not found in file ../../../../../openmpi-v2.x-dev-1280-gc110ae8/orte/mca/ess/hnp/ess_hnp_module.c at line 638
>>>>>>> --------------------------------------------------------------------------
>>>>>>> It looks like orte_init failed for some reason; your parallel process is
>>>>>>> likely to abort.  There are many reasons that a parallel process can
>>>>>>> fail during orte_init; some of which are due to configuration or
>>>>>>> environment problems.  This failure appears to be an internal failure;
>>>>>>> here's some additional information (which may only be relevant to an
>>>>>>> Open MPI developer):
>>>>>>> 
>>>>>>> opal_pmix_base_select failed
>>>>>>> --> Returned value Not found (-13) instead of ORTE_SUCCESS
>>>>>>> --------------------------------------------------------------------------
>>>>>>> tyr hello_1 116
>>>>>>> 
>>>>>>> 
>>>>>>> 
>>>>>>> 
>>>>>>> GCC-5.1.0:
>>>>>>> ==========
>>>>>>> 
>>>>>>> tyr spawn 121 mpiexec -np 1 --host tyr,sunpc1,linpc1,ruester
>>>>>>> spawn_multiple_master
>>>>>>> 
>>>>>>> Parent process 0 running on tyr.informatik.hs-fulda.de
>>>>>>> I create 3 slave processes.
>>>>>>> 
>>>>>>> [tyr.informatik.hs-fulda.de:25366] PMIX ERROR: UNPACK-PAST-END in file ../../../../../../openmpi-v2.x-dev-1280-gc110ae8/opal/mca/pmix/pmix112/pmix/src/server/pmix_server_ops.c at line 829
>>>>>>> [tyr.informatik.hs-fulda.de:25366] PMIX ERROR: UNPACK-PAST-END in file ../../../../../../openmpi-v2.x-dev-1280-gc110ae8/opal/mca/pmix/pmix112/pmix/src/server/pmix_server.c at line 2176
>>>>>>> [tyr:25377] *** An error occurred in MPI_Comm_spawn_multiple
>>>>>>> [tyr:25377] *** reported by process [3308257281,0]
>>>>>>> [tyr:25377] *** on communicator MPI_COMM_WORLD
>>>>>>> [tyr:25377] *** MPI_ERR_SPAWN: could not spawn processes
>>>>>>> [tyr:25377] *** MPI_ERRORS_ARE_FATAL (processes in this communicator will now abort,
>>>>>>> [tyr:25377] ***    and potentially your MPI job)
>>>>>>> tyr spawn 122
>>>>>>> 
>>>>>>> 
>>>>>>> I would be grateful if somebody could fix the problems. Thank you very
>>>>>>> much in advance for any help.
>>>>>>> 
>>>>>>> 
>>>>>>> Kind regards
>>>>>>> 
>>>>>>> Siegmar
>>>>>>> <hello_1_mpi.c><spawn_multiple_master.c>_______________________________________________
>>>>>>> users mailing list
>>>>>>> us...@open-mpi.org <mailto:us...@open-mpi.org>
>>>>>>> Subscription: http://www.open-mpi.org/mailman/listinfo.cgi/users
>>>>>>> Link to this post:
>>>>>>> http://www.open-mpi.org/community/lists/users/2016/04/28983.php
>>>>>> 
>>>>>> _______________________________________________
>>>>>> users mailing list
>>>>>> us...@open-mpi.org <mailto:us...@open-mpi.org>
>>>>>> Subscription: http://www.open-mpi.org/mailman/listinfo.cgi/users
>>>>>> Link to this
>>>>>> post: http://www.open-mpi.org/community/lists/users/2016/04/28986.php
>>>>>> 
>>>>> 
>>>>> _______________________________________________
>>>>> users mailing list
>>>>> us...@open-mpi.org <mailto:us...@open-mpi.org>
>>>>> Subscription: http://www.open-mpi.org/mailman/listinfo.cgi/users
>>>>> Link to this
>>>>> post: http://www.open-mpi.org/community/lists/users/2016/04/28987.php
>>>> 
>>>> 
>>>> 
>>>> _______________________________________________
>>>> users mailing list
>>>> us...@open-mpi.org
>>>> Subscription: http://www.open-mpi.org/mailman/listinfo.cgi/users
>>>> Link to this post: 
>>>> http://www.open-mpi.org/community/lists/users/2016/04/28988.php
>>>> 
>>> <openmpi-2.x_info.tar.gz>_______________________________________________
>>> users mailing list
>>> us...@open-mpi.org
>>> Subscription: http://www.open-mpi.org/mailman/listinfo.cgi/users
>>> Link to this post: 
>>> http://www.open-mpi.org/community/lists/users/2016/04/28989.php
>> 
>> _______________________________________________
>> users mailing list
>> us...@open-mpi.org
>> Subscription: http://www.open-mpi.org/mailman/listinfo.cgi/users
>> Link to this post: 
>> http://www.open-mpi.org/community/lists/users/2016/04/28990.php
>> 
> 
> _______________________________________________
> users mailing list
> us...@open-mpi.org
> Subscription: http://www.open-mpi.org/mailman/listinfo.cgi/users
> Link to this post: 
> http://www.open-mpi.org/community/lists/users/2016/04/28991.php
