Siegmar,

I will try to reproduce this on my Solaris 11 x86_64 VM.

In the meantime, can you please double-check that mca_pmix_pmix112.so is a
64-bit library?
(e.g., confirm that "-m64" was correctly passed to pmix)
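
A quick way to check (paths taken from the configure command you quoted
below; the exact config.log location for the embedded pmix is an assumption
on my part, so adjust to your build tree):

  # is the installed component a 64-bit ELF object ?
  file /usr/local/openmpi-2.0.0_64_gcc/lib64/openmpi/mca_pmix_pmix112.so
  # expect something like "ELF 64-bit ..." (SPARCV9 / AMD64 / x86-64
  # depending on the platform)

  # did -m64 reach the embedded pmix configure ?
  grep -e '-m64' \
    <build_dir>/opal/mca/pmix/pmix112/pmix/config.log | head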

Cheers,

Gilles

On Friday, April 22, 2016, Siegmar Gross <
siegmar.gr...@informatik.hs-fulda.de> wrote:

> Hi Ralph,
>
> I've already used "--enable-debug". "SYSTEM_ENV" is "SunOS" or
> "Linux" and "MACHINE_ENV" is "sparc" or "x86_64".
>
> mkdir openmpi-v2.x-dev-1280-gc110ae8-${SYSTEM_ENV}.${MACHINE_ENV}.64_gcc
> cd openmpi-v2.x-dev-1280-gc110ae8-${SYSTEM_ENV}.${MACHINE_ENV}.64_gcc
>
> ../openmpi-v2.x-dev-1280-gc110ae8/configure \
>   --prefix=/usr/local/openmpi-2.0.0_64_gcc \
>   --libdir=/usr/local/openmpi-2.0.0_64_gcc/lib64 \
>   --with-jdk-bindir=/usr/local/jdk1.8.0/bin \
>   --with-jdk-headers=/usr/local/jdk1.8.0/include \
>   JAVA_HOME=/usr/local/jdk1.8.0 \
>   LDFLAGS="-m64" CC="gcc" CXX="g++" FC="gfortran" \
>   CFLAGS="-m64" CXXFLAGS="-m64" FCFLAGS="-m64" \
>   CPP="cpp" CXXCPP="cpp" \
>   --enable-mpi-cxx \
>   --enable-cxx-exceptions \
>   --enable-mpi-java \
>   --enable-heterogeneous \
>   --enable-mpi-thread-multiple \
>   --with-hwloc=internal \
>   --without-verbs \
>   --with-wrapper-cflags="-std=c11 -m64" \
>   --with-wrapper-cxxflags="-m64" \
>   --with-wrapper-fcflags="-m64" \
>   --enable-debug \
>   |& tee log.configure.$SYSTEM_ENV.$MACHINE_ENV.64_gcc
>
>
> mkdir openmpi-v2.x-dev-1280-gc110ae8-${SYSTEM_ENV}.${MACHINE_ENV}.64_cc
> cd openmpi-v2.x-dev-1280-gc110ae8-${SYSTEM_ENV}.${MACHINE_ENV}.64_cc
>
> ../openmpi-v2.x-dev-1280-gc110ae8/configure \
>   --prefix=/usr/local/openmpi-2.0.0_64_cc \
>   --libdir=/usr/local/openmpi-2.0.0_64_cc/lib64 \
>   --with-jdk-bindir=/usr/local/jdk1.8.0/bin \
>   --with-jdk-headers=/usr/local/jdk1.8.0/include \
>   JAVA_HOME=/usr/local/jdk1.8.0 \
>   LDFLAGS="-m64" CC="cc" CXX="CC" FC="f95" \
>   CFLAGS="-m64" CXXFLAGS="-m64 -library=stlport4" FCFLAGS="-m64" \
>   CPP="cpp" CXXCPP="cpp" \
>   --enable-mpi-cxx \
>   --enable-cxx-exceptions \
>   --enable-mpi-java \
>   --enable-heterogeneous \
>   --enable-mpi-thread-multiple \
>   --with-hwloc=internal \
>   --without-verbs \
>   --with-wrapper-cflags="-m64" \
>   --with-wrapper-cxxflags="-m64 -library=stlport4" \
>   --with-wrapper-fcflags="-m64" \
>   --with-wrapper-ldflags="" \
>   --enable-debug \
>   |& tee log.configure.$SYSTEM_ENV.$MACHINE_ENV.64_cc
>
>
> Kind regards
>
> Siegmar
>
> On 21.04.2016 at 18:18, Ralph Castain wrote:
>
>> Can you please rebuild OMPI with --enable-debug in the configure cmd? It
>> will let us see more error output.
>>
>>
>> On Apr 21, 2016, at 8:52 AM, Siegmar Gross <
>>> siegmar.gr...@informatik.hs-fulda.de> wrote:
>>>
>>> Hi Ralph,
>>>
>>> I don't see any additional information.
>>>
>>> tyr hello_1 108 mpiexec -np 4 --host tyr,sunpc1,linpc1,ruester -mca
>>> mca_base_component_show_load_errors 1 hello_1_mpi
>>> [tyr.informatik.hs-fulda.de:06211] [[48741,0],0] ORTE_ERROR_LOG: Not
>>> found in file
>>> ../../../../../openmpi-v2.x-dev-1280-gc110ae8/orte/mca/ess/hnp/ess_hnp_module.c
>>> at line 638
>>>
>>> --------------------------------------------------------------------------
>>> It looks like orte_init failed for some reason; your parallel process is
>>> likely to abort.  There are many reasons that a parallel process can
>>> fail during orte_init; some of which are due to configuration or
>>> environment problems.  This failure appears to be an internal failure;
>>> here's some additional information (which may only be relevant to an
>>> Open MPI developer):
>>>
>>>  opal_pmix_base_select failed
>>>  --> Returned value Not found (-13) instead of ORTE_SUCCESS
>>>
>>> --------------------------------------------------------------------------
>>>
>>>
>>> tyr hello_1 109 mpiexec -np 4 --host tyr,sunpc1,linpc1,ruester -mca
>>> mca_base_component_show_load_errors 1 -mca pmix_base_verbose 10 -mca
>>> pmix_server_verbose 5 hello_1_mpi
>>> [tyr.informatik.hs-fulda.de:06212] mca: base: components_register:
>>> registering framework pmix components
>>> [tyr.informatik.hs-fulda.de:06212] mca: base: components_open: opening
>>> pmix components
>>> [tyr.informatik.hs-fulda.de:06212] mca:base:select: Auto-selecting pmix
>>> components
>>> [tyr.informatik.hs-fulda.de:06212] mca:base:select:( pmix) No component
>>> selected!
>>> [tyr.informatik.hs-fulda.de:06212] [[48738,0],0] ORTE_ERROR_LOG: Not
>>> found in file
>>> ../../../../../openmpi-v2.x-dev-1280-gc110ae8/orte/mca/ess/hnp/ess_hnp_module.c
>>> at line 638
>>>
>>> --------------------------------------------------------------------------
>>> It looks like orte_init failed for some reason; your parallel process is
>>> likely to abort.  There are many reasons that a parallel process can
>>> fail during orte_init; some of which are due to configuration or
>>> environment problems.  This failure appears to be an internal failure;
>>> here's some additional information (which may only be relevant to an
>>> Open MPI developer):
>>>
>>>  opal_pmix_base_select failed
>>>  --> Returned value Not found (-13) instead of ORTE_SUCCESS
>>>
>>> --------------------------------------------------------------------------
>>> tyr hello_1 110
>>>
>>>
>>> Kind regards
>>>
>>> Siegmar
>>>
>>>
>>> On 21.04.2016 at 17:24, Ralph Castain wrote:
>>>
>>>> Hmmm…it looks like you built the right components, but they are not
>>>> being picked up. Can you run your mpiexec command again, adding “-mca
>>>> mca_base_component_show_load_errors 1” to the cmd line?
>>>>
>>>>
>>>> On Apr 21, 2016, at 8:16 AM, Siegmar Gross <
>>>>> siegmar.gr...@informatik.hs-fulda.de> wrote:
>>>>>
>>>>> Hi Ralph,
>>>>>
>>>>> I have attached ompi_info output for both compilers from my
>>>>> sparc machine and the listings for both compilers from the
>>>>> <prefix>/lib/openmpi directories. Hopefully that helps to
>>>>> find the problem.
>>>>>
>>>>> hermes tmp 3 tar zvft openmpi-2.x_info.tar.gz
>>>>> -rw-r--r-- root/root     10969 2016-04-21 17:06
>>>>> ompi_info_SunOS_sparc_cc.txt
>>>>> -rw-r--r-- root/root     11044 2016-04-21 17:06
>>>>> ompi_info_SunOS_sparc_gcc.txt
>>>>> -rw-r--r-- root/root     71252 2016-04-21 17:02 lib64_openmpi.txt
>>>>> hermes tmp 4
>>>>>
>>>>>
>>>>> Kind regards and thank you very much once more for your help
>>>>>
>>>>> Siegmar
>>>>>
>>>>>
>>>>> On 21.04.2016 at 15:54, Ralph Castain wrote:
>>>>>
>>>>>> Odd - it would appear that none of the pmix components built? Can you
>>>>>> send
>>>>>> along the output from ompi_info? Or just send a listing of the files
>>>>>> in the
>>>>>> <prefix>/lib/openmpi directory?
>>>>>>
>>>>>>
>>>>>> On Apr 21, 2016, at 1:27 AM, Siegmar Gross
>>>>>>> <siegmar.gr...@informatik.hs-fulda.de> wrote:
>>>>>>>
>>>>>>> Hi Ralph,
>>>>>>>
>>>>>>> On 21.04.2016 at 00:18, Ralph Castain wrote:
>>>>>>>
>>>>>>>> Could you please rerun these test and add “-mca pmix_base_verbose 10
>>>>>>>> -mca pmix_server_verbose 5” to your cmd line? I need to see why the
>>>>>>>> pmix components failed.
>>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> tyr spawn 111 mpiexec -np 1 --host tyr,sunpc1,linpc1,ruester -mca
>>>>>>> pmix_base_verbose 10 -mca pmix_server_verbose 5 spawn_multiple_master
>>>>>>> [tyr.informatik.hs-fulda.de:26652] mca: base: components_register:
>>>>>>> registering framework pmix components
>>>>>>> [tyr.informatik.hs-fulda.de:26652] mca: base: components_open: opening
>>>>>>> pmix components
>>>>>>> [tyr.informatik.hs-fulda.de:26652] mca:base:select: Auto-selecting pmix
>>>>>>> components
>>>>>>> [tyr.informatik.hs-fulda.de:26652] mca:base:select:( pmix) No component
>>>>>>> selected!
>>>>>>> [tyr.informatik.hs-fulda.de:26652] [[52794,0],0] ORTE_ERROR_LOG: Not
>>>>>>> found in file
>>>>>>> ../../../../../openmpi-v2.x-dev-1280-gc110ae8/orte/mca/ess/hnp/ess_hnp_module.c
>>>>>>> at line 638
>>>>>>>
>>>>>>> --------------------------------------------------------------------------
>>>>>>> It looks like orte_init failed for some reason; your parallel
>>>>>>> process is
>>>>>>> likely to abort.  There are many reasons that a parallel process can
>>>>>>> fail during orte_init; some of which are due to configuration or
>>>>>>> environment problems.  This failure appears to be an internal
>>>>>>> failure;
>>>>>>> here's some additional information (which may only be relevant to an
>>>>>>> Open MPI developer):
>>>>>>>
>>>>>>> opal_pmix_base_select failed
>>>>>>> --> Returned value Not found (-13) instead of ORTE_SUCCESS
>>>>>>>
>>>>>>> --------------------------------------------------------------------------
>>>>>>> tyr spawn 112
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> tyr hello_1 116 mpiexec -np 1 --host tyr,sunpc1,linpc1,ruester -mca
>>>>>>> pmix_base_verbose 10 -mca pmix_server_verbose 5 hello_1_mpi
>>>>>>> [tyr.informatik.hs-fulda.de:27261] mca: base: components_register:
>>>>>>> registering framework pmix components
>>>>>>> [tyr.informatik.hs-fulda.de:27261] mca: base: components_open: opening
>>>>>>> pmix components
>>>>>>> [tyr.informatik.hs-fulda.de:27261] mca:base:select: Auto-selecting pmix
>>>>>>> components
>>>>>>> [tyr.informatik.hs-fulda.de:27261] mca:base:select:( pmix) No component
>>>>>>> selected!
>>>>>>> [tyr.informatik.hs-fulda.de:27261] [[52315,0],0] ORTE_ERROR_LOG: Not
>>>>>>> found in file
>>>>>>> ../../../../../openmpi-v2.x-dev-1280-gc110ae8/orte/mca/ess/hnp/ess_hnp_module.c
>>>>>>> at line 638
>>>>>>>
>>>>>>> --------------------------------------------------------------------------
>>>>>>> It looks like orte_init failed for some reason; your parallel
>>>>>>> process is
>>>>>>> likely to abort.  There are many reasons that a parallel process can
>>>>>>> fail during orte_init; some of which are due to configuration or
>>>>>>> environment problems.  This failure appears to be an internal
>>>>>>> failure;
>>>>>>> here's some additional information (which may only be relevant to an
>>>>>>> Open MPI developer):
>>>>>>>
>>>>>>> opal_pmix_base_select failed
>>>>>>> --> Returned value Not found (-13) instead of ORTE_SUCCESS
>>>>>>>
>>>>>>> --------------------------------------------------------------------------
>>>>>>> tyr hello_1 117
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> Thank you very much for your help.
>>>>>>>
>>>>>>>
>>>>>>> Kind regards
>>>>>>>
>>>>>>> Siegmar
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>>> Thanks
>>>>>>>> Ralph
>>>>>>>>
>>>>>>>> On Apr 20, 2016, at 10:12 AM, Siegmar Gross
>>>>>>>>> <siegmar.gr...@informatik.hs-fulda.de> wrote:
>>>>>>>>>
>>>>>>>>> Hi,
>>>>>>>>>
>>>>>>>>> I have built openmpi-v2.x-dev-1280-gc110ae8 on my machines
>>>>>>>>> (Solaris 10 Sparc, Solaris 10 x86_64, and openSUSE Linux
>>>>>>>>> 12.1 x86_64) with gcc-5.1.0 and Sun C 5.13. Unfortunately I get
>>>>>>>>> runtime errors for some programs.
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> Sun C 5.13:
>>>>>>>>> ===========
>>>>>>>>>
>>>>>>>>> For all my test programs I get the same error on Solaris Sparc and
>>>>>>>>> Solaris x86_64, while the programs work fine on Linux.
>>>>>>>>>
>>>>>>>>> tyr hello_1 115 mpiexec -np 2 hello_1_mpi
>>>>>>>>> [tyr.informatik.hs-fulda.de:22373] [[61763,0],0] ORTE_ERROR_LOG: Not
>>>>>>>>> found in file
>>>>>>>>> ../../../../../openmpi-v2.x-dev-1280-gc110ae8/orte/mca/ess/hnp/ess_hnp_module.c
>>>>>>>>> at line 638
>>>>>>>>>
>>>>>>>>> --------------------------------------------------------------------------
>>>>>>>>> It looks like orte_init failed for some reason; your parallel
>>>>>>>>> process is
>>>>>>>>> likely to abort.  There are many reasons that a parallel process
>>>>>>>>> can
>>>>>>>>> fail during orte_init; some of which are due to configuration or
>>>>>>>>> environment problems.  This failure appears to be an internal
>>>>>>>>> failure;
>>>>>>>>> here's some additional information (which may only be relevant to
>>>>>>>>> an
>>>>>>>>> Open MPI developer):
>>>>>>>>>
>>>>>>>>> opal_pmix_base_select failed
>>>>>>>>> --> Returned value Not found (-13) instead of ORTE_SUCCESS
>>>>>>>>>
>>>>>>>>> --------------------------------------------------------------------------
>>>>>>>>> tyr hello_1 116
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> GCC-5.1.0:
>>>>>>>>> ==========
>>>>>>>>>
>>>>>>>>> tyr spawn 121 mpiexec -np 1 --host tyr,sunpc1,linpc1,ruester
>>>>>>>>> spawn_multiple_master
>>>>>>>>>
>>>>>>>>> Parent process 0 running on tyr.informatik.hs-fulda.de
>>>>>>>>> I create 3 slave processes.
>>>>>>>>>
>>>>>>>>> [tyr.informatik.hs-fulda.de:25366] PMIX ERROR: UNPACK-PAST-END in file
>>>>>>>>> ../../../../../../openmpi-v2.x-dev-1280-gc110ae8/opal/mca/pmix/pmix112/pmix/src/server/pmix_server_ops.c
>>>>>>>>> at line 829
>>>>>>>>> [tyr.informatik.hs-fulda.de:25366] PMIX ERROR: UNPACK-PAST-END in file
>>>>>>>>> ../../../../../../openmpi-v2.x-dev-1280-gc110ae8/opal/mca/pmix/pmix112/pmix/src/server/pmix_server.c
>>>>>>>>> at line 2176
>>>>>>>>> [tyr:25377] *** An error occurred in MPI_Comm_spawn_multiple
>>>>>>>>> [tyr:25377] *** reported by process [3308257281,0]
>>>>>>>>> [tyr:25377] *** on communicator MPI_COMM_WORLD
>>>>>>>>> [tyr:25377] *** MPI_ERR_SPAWN: could not spawn processes
>>>>>>>>> [tyr:25377] *** MPI_ERRORS_ARE_FATAL (processes in this
>>>>>>>>> communicator will
>>>>>>>>> now abort,
>>>>>>>>> [tyr:25377] ***    and potentially your MPI job)
>>>>>>>>> tyr spawn 122
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> I would be grateful if somebody can fix the problems. Thank you
>>>>>>>>> very
>>>>>>>>> much for any help in advance.
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> Kind regards
>>>>>>>>>
>>>>>>>>> Siegmar
>>>>>>>>>
>>>>>>>>> <hello_1_mpi.c><spawn_multiple_master.c>
>>>>>>>>
>>>>>>
>>>>>>
>>>>>>
>>>>>>
>>>>> <openmpi-2.x_info.tar.gz>
>>>>
>>>>
>>
