Siegmar,

I will try to reproduce this on my Solaris 11 x86_64 VM.

In the meantime, can you please double-check that mca_pmix_pmix112.so is a 64-bit library? (e.g., confirm that "-m64" was correctly passed to pmix.)
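For example, something like the following should show whether that component (and everything else in the same directory) was really built 64-bit. The paths are only a guess based on the --prefix/--libdir values from your gcc configure line below; adjust them for the Sun cc build, and note the exact component file name may differ in your install:

  # report the ELF class of the pmix component (should say "ELF 64-bit")
  file /usr/local/openmpi-2.0.0_64_gcc/lib64/openmpi/mca_pmix_pmix112.so

  # or inspect the ELF header directly (should report ELFCLASS64)
  elfdump -e /usr/local/openmpi-2.0.0_64_gcc/lib64/openmpi/mca_pmix_pmix112.so | grep ei_class

  # list any component in the directory that is *not* 64-bit
  file /usr/local/openmpi-2.0.0_64_gcc/lib64/openmpi/*.so | grep -v '64-bit'

If that last command prints anything, those components were built 32-bit, which would be consistent with the pmix component not being selectable.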
Cheers,

Gilles

On Friday, April 22, 2016, Siegmar Gross <siegmar.gr...@informatik.hs-fulda.de> wrote:

> Hi Ralph,
>
> I've already used "--enable-debug". "SYSTEM_ENV" is "SunOS" or
> "Linux" and "MACHINE_ENV" is "sparc" or "x86_64".
>
> mkdir openmpi-v2.x-dev-1280-gc110ae8-${SYSTEM_ENV}.${MACHINE_ENV}.64_gcc
> cd openmpi-v2.x-dev-1280-gc110ae8-${SYSTEM_ENV}.${MACHINE_ENV}.64_gcc
>
> ../openmpi-v2.x-dev-1280-gc110ae8/configure \
> --prefix=/usr/local/openmpi-2.0.0_64_gcc \
> --libdir=/usr/local/openmpi-2.0.0_64_gcc/lib64 \
> --with-jdk-bindir=/usr/local/jdk1.8.0/bin \
> --with-jdk-headers=/usr/local/jdk1.8.0/include \
> JAVA_HOME=/usr/local/jdk1.8.0 \
> LDFLAGS="-m64" CC="gcc" CXX="g++" FC="gfortran" \
> CFLAGS="-m64" CXXFLAGS="-m64" FCFLAGS="-m64" \
> CPP="cpp" CXXCPP="cpp" \
> --enable-mpi-cxx \
> --enable-cxx-exceptions \
> --enable-mpi-java \
> --enable-heterogeneous \
> --enable-mpi-thread-multiple \
> --with-hwloc=internal \
> --without-verbs \
> --with-wrapper-cflags="-std=c11 -m64" \
> --with-wrapper-cxxflags="-m64" \
> --with-wrapper-fcflags="-m64" \
> --enable-debug \
> |& tee log.configure.$SYSTEM_ENV.$MACHINE_ENV.64_gcc
>
>
> mkdir openmpi-v2.x-dev-1280-gc110ae8-${SYSTEM_ENV}.${MACHINE_ENV}.64_cc
> cd openmpi-v2.x-dev-1280-gc110ae8-${SYSTEM_ENV}.${MACHINE_ENV}.64_cc
>
> ../openmpi-v2.x-dev-1280-gc110ae8/configure \
> --prefix=/usr/local/openmpi-2.0.0_64_cc \
> --libdir=/usr/local/openmpi-2.0.0_64_cc/lib64 \
> --with-jdk-bindir=/usr/local/jdk1.8.0/bin \
> --with-jdk-headers=/usr/local/jdk1.8.0/include \
> JAVA_HOME=/usr/local/jdk1.8.0 \
> LDFLAGS="-m64" CC="cc" CXX="CC" FC="f95" \
> CFLAGS="-m64" CXXFLAGS="-m64 -library=stlport4" FCFLAGS="-m64" \
> CPP="cpp" CXXCPP="cpp" \
> --enable-mpi-cxx \
> --enable-cxx-exceptions \
> --enable-mpi-java \
> --enable-heterogeneous \
> --enable-mpi-thread-multiple \
> --with-hwloc=internal \
> --without-verbs \
> --with-wrapper-cflags="-m64" \
> --with-wrapper-cxxflags="-m64 -library=stlport4" \
> --with-wrapper-fcflags="-m64" \
> --with-wrapper-ldflags="" \
> --enable-debug \
> |& tee log.configure.$SYSTEM_ENV.$MACHINE_ENV.64_cc
>
>
> Kind regards
>
> Siegmar
>
> On 21.04.2016 18:18, Ralph Castain wrote:
>
>> Can you please rebuild OMPI with --enable-debug in the configure cmd? It
>> will let us see more error output.
>>
>>> On Apr 21, 2016, at 8:52 AM, Siegmar Gross
>>> <siegmar.gr...@informatik.hs-fulda.de> wrote:
>>>
>>> Hi Ralph,
>>>
>>> I don't see any additional information.
>>>
>>> tyr hello_1 108 mpiexec -np 4 --host tyr,sunpc1,linpc1,ruester -mca mca_base_component_show_load_errors 1 hello_1_mpi
>>> [tyr.informatik.hs-fulda.de:06211] [[48741,0],0] ORTE_ERROR_LOG: Not found in file
>>> ../../../../../openmpi-v2.x-dev-1280-gc110ae8/orte/mca/ess/hnp/ess_hnp_module.c at line 638
>>> --------------------------------------------------------------------------
>>> It looks like orte_init failed for some reason; your parallel process is
>>> likely to abort. There are many reasons that a parallel process can
>>> fail during orte_init; some of which are due to configuration or
>>> environment problems. This failure appears to be an internal failure;
>>> here's some additional information (which may only be relevant to an
>>> Open MPI developer):
>>>
>>>   opal_pmix_base_select failed
>>>   --> Returned value Not found (-13) instead of ORTE_SUCCESS
>>> --------------------------------------------------------------------------
>>>
>>> tyr hello_1 109 mpiexec -np 4 --host tyr,sunpc1,linpc1,ruester -mca mca_base_component_show_load_errors 1 -mca pmix_base_verbose 10 -mca pmix_server_verbose 5 hello_1_mpi
>>> [tyr.informatik.hs-fulda.de:06212] mca: base: components_register: registering framework pmix components
>>> [tyr.informatik.hs-fulda.de:06212] mca: base: components_open: opening pmix components
>>> [tyr.informatik.hs-fulda.de:06212] mca:base:select: Auto-selecting pmix components
>>> [tyr.informatik.hs-fulda.de:06212] mca:base:select:( pmix) No component selected!
>>> [tyr.informatik.hs-fulda.de:06212] [[48738,0],0] ORTE_ERROR_LOG: Not found in file
>>> ../../../../../openmpi-v2.x-dev-1280-gc110ae8/orte/mca/ess/hnp/ess_hnp_module.c at line 638
>>> --------------------------------------------------------------------------
>>> It looks like orte_init failed for some reason; your parallel process is
>>> likely to abort. There are many reasons that a parallel process can
>>> fail during orte_init; some of which are due to configuration or
>>> environment problems. This failure appears to be an internal failure;
>>> here's some additional information (which may only be relevant to an
>>> Open MPI developer):
>>>
>>>   opal_pmix_base_select failed
>>>   --> Returned value Not found (-13) instead of ORTE_SUCCESS
>>> --------------------------------------------------------------------------
>>> tyr hello_1 110
>>>
>>> Kind regards
>>>
>>> Siegmar
>>>
>>> On 21.04.2016 17:24, Ralph Castain wrote:
>>>
>>>> Hmmm… it looks like you built the right components, but they are not
>>>> being picked up. Can you run your mpiexec command again, adding “-mca
>>>> mca_base_component_show_load_errors 1” to the cmd line?
>>>>
>>>>> On Apr 21, 2016, at 8:16 AM, Siegmar Gross
>>>>> <siegmar.gr...@informatik.hs-fulda.de> wrote:
>>>>>
>>>>> Hi Ralph,
>>>>>
>>>>> I have attached ompi_info output for both compilers from my
>>>>> sparc machine and the listings for both compilers from the
>>>>> <prefix>/lib/openmpi directories. Hopefully that helps to
>>>>> find the problem.
>>>>>
>>>>> hermes tmp 3 tar zvft openmpi-2.x_info.tar.gz
>>>>> -rw-r--r-- root/root 10969 2016-04-21 17:06 ompi_info_SunOS_sparc_cc.txt
>>>>> -rw-r--r-- root/root 11044 2016-04-21 17:06 ompi_info_SunOS_sparc_gcc.txt
>>>>> -rw-r--r-- root/root 71252 2016-04-21 17:02 lib64_openmpi.txt
>>>>> hermes tmp 4
>>>>>
>>>>> Kind regards and thank you very much once more for your help
>>>>>
>>>>> Siegmar
>>>>>
>>>>> On 21.04.2016 15:54, Ralph Castain wrote:
>>>>>
>>>>>> Odd - it would appear that none of the pmix components built? Can you
>>>>>> send along the output from ompi_info? Or just send a listing of the
>>>>>> files in the <prefix>/lib/openmpi directory?
>>>>>>
>>>>>>> On Apr 21, 2016, at 1:27 AM, Siegmar Gross
>>>>>>> <siegmar.gr...@informatik.hs-fulda.de> wrote:
>>>>>>>
>>>>>>> Hi Ralph,
>>>>>>>
>>>>>>> On 21.04.2016 00:18, Ralph Castain wrote:
>>>>>>>
>>>>>>>> Could you please rerun these tests and add “-mca pmix_base_verbose 10
>>>>>>>> -mca pmix_server_verbose 5” to your cmd line?
>>>>>>>> I need to see why the pmix components failed.
>>>>>>>
>>>>>>> tyr spawn 111 mpiexec -np 1 --host tyr,sunpc1,linpc1,ruester -mca pmix_base_verbose 10 -mca pmix_server_verbose 5 spawn_multiple_master
>>>>>>> [tyr.informatik.hs-fulda.de:26652] mca: base: components_register: registering framework pmix components
>>>>>>> [tyr.informatik.hs-fulda.de:26652] mca: base: components_open: opening pmix components
>>>>>>> [tyr.informatik.hs-fulda.de:26652] mca:base:select: Auto-selecting pmix components
>>>>>>> [tyr.informatik.hs-fulda.de:26652] mca:base:select:( pmix) No component selected!
>>>>>>> [tyr.informatik.hs-fulda.de:26652] [[52794,0],0] ORTE_ERROR_LOG: Not found in file
>>>>>>> ../../../../../openmpi-v2.x-dev-1280-gc110ae8/orte/mca/ess/hnp/ess_hnp_module.c at line 638
>>>>>>> --------------------------------------------------------------------------
>>>>>>> It looks like orte_init failed for some reason; your parallel process is
>>>>>>> likely to abort. There are many reasons that a parallel process can
>>>>>>> fail during orte_init; some of which are due to configuration or
>>>>>>> environment problems. This failure appears to be an internal failure;
>>>>>>> here's some additional information (which may only be relevant to an
>>>>>>> Open MPI developer):
>>>>>>>
>>>>>>>   opal_pmix_base_select failed
>>>>>>>   --> Returned value Not found (-13) instead of ORTE_SUCCESS
>>>>>>> --------------------------------------------------------------------------
>>>>>>> tyr spawn 112
>>>>>>>
>>>>>>> tyr hello_1 116 mpiexec -np 1 --host tyr,sunpc1,linpc1,ruester -mca pmix_base_verbose 10 -mca pmix_server_verbose 5 hello_1_mpi
>>>>>>> [tyr.informatik.hs-fulda.de:27261] mca: base: components_register: registering framework pmix components
>>>>>>> [tyr.informatik.hs-fulda.de:27261] mca: base: components_open: opening pmix components
>>>>>>> [tyr.informatik.hs-fulda.de:27261] mca:base:select: Auto-selecting pmix components
>>>>>>> [tyr.informatik.hs-fulda.de:27261] mca:base:select:( pmix) No component selected!
>>>>>>> [tyr.informatik.hs-fulda.de:27261] [[52315,0],0] ORTE_ERROR_LOG: Not found in file
>>>>>>> ../../../../../openmpi-v2.x-dev-1280-gc110ae8/orte/mca/ess/hnp/ess_hnp_module.c at line 638
>>>>>>> --------------------------------------------------------------------------
>>>>>>> It looks like orte_init failed for some reason; your parallel process is
>>>>>>> likely to abort. There are many reasons that a parallel process can
>>>>>>> fail during orte_init; some of which are due to configuration or
>>>>>>> environment problems. This failure appears to be an internal failure;
>>>>>>> here's some additional information (which may only be relevant to an
>>>>>>> Open MPI developer):
>>>>>>>
>>>>>>>   opal_pmix_base_select failed
>>>>>>>   --> Returned value Not found (-13) instead of ORTE_SUCCESS
>>>>>>> --------------------------------------------------------------------------
>>>>>>> tyr hello_1 117
>>>>>>>
>>>>>>> Thank you very much for your help.
>>>>>>>
>>>>>>> Kind regards
>>>>>>>
>>>>>>> Siegmar
>>>>>>>
>>>>>>>> Thanks
>>>>>>>> Ralph
>>>>>>>>
>>>>>>>>> On Apr 20, 2016, at 10:12 AM, Siegmar Gross
>>>>>>>>> <siegmar.gr...@informatik.hs-fulda.de> wrote:
>>>>>>>>>
>>>>>>>>> Hi,
>>>>>>>>>
>>>>>>>>> I have built openmpi-v2.x-dev-1280-gc110ae8 on my machines
>>>>>>>>> (Solaris 10 Sparc, Solaris 10 x86_64, and openSUSE Linux
>>>>>>>>> 12.1 x86_64) with gcc-5.1.0 and Sun C 5.13. Unfortunately I get
>>>>>>>>> runtime errors for some programs.
>>>>>>>>>
>>>>>>>>> Sun C 5.13:
>>>>>>>>> ===========
>>>>>>>>>
>>>>>>>>> For all my test programs I get the same error on Solaris Sparc and
>>>>>>>>> Solaris x86_64, while the programs work fine on Linux.
>>>>>>>>>
>>>>>>>>> tyr hello_1 115 mpiexec -np 2 hello_1_mpi
>>>>>>>>> [tyr.informatik.hs-fulda.de:22373] [[61763,0],0] ORTE_ERROR_LOG: Not found in file
>>>>>>>>> ../../../../../openmpi-v2.x-dev-1280-gc110ae8/orte/mca/ess/hnp/ess_hnp_module.c at line 638
>>>>>>>>> --------------------------------------------------------------------------
>>>>>>>>> It looks like orte_init failed for some reason; your parallel process is
>>>>>>>>> likely to abort. There are many reasons that a parallel process can
>>>>>>>>> fail during orte_init; some of which are due to configuration or
>>>>>>>>> environment problems. This failure appears to be an internal failure;
>>>>>>>>> here's some additional information (which may only be relevant to an
>>>>>>>>> Open MPI developer):
>>>>>>>>>
>>>>>>>>>   opal_pmix_base_select failed
>>>>>>>>>   --> Returned value Not found (-13) instead of ORTE_SUCCESS
>>>>>>>>> --------------------------------------------------------------------------
>>>>>>>>> tyr hello_1 116
>>>>>>>>>
>>>>>>>>> GCC-5.1.0:
>>>>>>>>> ==========
>>>>>>>>>
>>>>>>>>> tyr spawn 121 mpiexec -np 1 --host tyr,sunpc1,linpc1,ruester spawn_multiple_master
>>>>>>>>>
>>>>>>>>> Parent process 0 running on tyr.informatik.hs-fulda.de
>>>>>>>>> I create 3 slave processes.
>>>>>>>>>
>>>>>>>>> [tyr.informatik.hs-fulda.de:25366] PMIX ERROR: UNPACK-PAST-END in file
>>>>>>>>> ../../../../../../openmpi-v2.x-dev-1280-gc110ae8/opal/mca/pmix/pmix112/pmix/src/server/pmix_server_ops.c at line 829
>>>>>>>>> [tyr.informatik.hs-fulda.de:25366] PMIX ERROR: UNPACK-PAST-END in file
>>>>>>>>> ../../../../../../openmpi-v2.x-dev-1280-gc110ae8/opal/mca/pmix/pmix112/pmix/src/server/pmix_server.c at line 2176
>>>>>>>>> [tyr:25377] *** An error occurred in MPI_Comm_spawn_multiple
>>>>>>>>> [tyr:25377] *** reported by process [3308257281,0]
>>>>>>>>> [tyr:25377] *** on communicator MPI_COMM_WORLD
>>>>>>>>> [tyr:25377] *** MPI_ERR_SPAWN: could not spawn processes
>>>>>>>>> [tyr:25377] *** MPI_ERRORS_ARE_FATAL (processes in this communicator will now abort,
>>>>>>>>> [tyr:25377] *** and potentially your MPI job)
>>>>>>>>> tyr spawn 122
>>>>>>>>>
>>>>>>>>> I would be grateful if somebody can fix the problems. Thank you very
>>>>>>>>> much for any help in advance.
>>>>>>>>>
>>>>>>>>> Kind regards
>>>>>>>>>
>>>>>>>>> Siegmar
>>>>>>>>>
>>>>>>>>> <hello_1_mpi.c><spawn_multiple_master.c>