Re: [OMPI users] runtime errors for openmpi-v2.x-dev-1280-gc110ae8
Siegmar,

here is the error :

configure:17969: cc -o conftest -m64 -D_REENTRANT -g -g
-I/export2/src/openmpi-2.0.0/openmpi-v2.x-dev-1290-gbd0e4e1
-I/export2/src/openmpi-2.0.0/openmpi-v2.x-dev-1290-gbd0e4e1-SunOS.sparc.64_cc
-I/export2/src/openmpi-2.0.0/openmpi-v2.x-dev-1290-gbd0e4e1/opal/include
-I/export2/src/openmpi-2.0.0/openmpi-v2.x-dev-1290-gbd0e4e1-SunOS.sparc.64_cc/opal/include
-D_REENTRANT
-I/export2/src/openmpi-2.0.0/openmpi-v2.x-dev-1290-gbd0e4e1/opal/mca/hwloc/hwloc1112/hwloc/include
-I/export2/src/openmpi-2.0.0/openmpi-v2.x-dev-1290-gbd0e4e1-SunOS.sparc.64_cc/opal/mca/hwloc/hwloc1112/hwloc/include
-I/export2/src/openmpi-2.0.0/openmpi-v2.x-dev-1290-gbd0e4e1/opal/mca/event/libevent2022/libevent
-I/export2/src/openmpi-2.0.0/openmpi-v2.x-dev-1290-gbd0e4e1/opal/mca/event/libevent2022/libevent/include
-I/export2/src/openmpi-2.0.0/openmpi-v2.x-dev-1290-gbd0e4e1-SunOS.sparc.64_cc/opal/mca/event/libevent2022/libevent/include
-m64 conftest.c >&5
"/usr/include/stdbool.h", line 42: #error: "Use of <stdbool.h> is valid only in a c99 compilation environment."

i cannot reproduce this on solaris 11 with oracle studio 5.3 compiler, and i do not have solaris 10 yet.

could you please re-configure with '-std=c99' appended to your CFLAGS and see if it helps ?

Cheers,

Gilles

On 4/26/2016 7:57 PM, Siegmar Gross wrote:
> Hi Gilles and Ralph,
>
> I was able to sort out my mess. In my last email I compared the
> files from "SunOS_sparc/openmpi-2.0.0_64_gcc/lib64/openmpi" from
> the attachment of my email to Ralph with the files from
> "SunOS_sparc/openmpi-2.0.0_64_cc/lib64/openmpi" from my current
> file system. That is why I saw different timestamps. The other
> problem was that Ralph didn't notice that "mca_pmix_pmix112.so"
> wasn't built on Solaris with the Sun C compiler. I've removed most
> of the files from the attachment of my email so that it is easier
> to see the relevant files. Below I try to give you more information
> that may be relevant to track down the problem.
> I still get an error running one of my small test programs when
> I use my gcc-version of Open MPI. "mca_pmix_pmix112.so" is a
> 64-bit library.
>
> Linux_x86_64/openmpi-2.0.0_64_cc/lib64/openmpi:
> ...
> -rwxr-xr-x 1 root root  261327 Apr 19 16:46 mca_plm_slurm.so
> -rwxr-xr-x 1 root root    1002 Apr 19 16:45 mca_pmix_pmix112.la
> -rwxr-xr-x 1 root root 3906526 Apr 19 16:45 mca_pmix_pmix112.so
> -rwxr-xr-x 1 root root     966 Apr 19 16:51 mca_pml_cm.la
> -rwxr-xr-x 1 root root 1574265 Apr 19 16:51 mca_pml_cm.so
> ...
>
> Linux_x86_64/openmpi-2.0.0_64_gcc/lib64/openmpi:
> ...
> -rwxr-xr-x 1 root root   70371 Apr 19 16:43 mca_plm_slurm.so
> -rwxr-xr-x 1 root root    1008 Apr 19 16:42 mca_pmix_pmix112.la
> -rwxr-xr-x 1 root root 1029005 Apr 19 16:42 mca_pmix_pmix112.so
> -rwxr-xr-x 1 root root     972 Apr 19 16:46 mca_pml_cm.la
> -rwxr-xr-x 1 root root  284858 Apr 19 16:46 mca_pml_cm.so
> ...
>
> SunOS_sparc/openmpi-2.0.0_64_cc/lib64/openmpi:
> ...
> -rwxr-xr-x 1 root root  319816 Apr 19 19:58 mca_plm_rsh.so
> -rwxr-xr-x 1 root root     970 Apr 19 20:00 mca_pml_cm.la
> -rwxr-xr-x 1 root root 1507440 Apr 19 20:00 mca_pml_cm.so
> ...
>
> SunOS_sparc/openmpi-2.0.0_64_gcc/lib64/openmpi:
> ...
> -rwxr-xr-x 1 root root  153280 Apr 19 19:49 mca_plm_rsh.so
> -rwxr-xr-x 1 root root    1007 Apr 19 19:47 mca_pmix_pmix112.la
> -rwxr-xr-x 1 root root 1400512 Apr 19 19:47 mca_pmix_pmix112.so
> -rwxr-xr-x 1 root root     971 Apr 19 19:52 mca_pml_cm.la
> -rwxr-xr-x 1 root root  342440 Apr 19 19:52 mca_pml_cm.so
> ...
>
> SunOS_x86_64/openmpi-2.0.0_64_cc/lib64/openmpi:
> ...
> -rwxr-xr-x 1 root root  300096 Apr 19 17:18 mca_plm_rsh.so
> -rwxr-xr-x 1 root root     970 Apr 19 17:23 mca_pml_cm.la
> -rwxr-xr-x 1 root root 1458816 Apr 19 17:23 mca_pml_cm.so
> ...
>
> SunOS_x86_64/openmpi-2.0.0_64_gcc/lib64/openmpi:
> ...
> -rwxr-xr-x 1 root root  133096 Apr 19 17:42 mca_plm_rsh.so
> -rwxr-xr-x 1 root root    1007 Apr 19 17:41 mca_pmix_pmix112.la
> -rwxr-xr-x 1 root root 1320240 Apr 19 17:41 mca_pmix_pmix112.so
> -rwxr-xr-x 1 root root     971 Apr 19 17:46 mca_pml_cm.la
> -rwxr-xr-x 1 root root  419848 Apr 19 17:46 mca_pml_cm.so
> ...
>
> Yesterday I installed openmpi-v2.x-dev-1290-gbd0e4e1 so that we
> have a current version for the investigation of the problem. Once
> more, mca_pmix_pmix112.so isn't available on Solaris if I use the
> Sun C compiler.
>
> "config.log" for gcc-5.1.0 shows the following.
>
> ...
> configure:127799: /bin/bash '../../../../../../openmpi-v2.x-dev-1290-gbd0e4e1/opal/mca/pmix/pmix112/pmix/configure' succeeded for opal/mca/pmix/pmix112/pmix
> configure:127916: checking if MCA component pmix:pmix112 can compile
> configure:127918: result: yes
> configure:5637: --- MCA component pmix:external (m4 configuration macro)
> configure:128523: checking for MCA component pmix:external compile mode
> configure:128529: result: dso
> configure:129054: checking if MCA component pmix:external can compile
> configure:129056: result: no
> ...
> config.status:3897: creating opal/mca/pmix/Makefile
> config.status:3897: creating opal/mca/pmix/s1/Makefile
> config.status:38
Re: [OMPI users] runtime errors for openmpi-v2.x-dev-1280-gc110ae8
Hi Gilles,

adding "-std=c99" to CFLAGS solves the problem with the missing library. Shall I add it permanently to my configure command, or will you add it, so that I will not run into problems if you need the C11 standard later?

"spawn_multiple_master" breaks with the same error that I reported yesterday for my gcc-version of Open MPI. Hopefully you can solve that problem as well.

Kind regards and thank you very much for your help

Siegmar

On 27.04.2016 at 08:05, Gilles Gouaillardet wrote:
> could you please re-configure with '-std=c99' appended to your CFLAGS
> and see if it helps ?
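
The header error quoted earlier in this thread can be probed outside of configure with a few lines of C. The following is only a minimal sketch under stated assumptions: it is not Open MPI's actual conftest.c, and the function names (c_standard_version, compiled_as_c99, require_c99) are made up for illustration. On a system whose <stdbool.h> rejects non-C99 compilations, the #include alone would be expected to fail unless a C99 flag such as the '-std=c99' used in this thread is part of CFLAGS.

```c
/* Hypothetical C99 probe; names are illustrative, not taken from Open MPI. */
#include <assert.h>
#include <stdbool.h>   /* the header that raised the #error in this thread */

/* Return the C standard version the compiler reports, or 0 if it
 * does not define __STDC_VERSION__ at all (plain C89/C90 mode). */
long c_standard_version(void)
{
#ifdef __STDC_VERSION__
    return __STDC_VERSION__;
#else
    return 0L;
#endif
}

/* True when this translation unit was compiled as C99 or newer. */
bool compiled_as_c99(void)
{
    return c_standard_version() >= 199901L;
}

/* Abort early if the build is not a C99 (or newer) compilation. */
void require_c99(void)
{
    assert(compiled_as_c99());
}
```

Calling require_c99() at the start of a small test program is one way to check that the flags passed via CFLAGS really put the compiler into a C99 environment before a longer build is started.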
[OMPI users] runtime errors for openmpi-v2.x-dev-1280-gc110ae8
Siegmar,

please add this to your CFLAGS for the time being.

configure tries to detect which flags must be added for C99 support, and it seems the test is not working for Solaris 10 and Oracle compilers. this is no longer a widely used environment, and I am not sure I can find the time to fix this in the near future.

regarding the runtime issue, can you please describe your 4 hosts (os, endianness and bitness) ?

Cheers,

Gilles

On Wednesday, April 27, 2016, Siegmar Gross <siegmar.gr...@informatik.hs-fulda.de> wrote:
> adding "-std=c99" to CFLAGS solves the problem with the missing library.
> Shall I add it permanently to my configure command or will you add it,
> so that I will not run into problems if you need the C11 standard later?
>
> "spawn_multiple_master" breaks with the same error that I reported
> yesterday for my gcc-version of Open MPI.
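
The endianness and bitness asked about here can be read off on each host with a small C program. This is a rough sketch, not something posted in the thread; the helper names (is_little_endian, pointer_bits, describe_host) are hypothetical. The expectation, which the sketch does not assert, is that a SPARC host reports big-endian and the x86_64 hosts report little-endian.

```c
/* Hypothetical host-description helpers; names are illustrative only. */
#include <assert.h>
#include <limits.h>
#include <stdint.h>
#include <stdio.h>

/* Return 1 if the least significant byte is stored first, else 0. */
int is_little_endian(void)
{
    const uint32_t probe = 1;
    return *(const unsigned char *)&probe == 1;
}

/* Return the width of a data pointer in bits (e.g. 64 with -m64). */
int pointer_bits(void)
{
    return (int)(sizeof(void *) * CHAR_BIT);
}

/* Write a one-line summary into out; returns what snprintf returns. */
int describe_host(char *out, size_t len)
{
    assert(out != NULL && len > 0);
    return snprintf(out, len, "%s-endian, %d-bit pointers",
                    is_little_endian() ? "little" : "big",
                    pointer_bits());
}
```

Building this with the same -m64 flag as the Open MPI build and printing the result of describe_host() on each machine gives the endianness and bitness in one line per host; the operating system still has to come from uname -a.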
Re: [OMPI users] runtime errors for openmpi-v2.x-dev-1280-gc110ae8
Hi Gilles,

it is not necessary to have a heterogeneous environment to reproduce the error, as you can see below. All machines are 64 bit.

tyr spawn 119 ompi_info | grep -e "OPAL repo revision" -e "C compiler absolute"
      OPAL repo revision: v2.x-dev-1290-gbd0e4e1
     C compiler absolute: /usr/local/gcc-5.1.0/bin/gcc
tyr spawn 120 uname -a
SunOS tyr.informatik.hs-fulda.de 5.10 Generic_150400-11 sun4u sparc SUNW,A70 Solaris
tyr spawn 121 mpiexec -np 1 --host tyr,tyr,tyr,tyr spawn_multiple_master

Parent process 0 running on tyr.informatik.hs-fulda.de
  I create 3 slave processes.

[tyr.informatik.hs-fulda.de:27286] PMIX ERROR: UNPACK-PAST-END in file ../../../../../../openmpi-v2.x-dev-1290-gbd0e4e1/opal/mca/pmix/pmix112/pmix/src/server/pmix_server_ops.c at line 829
[tyr.informatik.hs-fulda.de:27286] PMIX ERROR: UNPACK-PAST-END in file ../../../../../../openmpi-v2.x-dev-1290-gbd0e4e1/opal/mca/pmix/pmix112/pmix/src/server/pmix_server.c at line 2176
[tyr:27288] *** An error occurred in MPI_Comm_spawn_multiple
[tyr:27288] *** reported by process [3434086401,0]
[tyr:27288] *** on communicator MPI_COMM_WORLD
[tyr:27288] *** MPI_ERR_SPAWN: could not spawn processes
[tyr:27288] *** MPI_ERRORS_ARE_FATAL (processes in this communicator will now abort,
[tyr:27288] ***    and potentially your MPI job)
tyr spawn 122

sunpc1 fd1026 105 ompi_info | grep -e "OPAL repo revision" -e "C compiler absolute"
      OPAL repo revision: v2.x-dev-1290-gbd0e4e1
     C compiler absolute: /usr/local/gcc-5.1.0/bin/gcc
sunpc1 fd1026 106 uname -a
SunOS sunpc1 5.10 Generic_147441-21 i86pc i386 i86pc Solaris
sunpc1 fd1026 107 mpiexec -np 1 --host sunpc1,sunpc1,sunpc1,sunpc1 spawn_multiple_master

Parent process 0 running on sunpc1
  I create 3 slave processes.

[sunpc1:00368] PMIX ERROR: UNPACK-PAST-END in file ../../../../../../openmpi-v2.x-dev-1290-gbd0e4e1/opal/mca/pmix/pmix112/pmix/src/server/pmix_server_ops.c at line 829
[sunpc1:00368] PMIX ERROR: UNPACK-PAST-END in file ../../../../../../openmpi-v2.x-dev-1290-gbd0e4e1/opal/mca/pmix/pmix112/pmix/src/server/pmix_server.c at line 2176
[sunpc1:370] *** An error occurred in MPI_Comm_spawn_multiple
[sunpc1:370] *** reported by process [43909121,0]
[sunpc1:370] *** on communicator MPI_COMM_WORLD
[sunpc1:370] *** MPI_ERR_SPAWN: could not spawn processes
[sunpc1:370] *** MPI_ERRORS_ARE_FATAL (processes in this communicator will now abort,
[sunpc1:370] ***    and potentially your MPI job)
sunpc1 fd1026 108

linpc1 fd1026 105 ompi_info | grep -e "OPAL repo revision" -e "C compiler absolute"
      OPAL repo revision: v2.x-dev-1290-gbd0e4e1
     C compiler absolute: /usr/local/gcc-5.1.0/bin/gcc
linpc1 fd1026 106 uname -a
Linux linpc1 3.1.10-1.29-desktop #1 SMP PREEMPT Fri May 31 20:10:04 UTC 2013 (2529847) x86_64 x86_64 x86_64 GNU/Linux
linpc1 fd1026 107 mpiexec -np 1 --host linpc1,linpc1,linpc1,linpc1 spawn_multiple_master

Parent process 0 running on linpc1
  I create 3 slave processes.

[linpc1:21502] PMIX ERROR: UNPACK-PAST-END in file ../../../../../../openmpi-v2.x-dev-1290-gbd0e4e1/opal/mca/pmix/pmix112/pmix/src/server/pmix_server_ops.c at line 829
[linpc1:21502] PMIX ERROR: UNPACK-PAST-END in file ../../../../../../openmpi-v2.x-dev-1290-gbd0e4e1/opal/mca/pmix/pmix112/pmix/src/server/pmix_server.c at line 2176
[linpc1:21507] *** An error occurred in MPI_Comm_spawn_multiple
[linpc1:21507] *** reported by process [1005518849,0]
[linpc1:21507] *** on communicator MPI_COMM_WORLD
[linpc1:21507] *** MPI_ERR_SPAWN: could not spawn processes
[linpc1:21507] *** MPI_ERRORS_ARE_FATAL (processes in this communicator will now abort,
[linpc1:21507] ***    and potentially your MPI job)
linpc1 fd1026 108

I used the following configure command.

../openmpi-v2.x-dev-1290-gbd0e4e1/configure \
  --prefix=/usr/local/openmpi-2.0.0_64_gcc \
  --libdir=/usr/local/openmpi-2.0.0_64_gcc/lib64 \
  --with-jdk-bindir=/usr/local/jdk1.8.0/bin \
  --with-jdk-headers=/usr/local/jdk1.8.0/include \
  JAVA_HOME=/usr/local/jdk1.8.0 \
  LDFLAGS="-m64" CC="gcc" CXX="g++" FC="gfortran" \
  CFLAGS="-m64" CXXFLAGS="-m64" FCFLAGS="-m64" \
  CPP="cpp" CXXCPP="cpp" \
  --enable-mpi-cxx \
  --enable-cxx-exceptions \
  --enable-mpi-java \
  --enable-heterogeneous \
  --enable-mpi-thread-multiple \
  --with-hwloc=internal \
  --without-verbs \
  --with-wrapper-cflags="-std=c11 -m64" \
  --with-wrapper-cxxflags="-m64" \
  --with-wrapper-fcflags="-m64" \
  --enable-debug \
  |& tee log.configure.$SYSTEM_ENV.$MACHINE_ENV.64_gcc

Kind regards

Siegmar

On 27.04.2016 at 13:21, Gilles Gouaillardet wrote:
> regarding the runtime issue, can you please describe your 4 hosts
> (os, endianness and bitness)
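
The UNPACK-PAST-END messages in the logs above read like a receiver asking for more bytes than the message buffer still holds, for example because sender and receiver disagree about what was packed. The sketch below only illustrates that class of error with a bounds-checked reader. It is an assumption for illustration and not PMIx's implementation; the type unpack_buf, the function unpack_bytes and the status codes are all hypothetical names.

```c
/* Illustrative bounds-checked unpack; not taken from PMIx or Open MPI. */
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical status codes for this sketch. */
enum { UNPACK_OK = 0, UNPACK_PAST_END = -1 };

/* A received message and a read cursor; invariant: pos <= len. */
typedef struct {
    const unsigned char *data;
    size_t len;   /* total number of bytes in the message */
    size_t pos;   /* offset of the next unread byte */
} unpack_buf;

/* Copy n bytes from the buffer into dst and advance the cursor.
 * If fewer than n bytes remain, nothing is copied, the cursor stays
 * where it was, and UNPACK_PAST_END is returned. */
int unpack_bytes(unpack_buf *b, void *dst, size_t n)
{
    assert(b != NULL && dst != NULL);
    if (n > b->len - b->pos)
        return UNPACK_PAST_END;
    memcpy(dst, b->data + b->pos, n);
    b->pos += n;
    return UNPACK_OK;
}
```

With a 4-byte message, reading 3 bytes succeeds and a following request for 2 bytes returns UNPACK_PAST_END. If the real failure follows the same pattern, the line numbers in the log would mark the unpack call whose expected item was never packed by the other side.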
Re: [OMPI users] runtime errors for openmpi-v2.x-dev-1280-gc110ae8
Siegmar,

can you please also post the source of spawn_slave ?

Cheers,

Gilles

On 4/28/2016 1:17 AM, Siegmar Gross wrote:
> it is not necessary to have a heterogeneous environment to reproduce
> the error as you can see below. All machines are 64 bit.