Re: [OMPI users] runtime errors for openmpi-v2.x-dev-1280-gc110ae8

2016-04-27 Thread Gilles Gouaillardet

Siegmar,


here is the error :

configure:17969: cc -o conftest -m64 -D_REENTRANT -g  -g 
-I/export2/src/openmpi-2.0.0/openmpi-v2.x-dev-1290-gbd0e4e1 
-I/export2/src/openmpi-2.0.0/openmpi-v2.x-dev-1290-gbd0e4e1-SunOS.sparc.64_cc 
-I/export2/src/openmpi-2.0.0/openmpi-v2.x-dev-1290-gbd0e4e1/opal/include 
-I/export2/src/openmpi-2.0.0/openmpi-v2.x-dev-1290-gbd0e4e1-SunOS.sparc.64_cc/opal/include 
-D_REENTRANT 
-I/export2/src/openmpi-2.0.0/openmpi-v2.x-dev-1290-gbd0e4e1/opal/mca/hwloc/hwloc1112/hwloc/include 
-I/export2/src/openmpi-2.0.0/openmpi-v2.x-dev-1290-gbd0e4e1-SunOS.sparc.64_cc/opal/mca/hwloc/hwloc1112/hwloc/include 
-I/export2/src/openmpi-2.0.0/openmpi-v2.x-dev-1290-gbd0e4e1/opal/mca/event/libevent2022/libevent 
-I/export2/src/openmpi-2.0.0/openmpi-v2.x-dev-1290-gbd0e4e1/opal/mca/event/libevent2022/libevent/include 
-I/export2/src/openmpi-2.0.0/openmpi-v2.x-dev-1290-gbd0e4e1-SunOS.sparc.64_cc/opal/mca/event/libevent2022/libevent/include 
-m64 conftest.c  >&5
"/usr/include/stdbool.h", line 42: #error: "Use of <stdbool.h> is valid only in a c99 compilation environment."



I cannot reproduce this on Solaris 11 with the Oracle Studio 5.3 compiler,
and I do not have Solaris 10 yet.


could you please re-configure with '-std=c99' appended to your CFLAGS 
and see if it helps ?



Cheers,


Gilles


On 4/26/2016 7:57 PM, Siegmar Gross wrote:

Hi Gilles and Ralph,

I was able to sort out my mess. In my last email I compared the
files from "SunOS_sparc/openmpi-2.0.0_64_gcc/lib64/openmpi" in
the attachment of my email to Ralph with the files from
"SunOS_sparc/openmpi-2.0.0_64_cc/lib64/openmpi" on my current
file system. That's why I had different timestamps. The other
problem was that Ralph didn't notice that "mca_pmix_pmix112.so"
wasn't built on Solaris with the Sun C compiler. I've removed
most of the files from the attachment of my email so that it is
easier to see the relevant ones. Below I try to give you more
information that may be relevant for tracking down the problem.
I still get an error running one of my small test programs when
I use my gcc version of Open MPI.
"mca_pmix_pmix112.so" is a 64-bit library.

Linux_x86_64/openmpi-2.0.0_64_cc/lib64/openmpi:
...
-rwxr-xr-x 1 root root  261327 Apr 19 16:46 mca_plm_slurm.so
-rwxr-xr-x 1 root root    1002 Apr 19 16:45 mca_pmix_pmix112.la
-rwxr-xr-x 1 root root 3906526 Apr 19 16:45 mca_pmix_pmix112.so
-rwxr-xr-x 1 root root     966 Apr 19 16:51 mca_pml_cm.la
-rwxr-xr-x 1 root root 1574265 Apr 19 16:51 mca_pml_cm.so
...

Linux_x86_64/openmpi-2.0.0_64_gcc/lib64/openmpi:
...
-rwxr-xr-x 1 root root   70371 Apr 19 16:43 mca_plm_slurm.so
-rwxr-xr-x 1 root root    1008 Apr 19 16:42 mca_pmix_pmix112.la
-rwxr-xr-x 1 root root 1029005 Apr 19 16:42 mca_pmix_pmix112.so
-rwxr-xr-x 1 root root     972 Apr 19 16:46 mca_pml_cm.la
-rwxr-xr-x 1 root root  284858 Apr 19 16:46 mca_pml_cm.so
...

SunOS_sparc/openmpi-2.0.0_64_cc/lib64/openmpi:
...
-rwxr-xr-x 1 root root  319816 Apr 19 19:58 mca_plm_rsh.so
-rwxr-xr-x 1 root root     970 Apr 19 20:00 mca_pml_cm.la
-rwxr-xr-x 1 root root 1507440 Apr 19 20:00 mca_pml_cm.so
...

SunOS_sparc/openmpi-2.0.0_64_gcc/lib64/openmpi:
...
-rwxr-xr-x 1 root root  153280 Apr 19 19:49 mca_plm_rsh.so
-rwxr-xr-x 1 root root    1007 Apr 19 19:47 mca_pmix_pmix112.la
-rwxr-xr-x 1 root root 1400512 Apr 19 19:47 mca_pmix_pmix112.so
-rwxr-xr-x 1 root root     971 Apr 19 19:52 mca_pml_cm.la
-rwxr-xr-x 1 root root  342440 Apr 19 19:52 mca_pml_cm.so
...

SunOS_x86_64/openmpi-2.0.0_64_cc/lib64/openmpi:
...
-rwxr-xr-x 1 root root  300096 Apr 19 17:18 mca_plm_rsh.so
-rwxr-xr-x 1 root root     970 Apr 19 17:23 mca_pml_cm.la
-rwxr-xr-x 1 root root 1458816 Apr 19 17:23 mca_pml_cm.so
...

SunOS_x86_64/openmpi-2.0.0_64_gcc/lib64/openmpi:
...
-rwxr-xr-x 1 root root  133096 Apr 19 17:42 mca_plm_rsh.so
-rwxr-xr-x 1 root root    1007 Apr 19 17:41 mca_pmix_pmix112.la
-rwxr-xr-x 1 root root 1320240 Apr 19 17:41 mca_pmix_pmix112.so
-rwxr-xr-x 1 root root     971 Apr 19 17:46 mca_pml_cm.la
-rwxr-xr-x 1 root root  419848 Apr 19 17:46 mca_pml_cm.so
...


Yesterday I installed openmpi-v2.x-dev-1290-gbd0e4e1 so that we
have a current version for investigating the problem. Once
again, mca_pmix_pmix112.so isn't available on Solaris if I use
the Sun C compiler.

"config.log" for gcc-5.1.0 shows the following.

...
configure:127799: /bin/bash '../../../../../../openmpi-v2.x-dev-1290-gbd0e4e1/opal/mca/pmix/pmix112/pmix/configure' succeeded for opal/mca/pmix/pmix112/pmix
configure:127916: checking if MCA component pmix:pmix112 can compile
configure:127918: result: yes
configure:5637: --- MCA component pmix:external (m4 configuration macro)
configure:128523: checking for MCA component pmix:external compile mode
configure:128529: result: dso
configure:129054: checking if MCA component pmix:external can compile
configure:129056: result: no
...
config.status:3897: creating opal/mca/pmix/Makefile
config.status:3897: creating opal/mca/pmix/s1/Makefile
config.status:38

Re: [OMPI users] runtime errors for openmpi-v2.x-dev-1280-gc110ae8

2016-04-27 Thread Siegmar Gross

Hi Gilles,

adding "-std=c99" to CFLAGS solves the problem with the missing library.
Shall I add it permanently to my configure command or will you add it,
so that I will not run into problems if you need the C11 standard later?

"spawn_multiple_master" breaks with the same error that I reported
yesterday for my gcc-version of Open MPI. Hopefully you can solve the
problem as well.


Kind regards and thank you very much for your help

Siegmar



[OMPI users] runtime errors for openmpi-v2.x-dev-1280-gc110ae8

2016-04-27 Thread Gilles Gouaillardet
Siegmar,

please add this to your CFLAGS for the time being.

configure tries to detect which flags must be added for C99 support, and it
seems the test is not working for Solaris 10 and Oracle compilers.
this is no longer a widely used environment, and I am not sure I can find
the time to fix this in the near future.


regarding the runtime issue, can you please describe your 4 hosts (os,
endianness and bitness)?

Cheers,

Gilles


Re: [OMPI users] runtime errors for openmpi-v2.x-dev-1280-gc110ae8

2016-04-27 Thread Siegmar Gross

Hi Gilles,

it is not necessary to have a heterogeneous environment to reproduce
the error as you can see below. All machines are 64 bit.

tyr spawn 119 ompi_info | grep -e "OPAL repo revision" -e "C compiler absolute"
  OPAL repo revision: v2.x-dev-1290-gbd0e4e1
 C compiler absolute: /usr/local/gcc-5.1.0/bin/gcc
tyr spawn 120 uname -a
SunOS tyr.informatik.hs-fulda.de 5.10 Generic_150400-11 sun4u sparc SUNW,A70 Solaris

tyr spawn 121 mpiexec -np 1 --host tyr,tyr,tyr,tyr spawn_multiple_master

Parent process 0 running on tyr.informatik.hs-fulda.de
  I create 3 slave processes.

[tyr.informatik.hs-fulda.de:27286] PMIX ERROR: UNPACK-PAST-END in file ../../../../../../openmpi-v2.x-dev-1290-gbd0e4e1/opal/mca/pmix/pmix112/pmix/src/server/pmix_server_ops.c at line 829
[tyr.informatik.hs-fulda.de:27286] PMIX ERROR: UNPACK-PAST-END in file ../../../../../../openmpi-v2.x-dev-1290-gbd0e4e1/opal/mca/pmix/pmix112/pmix/src/server/pmix_server.c at line 2176

[tyr:27288] *** An error occurred in MPI_Comm_spawn_multiple
[tyr:27288] *** reported by process [3434086401,0]
[tyr:27288] *** on communicator MPI_COMM_WORLD
[tyr:27288] *** MPI_ERR_SPAWN: could not spawn processes
[tyr:27288] *** MPI_ERRORS_ARE_FATAL (processes in this communicator will now abort,
[tyr:27288] ***    and potentially your MPI job)
tyr spawn 122






sunpc1 fd1026 105 ompi_info | grep -e "OPAL repo revision" -e "C compiler absolute"
  OPAL repo revision: v2.x-dev-1290-gbd0e4e1
 C compiler absolute: /usr/local/gcc-5.1.0/bin/gcc
sunpc1 fd1026 106 uname -a
SunOS sunpc1 5.10 Generic_147441-21 i86pc i386 i86pc Solaris
sunpc1 fd1026 107 mpiexec -np 1 --host sunpc1,sunpc1,sunpc1,sunpc1 spawn_multiple_master

Parent process 0 running on sunpc1
  I create 3 slave processes.

[sunpc1:00368] PMIX ERROR: UNPACK-PAST-END in file ../../../../../../openmpi-v2.x-dev-1290-gbd0e4e1/opal/mca/pmix/pmix112/pmix/src/server/pmix_server_ops.c at line 829
[sunpc1:00368] PMIX ERROR: UNPACK-PAST-END in file ../../../../../../openmpi-v2.x-dev-1290-gbd0e4e1/opal/mca/pmix/pmix112/pmix/src/server/pmix_server.c at line 2176

[sunpc1:370] *** An error occurred in MPI_Comm_spawn_multiple
[sunpc1:370] *** reported by process [43909121,0]
[sunpc1:370] *** on communicator MPI_COMM_WORLD
[sunpc1:370] *** MPI_ERR_SPAWN: could not spawn processes
[sunpc1:370] *** MPI_ERRORS_ARE_FATAL (processes in this communicator will now abort,
[sunpc1:370] ***    and potentially your MPI job)
sunpc1 fd1026 108





linpc1 fd1026 105 ompi_info | grep -e "OPAL repo revision" -e "C compiler absolute"
  OPAL repo revision: v2.x-dev-1290-gbd0e4e1
 C compiler absolute: /usr/local/gcc-5.1.0/bin/gcc
linpc1 fd1026 106 uname -a
Linux linpc1 3.1.10-1.29-desktop #1 SMP PREEMPT Fri May 31 20:10:04 UTC 2013 (2529847) x86_64 x86_64 x86_64 GNU/Linux
linpc1 fd1026 107 mpiexec -np 1 --host linpc1,linpc1,linpc1,linpc1 spawn_multiple_master

Parent process 0 running on linpc1
  I create 3 slave processes.

[linpc1:21502] PMIX ERROR: UNPACK-PAST-END in file ../../../../../../openmpi-v2.x-dev-1290-gbd0e4e1/opal/mca/pmix/pmix112/pmix/src/server/pmix_server_ops.c at line 829
[linpc1:21502] PMIX ERROR: UNPACK-PAST-END in file ../../../../../../openmpi-v2.x-dev-1290-gbd0e4e1/opal/mca/pmix/pmix112/pmix/src/server/pmix_server.c at line 2176

[linpc1:21507] *** An error occurred in MPI_Comm_spawn_multiple
[linpc1:21507] *** reported by process [1005518849,0]
[linpc1:21507] *** on communicator MPI_COMM_WORLD
[linpc1:21507] *** MPI_ERR_SPAWN: could not spawn processes
[linpc1:21507] *** MPI_ERRORS_ARE_FATAL (processes in this communicator will now abort,
[linpc1:21507] ***    and potentially your MPI job)
linpc1 fd1026 108


I used the following configure command.

../openmpi-v2.x-dev-1290-gbd0e4e1/configure \
  --prefix=/usr/local/openmpi-2.0.0_64_gcc \
  --libdir=/usr/local/openmpi-2.0.0_64_gcc/lib64 \
  --with-jdk-bindir=/usr/local/jdk1.8.0/bin \
  --with-jdk-headers=/usr/local/jdk1.8.0/include \
  JAVA_HOME=/usr/local/jdk1.8.0 \
  LDFLAGS="-m64" CC="gcc" CXX="g++" FC="gfortran" \
  CFLAGS="-m64" CXXFLAGS="-m64" FCFLAGS="-m64" \
  CPP="cpp" CXXCPP="cpp" \
  --enable-mpi-cxx \
  --enable-cxx-exceptions \
  --enable-mpi-java \
  --enable-heterogeneous \
  --enable-mpi-thread-multiple \
  --with-hwloc=internal \
  --without-verbs \
  --with-wrapper-cflags="-std=c11 -m64" \
  --with-wrapper-cxxflags="-m64" \
  --with-wrapper-fcflags="-m64" \
  --enable-debug \
  |& tee log.configure.$SYSTEM_ENV.$MACHINE_ENV.64_gcc


Kind regards

Siegmar




Re: [OMPI users] runtime errors for openmpi-v2.x-dev-1280-gc110ae8

2016-04-27 Thread Gilles Gouaillardet

Siegmar,


can you please also post the source of spawn_slave?


Cheers,

Gilles

