Re: [OMPI users] OpenMPI installation issue or mpi4py compatibility problem

2017-09-22 Thread Gilles Gouaillardet
Was there an error in the copy/paste ?

The mpicc command should be
mpicc  /opt/openmpi/openmpi-3.0.0_src/examples/hello_c.c
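For example, a complete compile-and-run sequence would be (the output file
name and process count here are just examples):

mpicc /opt/openmpi/openmpi-3.0.0_src/examples/hello_c.c -o hello_c
mpirun -np 4 ./hello_c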

Cheers,

Gilles

On Fri, Sep 22, 2017 at 3:33 PM, Tim Jim  wrote:

> Thanks for the thoughts and comments. Here is the setup information:
> OpenMPI Ver. 3.0.0. Please see attached for the compressed config.log and
> ompi_info --all call.
> In this compile, my install steps were:
> 1. declared  "export nvml_enable=no" and "export enable_opencl=no" in the
> terminal
> and the rest as seen in the logs:
> 2. ./configure --without-cuda --prefix=/opt/openmpi/openmpi-3.0.0
> 3. make all install
>
> I ultimately would like CUDA to be utilised if it can speed up my
> computation time - should I still attempt to get openMPI working without
> CUDA first?
>
> Thanks for the heads up about compiling the executables first - I tried
> mpicc again with the compiled version but got the following output:
>
> tjim@DESKTOP-TA3P0PS:~/Documents$ mpicc /opt/openmpi/openmpi-3.0.0_
> src/examples/hello_c
> /opt/openmpi/openmpi-3.0.0_src/examples/hello_c: In function `_start':
> (.text+0x0): multiple definition of `_start'
> /usr/lib/gcc/x86_64-linux-gnu/5/../../../x86_64-linux-gnu/crt1.o:(.text+0x0):
> first defined here
> /opt/openmpi/openmpi-3.0.0_src/examples/hello_c: In function `_fini':
> (.fini+0x0): multiple definition of `_fini'
> /usr/lib/gcc/x86_64-linux-gnu/5/../../../x86_64-linux-gnu/crti.o:(.fini+0x0):
> first defined here
> /opt/openmpi/openmpi-3.0.0_src/examples/hello_c:(.rodata+0x0): multiple
> definition of `_IO_stdin_used'
> /usr/lib/gcc/x86_64-linux-gnu/5/../../../x86_64-linux-gnu/crt1.o:(.rodata.cst4+0x0):
> first defined here
> /opt/openmpi/openmpi-3.0.0_src/examples/hello_c: In function `data_start':
> (.data+0x0): multiple definition of `__data_start'
> /usr/lib/gcc/x86_64-linux-gnu/5/../../../x86_64-linux-gnu/crt1.o:(.data+0x0):
> first defined here
> /opt/openmpi/openmpi-3.0.0_src/examples/hello_c: In function `data_start':
> (.data+0x8): multiple definition of `__dso_handle'
> /usr/lib/gcc/x86_64-linux-gnu/5/crtbegin.o:(.data+0x0): first defined here
> /opt/openmpi/openmpi-3.0.0_src/examples/hello_c: In function `_init':
> (.init+0x0): multiple definition of `_init'
> /usr/lib/gcc/x86_64-linux-gnu/5/../../../x86_64-linux-gnu/crti.o:(.init+0x0):
> first defined here
> /usr/lib/gcc/x86_64-linux-gnu/5/crtend.o:(.tm_clone_table+0x0): multiple
> definition of `__TMC_END__'
> /opt/openmpi/openmpi-3.0.0_src/examples/hello_c:(.data+0x10): first
> defined here
> /usr/bin/ld: error in 
> /opt/openmpi/openmpi-3.0.0_src/examples/hello_c(.eh_frame);
> no .eh_frame_hdr table will be created.
> collect2: error: ld returned 1 exit status
>
> Is this due to a failed install?
> Regards,
> Tim
>
>
> On 22 September 2017 at 01:10, Sylvain Jeaugey 
> wrote:
>
>> The issue is related to openCL, not NVML.
>>
>> So the correct export would be "export enable_opencl=no" (you may want to
>> "export enable_nvml=no" as well).
>>
>>
>> On 09/21/2017 12:32 AM, Tim Jim wrote:
>>
>> Hi,
>>
>> I tried as you suggested: export nvml_enable=no, then reconfigured and
>> ran make all install again, but mpicc is still producing the same error.
>> What should I try next?
>>
>> Many thanks,
>> Tim
>>
>> On 21 September 2017 at 16:12, Gilles Gouaillardet 
>> wrote:
>>
>>> Tim,
>>>
>>>
>>> do that in your shell, right before invoking configure.
>>>
>>> export nvml_enable=no
>>>
>>> ./configure ...
>>>
>>> make && make install
>>>
>>>
>>> you can keep the --without-cuda flag (i think this is unrelated though)
>>>
>>>
>>> Cheers,
>>>
>>> Gilles
>>>
>>> On 9/21/2017 3:54 PM, Tim Jim wrote:
>>>
 Dear Gilles,

 Thanks for the mail - where should I set export nvml_enable=no? Should
 I reconfigure with default cuda support or keep the --without-cuda flag?

 Kind regards,
 Tim

 On 21 September 2017 at 15:22, Gilles Gouaillardet wrote:

 Tim,


 i am not familiar with CUDA, but that might help

 can you please

 export nvml_enable=no

 and then re-configure and rebuild Open MPI ?


 i hope this will help you


 Cheers,


 Gilles



 On 9/21/2017 3:04 PM, Tim Jim wrote:

 Hello,

 Apologies to bring up this old thread - I finally had a chance
 to try again with openmpi but I am still have trouble getting
 it to run. I downloaded version 3.0.0 hoping it would solve
 some of the problems but on running mpicc for the previous
 test case, I am still getting an undefined reference error. I
 did as you suggested and also configured it to install without
 cuda using

 ./configure --without-cuda --prefix=/opt/openmpi/openmpi-3.0.0

 and at the end of the summary, CUDA support shows 'no'.
 Unfortunate

Re: [OMPI users] OpenMPI installation issue or mpi4py compatibility problem

2017-09-22 Thread Tim Jim
Hi Gilles,

Yes, you're right. I wanted to double check the compile but didn't notice I
was pointing to the exec I compiled from a previous Make.

mpicc now seems to work, running mpirun hello_c gives:

Hello, world, I am 0 of 4, (Open MPI v3.0.0, package: Open MPI
tjim@DESKTOP-TA3P0PS Distribution, ident: 3.0.0, repo rev: v3.0.0, Sep 12,
2017, 115)
Hello, world, I am 3 of 4, (Open MPI v3.0.0, package: Open MPI
tjim@DESKTOP-TA3P0PS Distribution, ident: 3.0.0, repo rev: v3.0.0, Sep 12,
2017, 115)
Hello, world, I am 1 of 4, (Open MPI v3.0.0, package: Open MPI
tjim@DESKTOP-TA3P0PS Distribution, ident: 3.0.0, repo rev: v3.0.0, Sep 12,
2017, 115)
Hello, world, I am 2 of 4, (Open MPI v3.0.0, package: Open MPI
tjim@DESKTOP-TA3P0PS Distribution, ident: 3.0.0, repo rev: v3.0.0, Sep 12,
2017, 115)

- which I assume means it's working?

Should I try a recompile with CUDA while declaring "export nvml_enable=no"
and "export enable_opencl=no"? What effects do these declarations have on
the normal functioning of mpi?

Many thanks.


On 22 September 2017 at 15:55, Gilles Gouaillardet <
gilles.gouaillar...@gmail.com> wrote:

> Was there an error in the copy/paste ?
>
> The mpicc command should be
> mpicc  /opt/openmpi/openmpi-3.0.0_src/examples/hello_c.c
>
> Cheers,
>
> Gilles
>
> On Fri, Sep 22, 2017 at 3:33 PM, Tim Jim  wrote:
>
>> Thanks for the thoughts and comments. Here is the setup information:
>> OpenMPI Ver. 3.0.0. Please see attached for the compressed config.log and
>> ompi_info --all call.
>> In this compile, my install steps were:
>> 1. declared  "export nvml_enable=no" and "export enable_opencl=no" in the
>> terminal
>> and the rest as seen in the logs:
>> 2. ./configure --without-cuda --prefix=/opt/openmpi/openmpi-3.0.0
>> 3. make all install
>>
>> I ultimately would like CUDA to be utilised if it can speed up my
>> computation time - should I still attempt to get openMPI working without
>> CUDA first?
>>
>> Thanks for the heads up about compiling the executables first - I tried
>> mpicc again with the compiled version but got the following output:
>>
>> tjim@DESKTOP-TA3P0PS:~/Documents$ mpicc /opt/openmpi/openmpi-3.0.0_src
>> /examples/hello_c
>> /opt/openmpi/openmpi-3.0.0_src/examples/hello_c: In function `_start':
>> (.text+0x0): multiple definition of `_start'
>> /usr/lib/gcc/x86_64-linux-gnu/5/../../../x86_64-linux-gnu/crt1.o:(.text+0x0):
>> first defined here
>> /opt/openmpi/openmpi-3.0.0_src/examples/hello_c: In function `_fini':
>> (.fini+0x0): multiple definition of `_fini'
>> /usr/lib/gcc/x86_64-linux-gnu/5/../../../x86_64-linux-gnu/crti.o:(.fini+0x0):
>> first defined here
>> /opt/openmpi/openmpi-3.0.0_src/examples/hello_c:(.rodata+0x0): multiple
>> definition of `_IO_stdin_used'
>> /usr/lib/gcc/x86_64-linux-gnu/5/../../../x86_64-linux-gnu/crt1.o:(.rodata.cst4+0x0):
>> first defined here
>> /opt/openmpi/openmpi-3.0.0_src/examples/hello_c: In function
>> `data_start':
>> (.data+0x0): multiple definition of `__data_start'
>> /usr/lib/gcc/x86_64-linux-gnu/5/../../../x86_64-linux-gnu/crt1.o:(.data+0x0):
>> first defined here
>> /opt/openmpi/openmpi-3.0.0_src/examples/hello_c: In function
>> `data_start':
>> (.data+0x8): multiple definition of `__dso_handle'
>> /usr/lib/gcc/x86_64-linux-gnu/5/crtbegin.o:(.data+0x0): first defined
>> here
>> /opt/openmpi/openmpi-3.0.0_src/examples/hello_c: In function `_init':
>> (.init+0x0): multiple definition of `_init'
>> /usr/lib/gcc/x86_64-linux-gnu/5/../../../x86_64-linux-gnu/crti.o:(.init+0x0):
>> first defined here
>> /usr/lib/gcc/x86_64-linux-gnu/5/crtend.o:(.tm_clone_table+0x0): multiple
>> definition of `__TMC_END__'
>> /opt/openmpi/openmpi-3.0.0_src/examples/hello_c:(.data+0x10): first
>> defined here
>> /usr/bin/ld: error in 
>> /opt/openmpi/openmpi-3.0.0_src/examples/hello_c(.eh_frame);
>> no .eh_frame_hdr table will be created.
>> collect2: error: ld returned 1 exit status
>>
>> Is this due to a failed install?
>> Regards,
>> Tim
>>
>>
>> On 22 September 2017 at 01:10, Sylvain Jeaugey 
>> wrote:
>>
>>> The issue is related to openCL, not NVML.
>>>
>>> So the correct export would be "export enable_opencl=no" (you may want
>>> to "export enable_nvml=no" as well).
>>>
>>>
>>> On 09/21/2017 12:32 AM, Tim Jim wrote:
>>>
>>> Hi,
>>>
>>> I tried as you suggested: export nvml_enable=no, then reconfigured and
>>> ran make all install again, but mpicc is still producing the same error.
>>> What should I try next?
>>>
>>> Many thanks,
>>> Tim
>>>
>>> On 21 September 2017 at 16:12, Gilles Gouaillardet 
>>> wrote:
>>>
 Tim,


 do that in your shell, right before invoking configure.

 export nvml_enable=no

 ./configure ...

 make && make install


 you can keep the --without-cuda flag (i think this is unrelated though)


 Cheers,

 Gilles

 On 9/21/2017 3:54 PM, Tim Jim wrote:

> Dear Gilles,
>
> Thanks for the mail - where should I set export nvml_enable=no? Should
>

Re: [OMPI users] OpenMPI installation issue or mpi4py compatibility problem

2017-09-22 Thread Gilles Gouaillardet

Great, it is finally working!


nvml and opencl are only used by hwloc, and I do not think Open MPI is
using these features,

so I suggest you go ahead, reconfigure and rebuild Open MPI, and see how
things go.
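For example, something along these lines (the CUDA path below is only an
illustration, adjust it to your install):

export nvml_enable=no
export enable_opencl=no
./configure --with-cuda=/usr/local/cuda --prefix=/opt/openmpi/openmpi-3.0.0
make all install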



Cheers,


Gilles


On 9/22/2017 4:59 PM, Tim Jim wrote:

Hi Gilles,

Yes, you're right. I wanted to double check the compile but didn't 
notice I was pointing to the exec I compiled from a previous Make.


mpicc now seems to work, running mpirun hello_c gives:

Hello, world, I am 0 of 4, (Open MPI v3.0.0, package: Open MPI 
tjim@DESKTOP-TA3P0PS Distribution, ident: 3.0.0, repo rev: v3.0.0, Sep 
12, 2017, 115)
Hello, world, I am 3 of 4, (Open MPI v3.0.0, package: Open MPI 
tjim@DESKTOP-TA3P0PS Distribution, ident: 3.0.0, repo rev: v3.0.0, Sep 
12, 2017, 115)
Hello, world, I am 1 of 4, (Open MPI v3.0.0, package: Open MPI 
tjim@DESKTOP-TA3P0PS Distribution, ident: 3.0.0, repo rev: v3.0.0, Sep 
12, 2017, 115)
Hello, world, I am 2 of 4, (Open MPI v3.0.0, package: Open MPI 
tjim@DESKTOP-TA3P0PS Distribution, ident: 3.0.0, repo rev: v3.0.0, Sep 
12, 2017, 115)


- which I assume means it's working?

Should I try a recompile with CUDA while declaring "export 
nvml_enable=no" and "export enable_opencl=no"? What effects do these 
declarations have on the normal functioning of mpi?


Many thanks.


On 22 September 2017 at 15:55, Gilles Gouaillardet
<gilles.gouaillar...@gmail.com> wrote:


Was there an error in the copy/paste ?

The mpicc command should be
mpicc  /opt/openmpi/openmpi-3.0.0_src/examples/hello_c.c

Cheers,

Gilles

On Fri, Sep 22, 2017 at 3:33 PM, Tim Jim <timothy.m@gmail.com> wrote:

Thanks for the thoughts and comments. Here is the setup
information:
OpenMPI Ver. 3.0.0. Please see attached for the compressed
config.log and ompi_info --all call.
In this compile, my install steps were:
1. declared  "export nvml_enable=no" and "export
enable_opencl=no" in the terminal
and the rest as seen in the logs:
2. ./configure --without-cuda --prefix=/opt/openmpi/openmpi-3.0.0
3. make all install

I ultimately would like CUDA to be utilised if it can speed up
my computation time - should I still attempt to get openMPI
working without CUDA first?

Thanks for the heads up about compiling the executables first
- I tried mpicc again with the compiled version but got the
following output:

tjim@DESKTOP-TA3P0PS:~/Documents$ mpicc
/opt/openmpi/openmpi-3.0.0_src/examples/hello_c
/opt/openmpi/openmpi-3.0.0_src/examples/hello_c: In function
`_start':
(.text+0x0): multiple definition of `_start'

/usr/lib/gcc/x86_64-linux-gnu/5/../../../x86_64-linux-gnu/crt1.o:(.text+0x0):
first defined here
/opt/openmpi/openmpi-3.0.0_src/examples/hello_c: In function
`_fini':
(.fini+0x0): multiple definition of `_fini'

/usr/lib/gcc/x86_64-linux-gnu/5/../../../x86_64-linux-gnu/crti.o:(.fini+0x0):
first defined here
/opt/openmpi/openmpi-3.0.0_src/examples/hello_c:(.rodata+0x0):
multiple definition of `_IO_stdin_used'

/usr/lib/gcc/x86_64-linux-gnu/5/../../../x86_64-linux-gnu/crt1.o:(.rodata.cst4+0x0):
first defined here
/opt/openmpi/openmpi-3.0.0_src/examples/hello_c: In function
`data_start':
(.data+0x0): multiple definition of `__data_start'

/usr/lib/gcc/x86_64-linux-gnu/5/../../../x86_64-linux-gnu/crt1.o:(.data+0x0):
first defined here
/opt/openmpi/openmpi-3.0.0_src/examples/hello_c: In function
`data_start':
(.data+0x8): multiple definition of `__dso_handle'
/usr/lib/gcc/x86_64-linux-gnu/5/crtbegin.o:(.data+0x0): first
defined here
/opt/openmpi/openmpi-3.0.0_src/examples/hello_c: In function
`_init':
(.init+0x0): multiple definition of `_init'

/usr/lib/gcc/x86_64-linux-gnu/5/../../../x86_64-linux-gnu/crti.o:(.init+0x0):
first defined here
/usr/lib/gcc/x86_64-linux-gnu/5/crtend.o:(.tm_clone_table+0x0):
multiple definition of `__TMC_END__'
/opt/openmpi/openmpi-3.0.0_src/examples/hello_c:(.data+0x10):
first defined here
/usr/bin/ld: error in
/opt/openmpi/openmpi-3.0.0_src/examples/hello_c(.eh_frame); no
.eh_frame_hdr table will be created.
collect2: error: ld returned 1 exit status

Is this due to a failed install?
Regards,
Tim


On 22 September 2017 at 01:10, Sylvain Jeaugey <sjeau...@nvidia.com> wrote:

The issue is related to openCL, not NVML.

So the correct export would be "export enable_opencl=no"
(you may want to "export enable_nvml=no" as well).


On 09/21/2017 12:32 AM, Tim Jim wrote:

Hi,

I tried as you suggested: export 

[OMPI users] mpiexec hangs instead of exiting if a worker node dies

2017-09-22 Thread Tobias Pfeiffer
Hi,

I am currently trying to learn about fault tolerance in MPI so I
experimented a bit with what happens if I kill various components in my MPI
setup, but there are some unexpected hangs in some situations.

I use the following MPI script:

#!/usr/bin/env python

from mpi4py import MPI
import time
import sys
import os
import signal

comm = MPI.COMM_WORLD

for i in range(100):
    print("Hello @ %d! I'm rank %d from %d running in total..." % (i,
          comm.rank, comm.size))
    time.sleep(2)
    if comm.rank == 1 and i == 2:
        os.system("pstree -p")
        # TRY VARIOUS THINGS IN THE LINE BELOW
        os.kill(os.getpid(), signal.SIGTERM)

comm.Barrier()
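A typical launch for three ranks looks like this (the hostfile and script
names are placeholders):

mpiexec -n 3 --hostfile hosts python3 script.py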

When I run the script above on three nodes, I see the following output:

Hello @ 0! I'm rank 0 from 3 running in total...
Hello @ 0! I'm rank 1 from 3 running in total...
Hello @ 0! I'm rank 2 from 3 running in total...
Hello @ 1! I'm rank 0 from 3 running in total...
Hello @ 1! I'm rank 1 from 3 running in total...
Hello @ 1! I'm rank 2 from 3 running in total...
Hello @ 2! I'm rank 0 from 3 running in total...
Hello @ 2! I'm rank 1 from 3 running in total...
Hello @ 2! I'm rank 2 from 3 running in total...
Hello @ 3! I'm rank 0 from 3 running in total...
Hello @ 3! I'm rank 2 from 3 running in total...

timeout(1)---sshd(8)---sshd(18)---orted(19)-+-python3(23)-+-sh(26)---pstree(27)
                                            |             |-{python3}(24)
                                            |             `-{python3}(25)
                                            |-{orted}(20)
                                            |-{orted}(21)
                                            `-{orted}(22)
Hello @ 4! I'm rank 2 from 3 running in total...

--
mpiexec noticed that process rank 1 with PID 23 on node 8f528c301215
exited on signal 15 (Terminated).

--
[program exit]

(Note that each process runs in a Docker container, so these are in fact
all the processes visible to my program.)

This is nice, but if I want to know what happens if a node or the network
fails, then I also need to check other parts, so I changed
`os.kill(os.getpid(), signal.SIGTERM)` to `os.kill(1, signal.SIGTERM)` so
that all processes on that particular node die. I guess this is very
similar to what would happen if I reboot the system. The output is as
follows:

Hello @ 0! I'm rank 1 from 3 running in total...
Hello @ 0! I'm rank 0 from 3 running in total...
Hello @ 0! I'm rank 2 from 3 running in total...
Hello @ 1! I'm rank 1 from 3 running in total...
Hello @ 1! I'm rank 0 from 3 running in total...
Hello @ 1! I'm rank 2 from 3 running in total...
Hello @ 2! I'm rank 1 from 3 running in total...
Hello @ 2! I'm rank 0 from 3 running in total...
Hello @ 2! I'm rank 2 from 3 running in total...

timeout(1)---sshd(6)---sshd(16)---orted(17)-+-python3(21)-+-sh(24)---pstree(25)
                                            |             |-{python3}(22)
                                            |             `-{python3}(23)
                                            |-{orted}(18)
                                            |-{orted}(19)
                                            `-{orted}(20)
Hello @ 3! I'm rank 1 from 3 running in total...
Connection to 43982adfb734 closed by remote host.

--
ORTE was unable to reliably start one or more daemons.
This usually is caused by:

* not finding the required libraries and/or binaries on
  one or more nodes. Please check your PATH and LD_LIBRARY_PATH
  settings, or configure OMPI with --enable-orterun-prefix-by-default

* lack of authority to execute on one or more specified nodes.
  Please verify your allocation and authorities.

* the inability to write startup files into /tmp
(--tmpdir/orte_tmpdir_base).
  Please check with your sys admin to determine the correct location to
use.

*  compilation of the orted with dynamic libraries when static are
required
  (e.g., on Cray). Please check your configure cmd line and consider
using
  one of the contrib/platform definitions for your system type.

* an inability to create a connection back to mpirun due to a
  lack of common network interfaces and/or no route found between
  them. Please check network connectivity (including firewalls
  and network routing requirements).

--
Hello @ 3! I'm rank 0 from 3 running in total...
Hello @ 3! I'm rank 2 from 3 running in total...
Hello @ 4! I'm rank 2 from 3 running in total...
[program hangs]

I ran this several times and sometimes I would a

Re: [OMPI users] Libnl bug in openmpi v3.0.0?

2017-09-22 Thread Stephen Guzik
Yes, I can confirm that openmpi 3.0.0 builds without issue when
libnl-route-3-dev is installed.
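For anyone else hitting this on Debian stretch, the package can be installed
with (assuming sudo access):

sudo apt-get install libnl-route-3-dev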

Thanks,
Stephen

Stephen Guzik, Ph.D.
Assistant Professor, Department of Mechanical Engineering
Colorado State University

On 09/21/2017 12:55 AM, Gilles Gouaillardet wrote:
> Stephen,
>
>
> a simpler option is to install the libnl-route-3-dev.
>
> note you will not be able to build the reachable/netlink component
> without this package.
>
>
> Cheers,
>
>
> Gilles
>
>
> On 9/21/2017 1:04 PM, Gilles Gouaillardet wrote:
>> Stephen,
>>
>>
>> this is very likely related to the issue already reported in github.
>>
>> meanwhile, you can apply the attached patch
>>
>> patch configure < configure.diff
>>
>> and then re-configure and make.
>>
>> note this is a temporary workaround; it simply prevents the build of
>> the reachable/netlink component, and the upcoming real fix will allow
>> this component to be built.
>>
>> Cheers,
>>
>> Gilles
>>
>> On 9/21/2017 9:22 AM, Gilles Gouaillardet wrote:
>>> Thanks for the report,
>>>
>>>
>>> is this related to https://github.com/open-mpi/ompi/issues/4211 ?
>>>
>>> there is a known issue when libnl-3 is installed but libnl-route-3
>>> is not
>>>
>>>
>>> Cheers,
>>>
>>>
>>> Gilles
>>>
>>>
>>> On 9/21/2017 8:53 AM, Stephen Guzik wrote:
 When compiling (on Debian stretch), I see:

 In file included from libnl_utils.h:52:0,
   from reachable_netlink_utils_common.c:48:
 libnl1_utils.h:54:26: error: too few arguments to function
 ‘nl_geterror’
   #define NL_GETERROR(err) nl_geterror()
^
 libnl1_utils.h:80:5: note: in expansion of macro ‘NL_GETERROR’
   NL_GETERROR(err)); \
   ^~~
 reachable_netlink_utils_common.c:310:5: note: in expansion of macro
 ‘NL_RECVMSGS’
   NL_RECVMSGS(unlsk->nlh, arg, EHOSTUNREACH, err, out);
   ^~~
 In file included from /usr/include/libnl3/netlink/netlink.h:31:0,
   from libnl1_utils.h:47,
   from libnl_utils.h:52,
   from reachable_netlink_utils_common.c:48:
 /usr/include/libnl3/netlink/errno.h:56:21: note: declared here
   extern const char * nl_geterror(int);

 Modifying openmpi-3.0.0/opal/mca/reachable/netlink/libnl1_utils.h from

 #define NL_GETERROR(err) nl_geterror()

 to

 #define NL_GETERROR(err) nl_geterror(err)

 as in libnl3_utils.h allows for successful compilation.  But from
 configure, I see

 checking for libraries that use libnl v1... (none)
 checking for libraries that use libnl v3... ibverbs nl-3

 so I wonder if perhaps something more serious is going on.  Any
 suggestions?

 Thanks,
 Stephen Guzik

 ___
 users mailing list
 users@lists.open-mpi.org
 https://lists.open-mpi.org/mailman/listinfo/users
>>>
>>>
>>
>
> ___
> users mailing list
> users@lists.open-mpi.org
> https://lists.open-mpi.org/mailman/listinfo/users

___
users mailing list
users@lists.open-mpi.org
https://lists.open-mpi.org/mailman/listinfo/users

Re: [OMPI users] Question concerning compatibility of languages used with building OpenMPI and languages OpenMPI uses to build MPI binaries.

2017-09-22 Thread Jeff Hammond
There is already a nice solution for the useful special case of ABI
portability where one wants to use more than one MPI library with an
application binary, but only one MPI library for a given application
invocation:

https://github.com/cea-hpc/wi4mpi

They document support for Intel MPI and Open-MPI, which implies a much
larger support matrix when one takes https://www.mpich.org/abi/ into
account.  Plus, there is another popular MPICH derivative that is binary
compatible in my experience, even though it isn't listed there.

Hammond

On Thu, Sep 21, 2017 at 8:31 AM, Jeff Squyres (jsquyres) 
wrote:
>
> Don't forget that there's a lot more to "binary portability" between MPI
implementations than just the ABI (wire protocols, run-time interfaces,
...etc.).  This is the main (set of) reasons that ABI standardization of
the MPI specification never really took off -- so much would need to be
standardized that it could (would) remove a lot of optimization / value-add
that each MPI implementation (particularly those tuned for a specific
environment) can do specifically because such things are *not* standardized.
>
> I.e.: all things being equal, the optimization and performance benefits
that are achieved by real-world MPI implementations have been deemed more
important than binary compatibility.
>
>
> > On Sep 20, 2017, at 6:07 PM, Michael Thomadakis <
drmichaelt7...@gmail.com> wrote:
> >
> > This discussion started getting into an interesting question: ABI
standardization for portability by language. It makes sense to have ABI
standardization for portability of objects across environments. At the same
time it does mean that everyone follows the exact same recipe for low-level
implementation details, which may be unnecessarily restrictive at times.
> >
> > On Wed, Sep 20, 2017 at 4:45 PM, Jeff Hammond 
wrote:
> >
> >
> > On Wed, Sep 20, 2017 at 5:55 AM, Dave Love 
wrote:
> > Jeff Hammond  writes:
> >
> > > Please separate C and C++ here. C has a standard ABI.  C++ doesn't.
> > >
> > > Jeff
> >
> > [For some value of "standard".]  I've said the same about C++, but the
> > current GCC manual says its C++ ABI is "industry standard", and at least
> > Intel document compatibility with recent GCC on GNU/Linux.  It's
> > standard enough to have changed for C++11 (?), with resulting grief in
> > package repos, for instance.
> >
> > I may have used imprecise language.  As a matter of practice, I switch
C compilers all the time without recompiling MPI and life is good.
Switching between Clang with libc++ and GCC with libstdc++ does not produce
happiness.
> >
> > Jeff
> >
> > --
> > Jeff Hammond
> > jeff.scie...@gmail.com
> > http://jeffhammond.github.io/
> >
> > ___
> > users mailing list
> > users@lists.open-mpi.org
> > https://lists.open-mpi.org/mailman/listinfo/users
> >
> > ___
> > users mailing list
> > users@lists.open-mpi.org
> > https://lists.open-mpi.org/mailman/listinfo/users
>
>
> --
> Jeff Squyres
> jsquy...@cisco.com
>
> ___
> users mailing list
> users@lists.open-mpi.org
> https://lists.open-mpi.org/mailman/listinfo/users




--
Jeff Hammond
jeff.scie...@gmail.com
http://jeffhammond.github.io/
___
users mailing list
users@lists.open-mpi.org
https://lists.open-mpi.org/mailman/listinfo/users