[OMPI users] mpicc fails to compile example code when --enable-static --disable-shared is used for installation.

2020-01-22 Thread Mehmet ÖREN via users
Background information

What version of Open MPI are you using? (e.g., v1.10.3, v2.1.0, git branch
name and hash, etc.)

3.1.5 Current stable release from Nov 15, 2019

4.0.2 Current stable release from Oct 07, 2019

Describe how Open MPI was installed (e.g., from a source/distribution
tarball, from a git clone, from an operating system distribution package,
etc.)

source tarball

Please describe the system on which you are running

   - Operating system/version: CentOS 7.2, RHEL 7.7
   - Computer hardware: Intel
   - Network type: Cisco usNIC

Resource manager: IBM Spectrum LSF 10.2

*Details of the problem*
If --enable-static and --disable-shared are used when configuring Open MPI on
Cisco UCS C240 M4SX machines, mpicc is unable to compile ring_c (the test
program for validating MPI connectivity). The following configuration was
used to build Open MPI from the source code:

$> ./configure --prefix=/gpfs/app/openmpi-4.0.2 CC=gcc CXX=g++
--enable-static --disable-shared
--with-lsf=/gpfs/data/soft/lsf/10.1/linux2.6-glibc2.3-x86_64 --with-usnic
--with-libfabric=/opt/cisco/libfabric --without-memory-manager
--enable-mpirun-prefix-by-default
--enable-mca-no-build=btl-openib,common-verbs,oob-ud LDFLAGS="-Wl,-rpath
-Wl,/opt/cisco/libfabric/lib -Wl,--enable-new-dtags"

Installation completed without any errors for both versions of Open MPI.
Compiling the test program with the following command resulted in an error:

$> mpicc ring_c.c -o ring_c

#/usr/bin/ld:
/gpfs/app/openmpi-402-test/lib/libopen-pal.a(reachable_netlink_utils_common.o):
undefined reference to symbol 'nlmsg_set_proto'
/usr/bin/ld: note: 'nlmsg_set_proto' is defined in DSO
/lib64/libnl-3.so.200 so try adding it to the linker command line
/lib64/libnl-3.so.200: could not read symbols: Invalid operation#
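
To confirm where the missing symbol lives, and whether a static libnl
archive is even available, a quick check is (paths taken from the error
above; adjust for your system):

$ nm -D /lib64/libnl-3.so.200 | grep nlmsg_set_proto   # the symbol is exported by the shared libnl-3
$ ls /usr/lib64/libnl-3.a 2>/dev/null || echo "no static libnl-3 archive installed"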

The error was reproduced with the same configuration on CentOS 7.2 and RHEL
7.7 nodes. I have also tested older versions with the same configuration:

Open MPI 2.0.4 -- could not reproduce the error
Open MPI 3.1.0 -- could not reproduce the error
Open MPI 3.1.5 -- compilation error
Open MPI 4.0.2 -- compilation error

After building and installing Open MPI without the --enable-static
--disable-shared options, compilation was successful and I could validate
MPI connectivity with ring_c.

NOTE: I could not check the mail archive due to a connection problem; it
seems that our IP address is blocked. If this is a duplicate bug report,
please ignore it.

Regards.

-- 

*Mehmet OREN*

Istanbul Medipol University

HPC System Administrator



t: +90 216 681 1583

t: +90 216 681 1500 Ext:1583

f: +90 212 521 2377


www.medipol.edu.tr


Re: [OMPI users] mpicc fails to compile example code when --enable-static --disable-shared is used for installation.

2020-01-22 Thread Jeff Squyres (jsquyres) via users
Greetings Mehmet.

I'm curious: why do you want to use static linking?  It tends to cause 
complications and issues like this.

Specifically: Open MPI's `--disable-shared --enable-static` switches mean that 
Open MPI will produce static libraries instead of shared libraries (e.g., 
libmpi.a instead of libmpi.so -- but there are other libraries as well).  It 
also means that all of Open MPI's plugins are pulled back into libmpi (etc.) 
instead of being opened at run time.  Note that there are some costs to this at 
run time.  For example, each process will have its own full copy of all of the 
Open MPI libraries and plugins loaded into its own, unique process space.  This 
is as opposed to the shared library + run-time loadable plugins case, where all 
the MPI processes on a single server share the memory for the Open MPI shared 
libraries and the run-time loadable plugins.
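
As a quick sanity check on which flavor of install you actually got (the
prefix below is the one from your configure line), with --enable-static
--disable-shared you should see only .a archives under lib/, no .so files:

$ ls /gpfs/app/openmpi-4.0.2/lib/libmpi.* /gpfs/app/openmpi-4.0.2/lib/libopen-pal.*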

If you really need static linking, you will need to add a few more parameters 
to your linking command line.  For example, this worked for me on RHEL 7.6:

mpicc hello_c.c -o hello_c -L/opt/cisco/libfabric/lib -lfabric -lnl-3

Remember that mpicc is just a "wrapper" compiler: it adds in a bunch of command 
line parameters and then invokes the underlying compiler.  You can see what it 
does via the "--showme" CLI option (I installed my copy of Open MPI into 
/home/jsquyres/bogus):

$ mpicc hello_c.c -o hello_c -L/opt/cisco/libfabric/lib -lfabric -lnl-3 --showme
gcc hello_c.c -o hello_c -L/opt/cisco/libfabric/lib -lfabric -lnl-3 
-I/home/jsquyres/bogus/include -pthread -Wl,-rpath -Wl,/home/jsquyres/bogus/lib 
-Wl,--enable-new-dtags -L/home/jsquyres/bogus/lib -lmpi -lopen-rte -lopen-pal 
-lm -ldl -lz -lrt -lutil

You can see how mpicc copied "hello_c.c -o hello_c -L/opt/cisco/libfabric/lib 
-lfabric -lnl-3" to the beginning and added a whole pile of options after that.

If I had written that gcc line myself, I would have put the -lfabric -lnl-3 at 
the end, with the other libraries, perhaps something like this:

$ gcc hello_c.c -o hello_c -I/home/jsquyres/bogus/include -pthread -Wl,-rpath 
-Wl,/home/jsquyres/bogus/lib -Wl,--enable-new-dtags -L/home/jsquyres/bogus/lib 
-L/opt/cisco/libfabric/lib  -lmpi -lopen-rte -lopen-pal -lm -ldl -lz -lrt 
-lutil -lfabric -lnl-3

Also note that even if Open MPI's libraries are static, the others are not.  So 
the above command line still results in a hello_c executable that links against 
some shared libraries:

$ ldd hello_c
linux-vdso.so.1 =>  (0x2aacd000)
libm.so.6 => /lib64/libm.so.6 (0x2accf000)
libdl.so.2 => /lib64/libdl.so.2 (0x2afd1000)
libz.so.1 => /lib64/libz.so.1 (0x2b1d5000)
librt.so.1 => /lib64/librt.so.1 (0x2b3eb000)
libutil.so.1 => /lib64/libutil.so.1 (0x2b5f3000)
libfabric.so.1 => /opt/cisco/libfabric/lib/libfabric.so.1 
(0x2b7f6000)
libnl-3.so.200 => /lib64/libnl-3.so.200 (0x2ba57000)
libpthread.so.0 => /lib64/libpthread.so.0 (0x2bc78000)
libc.so.6 => /lib64/libc.so.6 (0x2be94000)
/lib64/ld-linux-x86-64.so.2 (0x2aaab000)
libnl.so.1 => /lib64/libnl.so.1 (0x2c261000)
libatomic.so.1 => /cm/shared/apps/gcc/8.2.0/lib/../lib64/libatomic.so.1 
(0x2c4b4000)

NOTE: You can try using "-static" to get a fully, 100% statically-linked 
"hello_c" executable, but I don't know if RHEL7 has static versions of all of 
those libraries.  And depending on your goals, it may not be worth it to go 
down this road.
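
If you do want to try it, a minimal sketch would look something like the
following (assuming the static counterparts of those libraries -- e.g.
glibc-static and a static libnl -- are actually installed; "-static" is
passed straight through to the underlying compiler):

$ mpicc hello_c.c -o hello_c -static -L/opt/cisco/libfabric/lib -lfabric -lnl-3
$ ldd hello_c    # if the link succeeded, this should report "not a dynamic executable"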

It's sorta complicated to explain (ask me if you care / want more detail), but 
the general reason why the wrapper compilers behave this way is that they are 
geared towards shared library linkage.  If you want more static linkage than 
this, you can use some of the alternate mechanisms to compile Open MPI 
applications -- see the FAQ:

https://www.open-mpi.org/faq/?category=mpi-apps#cant-use-wrappers
https://www.open-mpi.org/faq/?category=mpi-apps#default-wrapper-compiler-flags
https://www.open-mpi.org/faq/?category=mpi-apps#static-mpi-apps
https://www.open-mpi.org/faq/?category=mpi-apps#static-ofa-mpi-apps
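
For example, the wrapper can simply report its flags and you can assemble your 
own compile/link line, putting the extra libraries wherever you need them (a 
sketch; the exact flags will differ on your install):

$ mpicc --showme:compile
$ mpicc --showme:link
$ gcc hello_c.c -o hello_c $(mpicc --showme:compile) $(mpicc --showme:link) -lfabric -lnl-3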

It's not listed on the FAQ (oops!), but Open MPI also installs pkg-config files 
if you want to retrieve the relevant compiler / linker flags that way:

$ export PKG_CONFIG_PATH=$bogus/lib/pkgconfig
$ pkg-config ompi-c --cflags
-pthread -I/home/jsquyres/bogus/include 
...etc.
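
So, as a sketch, you could compile against those flags directly (ompi-c.pc is 
the C-bindings pkg-config file installed under lib/pkgconfig; the trailing 
-lfabric -lnl-3 are the same additions as above):

$ gcc hello_c.c -o hello_c $(pkg-config ompi-c --cflags) $(pkg-config ompi-c --libs) -lfabric -lnl-3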


