Hi all,

We discussed this issue with Intel compiler support, and it looks like they now know what the issue is and how to guard against it. It is a known issue resulting from a backwards incompatibility in an OS/glibc update, cf. https://sourceware.org/bugzilla/show_bug.cgi?id=20019

Affected versions of the Intel compilers: 16.0.3, 16.0.4
Not affected versions: 16.0.2, 17.0

So simply do not use the affected versions (and hope for a bugfix update in the 16.x series if, like us, you cannot immediately upgrade to 17.x, even though upgrading is the option Intel prefers).
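
For anyone who wants to script this check, the version lists above can be sketched as a small shell helper. This is only a sketch: the function name classify_intel_version is made up for illustration, and the commented usage line assumes that `ifort --version` prints the version as the third field of its first line.

```shell
#!/bin/sh
# Classify an Intel compiler version string against the lists in this thread.
# Affected: 16.0.3, 16.0.4. Not affected: 16.0.2, 17.0. Anything else: unknown.
# (classify_intel_version is a hypothetical helper name, not an Intel tool.)
classify_intel_version() {
    case "$1" in
        16.0.3|16.0.3.*|16.0.4|16.0.4.*) echo "affected" ;;
        16.0.2|16.0.2.*|17.0|17.0.*)     echo "not affected" ;;
        *)                               echo "unknown" ;;
    esac
}

# Possible use, ASSUMING ifort is on PATH and the version is the third field
# of the first line of `ifort --version`:
# classify_intel_version "$(ifort --version 2>/dev/null | awk 'NR==1 {print $3}')"
```

Versions not named in this thread are reported as "unknown" rather than guessed at.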

Have a nice Christmas time!

Paul Kapinos

On 12/14/16 13:29, Paul Kapinos wrote:
Hello all,
we seem to be running into the same issue: 'mpif90' segfaults immediately for Open MPI
1.10.4 compiled using Intel compilers 16.0.4.258 and 16.0.3.210, while it works
fine when compiled with 16.0.2.181.

It seems to be a compiler issue (more exactly, an issue in the libraries delivered
with versions 16.0.4.258 and 16.0.3.210). Switching the loaded compiler version
back to 16.0.2.181 (which changes the dynamically loaded libs) lets the
previously failing binary (compiled with the newer compilers) work properly.

Compiling with -O0 does not help. As the issue is likely in the Intel libs (as
noted, swapping these makes the issue appear or disappear), we will fall back to
compiler version 16.0.2.181. We will try to open a case with Intel - let's see...

Have a nice day,

Paul Kapinos



On 05/06/16 14:10, Jeff Squyres (jsquyres) wrote:
Ok, good.

I asked that question because typically when we see errors like this, it is
usually either a busted compiler installation or inadvertently mixing the
run-times of multiple different compilers in some kind of incompatible way.
Specifically, the mpifort (aka mpif90) application is a fairly simple program
-- there's no reason it should segv, especially with a stack trace that you
sent that implies that it's dying early in startup, potentially even before it
has hit any Open MPI code (i.e., it could even be pre-main).

BTW, you might be able to get a more complete stack trace from the debugger
that comes with the Intel compiler (idb?  I don't remember offhand).

Since you are able to run simple programs compiled by this compiler, it sounds
like the compiler is working fine.  Good!

The next thing to check is to see if somehow the compiler and/or run-time
environments are getting mixed up.  E.g., the apps were compiled for one
compiler/run-time but are being used with another.  Also ensure that any
compiler/linker flags that you are passing to Open MPI's configure script are
native and correct for the platform for which you're compiling (e.g., don't
pass in flags that optimize for a different platform; that may result in
generating machine code instructions that are invalid for your platform).
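
One rough way to check for mixed run-times is to look at where the Intel run-time libraries are resolved from. The helper below is a sketch, not an Open MPI tool: the name intel_runtime_dirs and the list of library name prefixes are assumptions, and it expects unwrapped `ldd` output on stdin.

```shell
#!/bin/sh
# Read `ldd` output on stdin and print the distinct directories that provide
# Intel run-time libraries. More than one directory hints at mixed run-times.
# (intel_runtime_dirs and the prefix list below are illustrative assumptions.)
intel_runtime_dirs() {
    awk '$2 == "=>" && $1 ~ /^lib(imf|svml|irng|intlc|ifcore|ifport|iomp5)\./ {
             path = $3
             sub("/[^/]*$", "", path)   # strip the file name, keep the directory
             print path
         }' | sort -u
}
```

For example: `ldd /opt/openmpi/1.10.2/intel/16.0.3/bin/mpif90 | intel_runtime_dirs`. A single directory is what one would hope to see; several directories would suggest that libraries from different compiler installations are being picked up.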

Try recompiling/re-installing Open MPI from scratch, and if it still doesn't
work, then send all the information listed here:

    https://www.open-mpi.org/community/help/


On May 6, 2016, at 3:45 AM, Giacomo Rossi <giacom...@gmail.com> wrote:

Yes, I've tried three simple "Hello world" programs in Fortran, C and C++, and
they compile and run with Intel 16.0.3. The problem is with the Open MPI
compiled from source.

Giacomo Rossi Ph.D., Space Engineer

Research Fellow at Dept. of Mechanical and Aerospace Engineering, "Sapienza"
University of Rome
p: (+39) 0692927207 | m: (+39) 3408816643 | e: giacom...@gmail.com

Member of Fortran-FOSS-programmers


2016-05-05 11:15 GMT+02:00 Giacomo Rossi <giacom...@gmail.com>:
 gdb /opt/openmpi/1.10.2/intel/16.0.3/bin/mpif90
GNU gdb (GDB) 7.11
Copyright (C) 2016 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law.  Type "show copying"
and "show warranty" for details.
This GDB was configured as "x86_64-pc-linux-gnu".
Type "show configuration" for configuration details.
For bug reporting instructions, please see:
<http://www.gnu.org/software/gdb/bugs/>.
Find the GDB manual and other documentation resources online at:
<http://www.gnu.org/software/gdb/documentation/>.
For help, type "help".
Type "apropos word" to search for commands related to "word"...
Reading symbols from /opt/openmpi/1.10.2/intel/16.0.3/bin/mpif90...(no debugging symbols found)...done.
(gdb) r -v
Starting program: /opt/openmpi/1.10.2/intel/16.0.3/bin/mpif90 -v

Program received signal SIGSEGV, Segmentation fault.
0x00007ffff6858f38 in ?? ()
(gdb) bt
#0  0x00007ffff6858f38 in ?? ()
#1  0x00007ffff7de5828 in _dl_relocate_object () from /lib64/ld-linux-x86-64.so.2
#2  0x00007ffff7ddcfa3 in dl_main () from /lib64/ld-linux-x86-64.so.2
#3  0x00007ffff7df029c in _dl_sysdep_start () from /lib64/ld-linux-x86-64.so.2
#4  0x00007ffff7dddd4a in _dl_start () from /lib64/ld-linux-x86-64.so.2
#5  0x00007ffff7dd9d98 in _start () from /lib64/ld-linux-x86-64.so.2
#6  0x0000000000000002 in ?? ()
#7  0x00007fffffffaa8a in ?? ()
#8  0x00007fffffffaab6 in ?? ()
#9  0x0000000000000000 in ?? ()



2016-05-05 10:44 GMT+02:00 Giacomo Rossi <giacom...@gmail.com>:
Here is the result of the ldd command:
'ldd /opt/openmpi/1.10.2/intel/16.0.3/bin/mpif90
    linux-vdso.so.1 (0x00007ffcacbbe000)
    libopen-pal.so.13 => /opt/openmpi/1.10.2/intel/16.0.3/lib/libopen-pal.so.13 (0x00007fa9597a9000)
    libm.so.6 => /usr/lib/libm.so.6 (0x00007fa9594a4000)
    libpciaccess.so.0 => /usr/lib/libpciaccess.so.0 (0x00007fa95929a000)
    libdl.so.2 => /usr/lib/libdl.so.2 (0x00007fa959096000)
    librt.so.1 => /usr/lib/librt.so.1 (0x00007fa958e8e000)
    libutil.so.1 => /usr/lib/libutil.so.1 (0x00007fa958c8b000)
    libgcc_s.so.1 => /usr/lib/libgcc_s.so.1 (0x00007fa958a75000)
    libpthread.so.0 => /usr/lib/libpthread.so.0 (0x00007fa958858000)
    libc.so.6 => /usr/lib/libc.so.6 (0x00007fa9584b7000)
    libimf.so => /home/giacomo/intel/compilers_and_libraries_2016.3.210/linux/compiler/lib/intel64/libimf.so (0x00007fa957fb9000)
    libsvml.so => /home/giacomo/intel/compilers_and_libraries_2016.3.210/linux/compiler/lib/intel64/libsvml.so (0x00007fa9570ad000)
    libirng.so => /home/giacomo/intel/compilers_and_libraries_2016.3.210/linux/compiler/lib/intel64/libirng.so (0x00007fa956d3b000)
    libintlc.so.5 => /home/giacomo/intel/compilers_and_libraries_2016.3.210/linux/compiler/lib/intel64/libintlc.so.5 (0x00007fa956acf000)
    /lib64/ld-linux-x86-64.so.2 (0x00007fa959ab9000)'

I can't provide a core file, because I can't compile or launch any program
with mpifort... I always get the error 'core dumped', even when I try to
compile a program with mpifort, and of course no core file is actually written.




2016-05-05 8:50 GMT+02:00 Giacomo Rossi <giacom...@gmail.com>:
I’ve installed the latest version of Intel Parallel Studio (16.0.3), then
I’ve downloaded the latest version of openmpi (1.10.2) and I’ve compiled it with

`./configure CC=icc CXX=icpc F77=ifort FC=ifort
--prefix=/opt/openmpi/1.10.2/intel/16.0.3`

then I installed it and everything seemed OK, but when I try the simple command

' /opt/openmpi/1.10.2/intel/16.0.3/bin/mpif90 -v'

I receive the following error

'Segmentation fault (core dumped)'

I'm on Arch Linux, with kernel 4.5.1-1-ARCH; I've attached to this email the
config.log file, compressed with bzip2.

Any help will be appreciated!









_______________________________________________
users mailing list
us...@open-mpi.org
Subscription: https://www.open-mpi.org/mailman/listinfo.cgi/users
Link to this post:
http://www.open-mpi.org/community/lists/users/2016/05/29108.php






--
Dipl.-Inform. Paul Kapinos   -   High Performance Computing,
RWTH Aachen University, IT Center
Seffenter Weg 23,  D 52074  Aachen (Germany)
Tel: +49 241/80-24915

