Hi all,

we discussed this issue with Intel compiler support, and it looks like they now know what the issue is and how to guard against it. It is a known issue resulting from a backward incompatibility in an OS/glibc update, cf. https://sourceware.org/bugzilla/show_bug.cgi?id=20019
Affected versions of the Intel compilers: 16.0.3, 16.0.4
Not affected versions: 16.0.2, 17.0

So, simply do not use the affected versions (and hope for a bugfix update in the 16.x series if, like us, you cannot immediately upgrade to 17.x, even though upgrading is the option Intel prefers).
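For anyone who wants to encode the lists above in a build or module-loading script, here is a minimal Python sketch. The helper name `is_affected` is hypothetical, and the sets contain only the four release series named in this mail; any other version yields None rather than a guess.

```python
# Release series named in this mail; anything else is unknown, not "safe".
AFFECTED = {(16, 0, 3), (16, 0, 4)}
NOT_AFFECTED = {(16, 0, 2), (17, 0)}


def is_affected(version):
    """Return True/False for the series listed above, None if unknown.

    `version` is a dotted numeric string such as "16.0.4.258".
    Hypothetical helper; a non-numeric component raises ValueError.
    """
    parts = tuple(int(p) for p in version.split("."))
    if parts[:3] in AFFECTED:
        return True
    # 16.0.2 is matched on three components, 17.0 on two.
    if parts[:3] in NOT_AFFECTED or parts[:2] in NOT_AFFECTED:
        return False
    return None
```

With the build numbers mentioned in this thread, `is_affected("16.0.3.210")` and `is_affected("16.0.4.258")` return True, while `is_affected("16.0.2.181")` returns False.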
Have a nice Christmas time!

Paul Kapinos

On 12/14/16 13:29, Paul Kapinos wrote:
Hello all,

we seem to run into the same issue: 'mpif90' segfaults immediately for Open MPI 1.10.4 compiled using Intel compilers 16.0.4.258 and 16.0.3.210, while it works fine when compiled with 16.0.2.181.

It seems to be a compiler issue (more exactly: an issue in the libraries delivered with the 16.0.4.258 and 16.0.3.210 versions). Switching the loaded compiler version back to 16.0.2.181 (and thus changing the dynamically loaded libs) lets the previously failing binary (compiled with the newer compilers) work properly.

Compiling with -O0 does not help. As the issue is likely in the Intel libs (as said, swapping them resolves or triggers the issue), we will fall back to the 16.0.2.181 compiler version. We will try to open a case with Intel - let's see...

Have a nice day,

Paul Kapinos

On 05/06/16 14:10, Jeff Squyres (jsquyres) wrote:

Ok, good. I asked that question because when we see errors like this, it is usually either a busted compiler installation or inadvertently mixing the run-times of multiple different compilers in some kind of incompatible way.

Specifically, the mpifort (aka mpif90) application is a fairly simple program -- there's no reason it should segv, especially since the stack trace that you sent implies that it's dying early in startup, potentially even before it has hit any Open MPI code (i.e., it could even be pre-main).

BTW, you might be able to get a more complete stack trace from the debugger that comes with the Intel compiler (idb? I don't remember offhand).

Since you are able to run simple programs compiled by this compiler, it sounds like the compiler is working fine. Good!

The next thing to check is to see if somehow the compiler and/or run-time environments are getting mixed up. E.g., the apps were compiled for one compiler/run-time but are being used with another.
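One way to look for mixed run-times is to see which directories the dynamic linker resolves the Intel libraries from. The following is only a sketch in Python. The function name `intel_runtime_versions` is hypothetical, and it assumes the usual `name => path (address)` layout of `ldd` output and that Intel run-time paths contain `compilers_and_libraries_<version>`, as in the ldd output quoted in this thread. More than one key in the result would suggest that run-times are being mixed.

```python
import re

# Assumption: Intel run-time libraries live under a directory named
# compilers_and_libraries_<version>, as in the paths quoted in this thread.
INTEL_DIR = re.compile(r"compilers_and_libraries_([0-9.]+)/")


def intel_runtime_versions(ldd_output):
    """Map each Intel version directory to the libraries resolved from it."""
    found = {}
    for line in ldd_output.splitlines():
        name, sep, rest = line.partition("=>")
        if not sep:
            continue  # lines without "=>", e.g. linux-vdso or the loader
        match = INTEL_DIR.search(rest)
        if match:
            found.setdefault(match.group(1), []).append(name.strip())
    return found


# Three lines taken from the ldd output posted in this thread.
SAMPLE = """\
libc.so.6 => /usr/lib/libc.so.6 (0x00007fa9584b7000)
libimf.so => /home/giacomo/intel/compilers_and_libraries_2016.3.210/linux/compiler/lib/intel64/libimf.so (0x00007fa957fb9000)
libintlc.so.5 => /home/giacomo/intel/compilers_and_libraries_2016.3.210/linux/compiler/lib/intel64/libintlc.so.5 (0x00007fa956acf000)
"""

versions = intel_runtime_versions(SAMPLE)
print(versions)
```

To run this against a real binary, the text could be obtained with `subprocess.run(["ldd", path], capture_output=True, text=True).stdout`. Note that a single version in the result only rules out one kind of mix-up; it says nothing about whether that run-time itself is broken.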
Also ensure that any compiler/linker flags that you are passing to Open MPI's configure script are native and correct for the platform for which you're compiling (e.g., don't pass in flags that optimize for a different platform; that may result in generating machine code instructions that are invalid for your platform).

Try recompiling/re-installing Open MPI from scratch, and if it still doesn't work, then send all the information listed here:

https://www.open-mpi.org/community/help/

On May 6, 2016, at 3:45 AM, Giacomo Rossi <giacom...@gmail.com> wrote:

Yes, I've tried three simple "Hello world" programs in Fortran, C and C++, and they compile and run with Intel 16.0.3. The problem is with the Open MPI compiled from source.

Giacomo Rossi Ph.D., Space Engineer
Research Fellow at Dept. of Mechanical and Aerospace Engineering, "Sapienza" University of Rome
p: (+39) 0692927207 | m: (+39) 3408816643 | e: giacom...@gmail.com
Member of Fortran-FOSS-programmers

2016-05-05 11:15 GMT+02:00 Giacomo Rossi <giacom...@gmail.com>:

gdb /opt/openmpi/1.10.2/intel/16.0.3/bin/mpif90
GNU gdb (GDB) 7.11
Copyright (C) 2016 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law.
Type "show copying" and "show warranty" for details.
This GDB was configured as "x86_64-pc-linux-gnu".
Type "show configuration" for configuration details.
For bug reporting instructions, please see:
<http://www.gnu.org/software/gdb/bugs/>.
Find the GDB manual and other documentation resources online at:
<http://www.gnu.org/software/gdb/documentation/>.
For help, type "help".
Type "apropos word" to search for commands related to "word"...
Reading symbols from /opt/openmpi/1.10.2/intel/16.0.3/bin/mpif90...(no debugging symbols found)...done.
(gdb) r -v
Starting program: /opt/openmpi/1.10.2/intel/16.0.3/bin/mpif90 -v

Program received signal SIGSEGV, Segmentation fault.
0x00007ffff6858f38 in ?? ()
(gdb) bt
#0  0x00007ffff6858f38 in ?? ()
#1  0x00007ffff7de5828 in _dl_relocate_object () from /lib64/ld-linux-x86-64.so.2
#2  0x00007ffff7ddcfa3 in dl_main () from /lib64/ld-linux-x86-64.so.2
#3  0x00007ffff7df029c in _dl_sysdep_start () from /lib64/ld-linux-x86-64.so.2
#4  0x00007ffff7dddd4a in _dl_start () from /lib64/ld-linux-x86-64.so.2
#5  0x00007ffff7dd9d98 in _start () from /lib64/ld-linux-x86-64.so.2
#6  0x0000000000000002 in ?? ()
#7  0x00007fffffffaa8a in ?? ()
#8  0x00007fffffffaab6 in ?? ()
#9  0x0000000000000000 in ?? ()

Giacomo Rossi

2016-05-05 10:44 GMT+02:00 Giacomo Rossi <giacom...@gmail.com>:

Here is the result of the ldd command:

ldd /opt/openmpi/1.10.2/intel/16.0.3/bin/mpif90
linux-vdso.so.1 (0x00007ffcacbbe000)
libopen-pal.so.13 => /opt/openmpi/1.10.2/intel/16.0.3/lib/libopen-pal.so.13 (0x00007fa9597a9000)
libm.so.6 => /usr/lib/libm.so.6 (0x00007fa9594a4000)
libpciaccess.so.0 => /usr/lib/libpciaccess.so.0 (0x00007fa95929a000)
libdl.so.2 => /usr/lib/libdl.so.2 (0x00007fa959096000)
librt.so.1 => /usr/lib/librt.so.1 (0x00007fa958e8e000)
libutil.so.1 => /usr/lib/libutil.so.1 (0x00007fa958c8b000)
libgcc_s.so.1 => /usr/lib/libgcc_s.so.1 (0x00007fa958a75000)
libpthread.so.0 => /usr/lib/libpthread.so.0 (0x00007fa958858000)
libc.so.6 => /usr/lib/libc.so.6 (0x00007fa9584b7000)
libimf.so => /home/giacomo/intel/compilers_and_libraries_2016.3.210/linux/compiler/lib/intel64/libimf.so (0x00007fa957fb9000)
libsvml.so => /home/giacomo/intel/compilers_and_libraries_2016.3.210/linux/compiler/lib/intel64/libsvml.so (0x00007fa9570ad000)
libirng.so => /home/giacomo/intel/compilers_and_libraries_2016.3.210/linux/compiler/lib/intel64/libirng.so (0x00007fa956d3b000)
libintlc.so.5 => /home/giacomo/intel/compilers_and_libraries_2016.3.210/linux/compiler/lib/intel64/libintlc.so.5 (0x00007fa956acf000)
/lib64/ld-linux-x86-64.so.2 (0x00007fa959ab9000)

I can't provide a core file, because I can't compile or launch any program with mpifort... I always get the 'core dumped' error, even when I only try to compile a program with mpifort, and of course there isn't any core file.

Giacomo Rossi

2016-05-05 8:50 GMT+02:00 Giacomo Rossi <giacom...@gmail.com>:

I’ve installed the latest version of Intel Parallel Studio (16.0.3), then I’ve downloaded the latest version of Open MPI (1.10.2) and compiled it with

`./configure CC=icc CXX=icpc F77=ifort FC=ifort --prefix=/opt/openmpi/1.10.2/intel/16.0.3`

Then I installed it and everything seemed OK, but when I try the simple command

/opt/openmpi/1.10.2/intel/16.0.3/bin/mpif90 -v

I receive the following error:

Segmentation fault (core dumped)

I'm on ArchLinux, with kernel 4.5.1-1-ARCH; I've attached the config.log file, compressed with bzip2, to this email.

Any help will be appreciated!

Giacomo Rossi

_______________________________________________
users mailing list
us...@open-mpi.org
Subscription: https://www.open-mpi.org/mailman/listinfo.cgi/users
Link to this post: http://www.open-mpi.org/community/lists/users/2016/05/29108.php
--
Dipl.-Inform. Paul Kapinos - High Performance Computing,
RWTH Aachen University, IT Center
Seffenter Weg 23, D 52074 Aachen (Germany)
Tel: +49 241/80-24915
_______________________________________________
users mailing list
users@lists.open-mpi.org
https://rfd.newmexicoconsortium.org/mailman/listinfo/users