https://gcc.gnu.org/bugzilla/show_bug.cgi?id=66640
Bug ID: 66640
Summary: Symbolic (addr2line) backtrace handler sometimes does not terminate when using OpenMP
Product: gcc
Version: 5.1.1
Status: UNCONFIRMED
Severity: normal
Priority: P3
Component: fortran
Assignee: unassigned at gcc dot gnu.org
Reporter: bugs at stellardeath dot org
Target Milestone: ---

Symbolic backtraces seem to be implemented by a fork()/execve() to addr2line.
When this is done from within an OpenMP parallel region, the fork()ed
addr2line somehow never terminates and the program hangs forever in the
backtrace.

Small example program that triggers a divide-by-zero:

#######################################################
program test
  use, intrinsic :: iso_c_binding
  real(kind=C_DOUBLE) :: a
  integer i

  !$omp parallel private(a)
  a = 2.0_C_DOUBLE
  do i = 2, 0, -1
    a = a / i
  end do
  write(*,*) a
  !$omp end parallel
end program
#######################################################

Compile with

gfortran -g -fopenmp test.F90 -ffpe-trap=zero

With one thread it produces a backtrace and terminates, as expected:

#######################################################
$> OMP_NUM_THREADS=1 ./test

Program received signal SIGFPE: Floating-point exception - erroneous arithmetic operation.

Backtrace for this error:
#0  0x7F76637514B7
#1  0x7F76637506B0
#2  0x7F766282D43F
#3  0x400A6D in MAIN__._omp_fn.0 at test.F90:9
#4  0x400985 in test at test.F90:6
[1]    17635 floating point exception  OMP_NUM_THREADS=1 ./test
$>
#######################################################

With more than one thread it _sometimes_ does not terminate (here provoked by
rerunning it in a "while true" loop until it hangs). The output of the two
threads is interleaved:

#######################################################
$> while true; do clear; OMP_NUM_THREADS=2 ./test; done
^L
Program received signal SIGFPE: Floating-point exception - erroneous arithmetic operation.

Backtrace for this error:

Program received signal SIGFPE: Floating-point exception - erroneous arithmetic operation.

Backtrace for this error:
#0  0x7F7B3D2834B7
#1  0x7F7B3D2826B0
#2  0x7F7B3C35F43F
#0  0x7F7B3D2834B7
#1  0x7F7B3D2826B0
#3  0x400A6D in MAIN__._omp_fn.0 at test.F90:9
#2  0x7F7B3C35F43F
#4  0x7F7B3CD4FFAD
#5  0x7F7B3C6D4483
#6  0x7F7B3C412A4C
#3  0x400A6D in MAIN__._omp_fn.0 at test.F90:9
#7  0xFFFFFFFFFFFFFFFF
#4  0x400985 in test at test.F90:6
[Hangs here]
#######################################################

This might also be interesting: both addr2line children are blocked reading
from stdin, while both threads of the test program sit in wait():

#######################################################
$> ps uf | grep -n "addr2line\|test"
12:lorenz   18346  0.0  0.0  29532  1392 pts/4    Sl+  15:10   0:00  \_ ./test
13:lorenz   18348  0.0  0.0  13832  2000 pts/4    S+   15:10   0:00      \_ /usr/bin/addr2line -e /home/lorenz/dev/addr2line_bug/test -f -s -C
14:lorenz   18349  0.0  0.0  13832  2048 pts/4    S+   15:10   0:00      \_ /usr/bin/addr2line -e /home/lorenz/dev/addr2line_bug/test -f -s -C
$>
$> gdb -p 18348 -ex bt -ex detach -ex q 2>/dev/null | tail -n 13
#0  0x00007f04bb171580 in __read_nocancel () at ../sysdeps/unix/syscall-template.S:81
#1  0x00007f04bb109f00 in _IO_new_file_underflow (fp=0x7f04bb4324c0 <_IO_2_1_stdin_>) at fileops.c:580
#2  0x00007f04bb10ad6e in __GI__IO_default_uflow (fp=0x7f04bb4324c0 <_IO_2_1_stdin_>) at genops.c:426
#3  0x00007f04bb0ffc94 in __GI__IO_getline_info (fp=fp@entry=0x7f04bb4324c0 <_IO_2_1_stdin_>, buf=buf@entry=0x7ffdd8139e60 'F' <repeats 16 times>, "\n", n=99, delim=delim@entry=10, extract_delim=extract_delim@entry=1, eof=eof@entry=0x0) at iogetline.c:69
#4  0x00007f04bb0ffd88 in __GI__IO_getline (fp=fp@entry=0x7f04bb4324c0 <_IO_2_1_stdin_>, buf=buf@entry=0x7ffdd8139e60 'F' <repeats 16 times>, "\n", n=<optimized out>, delim=delim@entry=10, extract_delim=extract_delim@entry=1) at iogetline.c:38
#5  0x00007f04bb0fec34 in _IO_fgets (buf=0x7ffdd8139e60 'F' <repeats 16 times>, "\n", n=<optimized out>, fp=0x7f04bb4324c0 <_IO_2_1_stdin_>) at iofgets.c:56
#6  0x000000000040230b in ?? ()
#7  0x00007f04bb0b78c5 in __libc_start_main (main=0x401fc0, argc=6, argv=0x7ffdd8139fe8, init=<optimized out>, fini=<optimized out>, rtld_fini=<optimized out>, stack_end=0x7ffdd8139fd8) at libc-start.c:289
#8  0x0000000000402855 in ?? ()
Detaching from program: /usr/bin/addr2line, process 18348
$>
$> gdb -p 18346 -ex bt -ex "info threads" -ex detach -ex q 2>/dev/null | tail -n 15
0x00007fda31d43f44 in __libc_wait (stat_loc=0x0) at ../sysdeps/unix/sysv/linux/wait.c:35
#0  0x00007fda31d43f44 in __libc_wait (stat_loc=0x0) at ../sysdeps/unix/sysv/linux/wait.c:35
#1  0x00007fda328eb4e9 in _gfortrani_backtrace () at ../../../libgfortran/runtime/backtrace.c:263
#2  0x00007fda328ea6b1 in _gfortrani_backtrace_handler (signum=8) at ../../../libgfortran/runtime/compile_options.c:129
#3  <signal handler called>
#4  0x0000000000400a6d in MAIN__._omp_fn.0 () at test.F90:9
#5  0x0000000000400986 in test () at test.F90:6
#6  0x00000000004009cb in main (argc=1, argv=0x7ffc70796e8f) at test.F90:15
#7  0x00007fda319b48c5 in __libc_start_main (main=0x40098d <main>, argc=1, argv=0x7ffc707969c8, init=<optimized out>, fini=<optimized out>, rtld_fini=<optimized out>, stack_end=0x7ffc707969b8) at libc-start.c:289
#8  0x0000000000400899 in _start () at ../sysdeps/x86_64/start.S:118
  Id   Target Id         Frame
  2    Thread 0x7fda3178f700 (LWP 18347) "test" 0x00007fda31d43f44 in __libc_wait (stat_loc=0x0) at ../sysdeps/unix/sysv/linux/wait.c:35
* 1    Thread 0x7fda32dd8780 (LWP 18346) "test" 0x00007fda31d43f44 in __libc_wait (stat_loc=0x0) at ../sysdeps/unix/sysv/linux/wait.c:35
Detaching from program: /home/lorenz/dev/addr2line_bug/test, process 18346
$>
#######################################################

I would guess there is some thread-unsafety in
libgfortran/runtime/backtrace.c?

Kind regards,
Lorenz
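
Note on a possible mechanism: the traces above (each addr2line blocked in fgets() on stdin, each thread of the parent blocked in wait()) would be consistent with a leaked pipe descriptor. This is an assumption, not something confirmed in this report. If two threads each create a pipe and then both fork(), each child inherits a copy of the write end of the other thread's pipe, so neither child ever sees EOF on its stdin. The sketch below reproduces that pattern in isolation. It does not use libgfortran or addr2line; `run_pair` and `read_until_eof` are hypothetical names, and a single process plays the role of both threads.

```c
#include <signal.h>
#include <stdio.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Stand-in for addr2line: read the pipe until EOF, then exit. */
static void read_until_eof(int fd)
{
    char buf[64];
    while (read(fd, buf, sizeof buf) > 0)
        ;
    _exit(0);
}

/* Hypothetical demo helper (not libgfortran code).
 * Creates two pipes, then forks two reader children, mimicking two
 * threads that both called pipe() before either called fork().
 * If close_foreign is 0, each child keeps the inherited write end of
 * the OTHER pipe open. Returns how many children exited within about
 * one second, or -1 on setup failure. */
int run_pair(int close_foreign)
{
    int p[2][2];
    pid_t pid[2] = {-1, -1};
    int done[2] = {0, 0};
    int i, exited = 0;

    if (pipe(p[0]) != 0)
        return -1;
    if (pipe(p[1]) != 0) {
        close(p[0][0]);
        close(p[0][1]);
        return -1;
    }

    for (i = 0; i < 2; i++) {
        pid[i] = fork();
        if (pid[i] < 0)
            break;
        if (pid[i] == 0) {
            int other = 1 - i;
            close(p[i][1]);          /* own write end: always closed    */
            close(p[other][0]);      /* other pipe's read end: unused   */
            if (close_foreign)
                close(p[other][1]);  /* the descriptor that may leak    */
            read_until_eof(p[i][0]); /* never returns                   */
        }
    }

    /* Parent closes every descriptor, as a handler would when done. */
    for (i = 0; i < 2; i++) {
        close(p[i][0]);
        close(p[i][1]);
    }

    sleep(1);

    /* First record who has exited, then kill stragglers, so that
     * killing one child cannot unblock the other before it is counted. */
    for (i = 0; i < 2; i++) {
        if (pid[i] > 0) {
            done[i] = (waitpid(pid[i], NULL, WNOHANG) == pid[i]);
            exited += done[i];
        }
    }
    for (i = 0; i < 2; i++) {
        if (pid[i] > 0 && !done[i]) {
            kill(pid[i], SIGKILL);
            waitpid(pid[i], NULL, 0);
        }
    }
    return (pid[0] > 0 && pid[1] > 0) ? exited : -1;
}

#ifdef PIPE_DEMO_MAIN
int main(void)
{
    printf("foreign write end kept open: %d of 2 children exited\n", run_pair(0));
    printf("foreign write end closed:    %d of 2 children exited\n", run_pair(1));
    return 0;
}
#endif
```

Built with `-DPIPE_DEMO_MAIN`, the first call is expected to leave both children blocked in read() and the second to let both exit. That matches the observed hang only if backtrace.c really creates its pipes without guarding against a concurrent fork() in another thread, which would need to be checked against the source.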