[Bug fortran/96158] New: Symbols not emitted for module common variables
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=96158

            Bug ID: 96158
           Summary: Symbols not emitted for module common variables
           Product: gcc
           Version: 9.3.0
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: fortran
          Assignee: unassigned at gcc dot gnu.org
          Reporter: amelvill at umich dot edu
  Target Milestone: ---

Created attachment 48859
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=48859&action=edit
A tarball archive demonstrating the problem

I am trying to debug a Fortran program that uses common statements like this:

    module somemodule
        integer*8 moduleVar
        !Comment out the line below to make "call moduleVar" start working again
        common /othermodule/ moduleVar
    end module

... but it seems like gfortran does not emit debugging symbols that GDB can use.

See this github repo for a minimal example:
https://github.com/amelvill-umich/Fortran_GDB_Common

I have also attached a zip file with the sources (and a README that goes into
more detail, with GDB commands and output), if you prefer.

- With the common statement, I can access the variable in Fortran code, but GDB
  is not able to read its value.
- Manually inspecting somemodule.o with a hex editor, I notice that the
  .debug_str section seems to have information that might identify the common
  variable, but it doesn't appear to be used.

I almost filed a bug report with GDB about this, until I noticed that gfortran
emits only othermodule_ as a symbol for the othermodule common variable, which
didn't seem right. It doesn't seem like GDB would be able to provide very useful
information with only that as a symbol.

But, if this is the expected behavior let me know if you have any suggestions
for debugging variables like this (or if you think this is a bug/limitation in
GDB).

Additional information:

    $ gfortran -v
    Using built-in specs.
    COLLECT_GCC=gfortran
    COLLECT_LTO_WRAPPER=/usr/lib/gcc/x86_64-linux-gnu/9/lto-wrapper
    OFFLOAD_TARGET_NAMES=nvptx-none:hsa
    OFFLOAD_TARGET_DEFAULT=1
    Target: x86_64-linux-gnu
    Configured with: ../src/configure -v --with-pkgversion='Ubuntu 9.3.0-10ubuntu2' --with-bugurl=file:///usr/share/doc/gcc-9/README.Bugs --enable-languages=c,ada,c++,go,brig,d,fortran,objc,obj-c++,gm2 --prefix=/usr --with-gcc-major-version-only --program-suffix=-9 --program-prefix=x86_64-linux-gnu- --enable-shared --enable-linker-build-id --libexecdir=/usr/lib --without-included-gettext --enable-threads=posix --libdir=/usr/lib --enable-nls --enable-clocale=gnu --enable-libstdcxx-debug --enable-libstdcxx-time=yes --with-default-libstdcxx-abi=new --enable-gnu-unique-object --disable-vtable-verify --enable-plugin --enable-default-pie --with-system-zlib --with-target-system-zlib=auto --enable-objc-gc=auto --enable-multiarch --disable-werror --with-arch-32=i686 --with-abi=m64 --with-multilib-list=m32,m64,mx32 --enable-multilib --with-tune=generic --enable-offload-targets=nvptx-none,hsa --without-cuda-driver --enable-checking=release --build=x86_64-linux-gnu --host=x86_64-linux-gnu --target=x86_64-linux-gnu
    Thread model: posix
    gcc version 9.3.0 (Ubuntu 9.3.0-10ubuntu2)

The Makefile in the tarball/repository did not have -Wall -Wextra, but adding
these did not produce any additional information (no errors / warnings at all
from gfortran):

    $ make
    gfortran -c somemodule.f90 -g -Wall -Wextra
    gfortran -c main.f90 -g -Wall -Wextra
    gfortran -o main main.o somemodule.o -g -Wall -Wextra

I'm not aware of a way to reproduce this bug with a single file.

I mentioned this on the gfortran mailing list and somebody recommended that I
post this here.

Thanks,
- ajm
[Bug fortran/96158] Symbols not emitted for module common variables
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=96158

--- Comment #4 from AJM ---
>> I won't comment on the questionable programming idiom of placing
>> a common block in a module, which kind of defeats the niceties of
>> a module.

> If somebody wants to transition your code from using common blocks to
> modules, that is a good way to proceed. When all the direct usage
> of the common block have been removed, you can then remove the
> COMMON statement from the module.

This is the case, more or less. I didn't write the code that did this.

I would be quite happy to see the common blocks get moved to a module, but to
make things a bit more dangerous on that side, these variables are bound to a C
variable.

If I could find a way to move the common statements to a module that was
guaranteed to have no issues whatsoever (UB, data alignment, etc., including on
the C side), I would be happy to do it. However, I haven't found enough
information on this to feel confident about it (there are significant gaps in
the documentation I've found), so it seems irresponsible to risk introducing
bugs for the sake of getting GDB working.
[Bug fortran/96158] Symbols not emitted for module common variables
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=96158

--- Comment #5 from AJM ---
Also, in case it wasn't clear,

> Breakpoint 2, boo () at a.f90:9
> 9         write(*, '(A, I3)') "moduleVar=", n
> (gdb) p n
> $2 = 123
> (gdb) p moduleVar
> No symbol "moduleVar" in current context.
> (gdb) p (integer*8)othermodule_
> $3 = 123

I did find this as a workaround (it's in the repo README.md
https://github.com/amelvill-umich/Fortran_GDB_Common , if you haven't seen it),
but this is not ideal, because there are hundreds of common statement variables
in the module in question, of varying size and type. They're all reduced to a
single othermodule_ variable.

You're right that I could hypothetically find the address of each variable,
offset, and cast. I could also hypothetically make some kind of memory map type
that I could cast that variable to, but that would be a giant amount of
duplicated code and an extra thing to keep in sync whenever something is added.

As far as workarounds go, if it came to that I'd rather just make a dummy
"debug" function that copied these common variables into local variables.
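For readers following along, the "find the address, offset, and cast" workaround mentioned above can be sketched in C roughly as follows. This is only an illustration: the helper name read_i64_at is hypothetical, and the real offsets depend on the order, sizes, and padding of the variables in the common block, which this thread does not list.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: read an 8-byte integer stored `offset` bytes past
 * the start of a block of raw storage (for example, a common block).
 * memcpy is used so that no alignment or aliasing assumptions are made
 * about the underlying bytes. */
static int64_t read_i64_at(const void *base, size_t offset)
{
    int64_t value;

    assert(base != NULL);
    memcpy(&value, (const unsigned char *)base + offset, sizeof value);
    return value;
}
```

With offset 0 this corresponds to the GDB expression (integer*8)othermodule_ quoted above; every further variable would need its own hand-maintained offset, which is the maintenance burden the comment objects to.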
[Bug fortran/96158] Debug symbols not emitted for module common variables
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=96158

--- Comment #8 from AJM ---
> > >> I won't comment on the questionable programming idiom of placing
> > >> a common block in a module, which kind of defeats the niceties of
> > >> a module.

> > > If somebody wants to transition your code from using common blocks to
> > > modules, that is a good way to proceed. When all the direct usage
> > > of the common block have been removed, you can then remove the
> > > COMMON statement from the module.
> >
> > This is the case, more or less. I didn't write the code that did this.
> >
> > I would be quite happy to see the common blocks get moved to a module,
> > but to make things a bit more dangerous on that side, these variables
> > are bound to a C variable.

> Do you mean the variable is declared with BIND(C)?
> If so, you should be getting an error.

The bind statement would look like this, directly under the "common" statement,
inside somemodule.f90:

    module somemodule
        integer*8 moduleVar !then many more variables
        common /othermodule/ moduleVar !then many more variables
        bind(C, name="othermodule") :: /othermodule/
    end module

The program, with the bind statement added, compiles without errors.

If you really need to know, on the C side there is a struct with fields that
match the order and size of the variables in the common statement / module
declaration. I am almost certain that this is not the "right way" to do this,
and that there is some UB there in struct packing and compiler decisions on the
size of the variables (neither the C side nor the fortran side has packing
pragmas, and worse, the C side uses compiler defined integer types like "int").

Without a doubt this is not an ideal setup, a rewrite is probably in order to
ensure portability. We don't have the time resources for that rewrite right
now, though (and that's not my call, either).
While I was making a minimal example I originally thought bind was the root
cause, but I eliminated it as the source of the problem (I found that I could
reproduce the issue with only common). I removed it to try and make the most
minimal example possible.

This C side explanation is, in my opinion, not relevant at all to the bug at
hand and is our team's responsibility, not gfortran's. It's just to give you
context as to what I'm working with here and why "just make a module" is not
really an option. I would consider the C interfacing very precarious, and I
personally only have the expertise to make it portable from the C side, not the
Fortran side (though I'm trying to get up to speed on the latter).

Though, if anyone has any suggestions for documentation I can read about how to
avoid problems between g++ and gfortran that might come about from moving this
to a module, I would appreciate it. I'm hoping that eventually I can come up
with a path towards making this a module that would exactly replicate the
original's intended behavior in a standards compliant way without breaking
anything. That path is not clear at the moment.

> A quick scan of the DWARF5 standard does not show
> a DW_TAG that applies to a variable declared in
> a module.

Referring to the github repo quickly
https://github.com/amelvill-umich/Fortran_GDB_Common#with-that-line-commented-out ,
with the common statement commented out:

    $ nm somemodule.o
    B __somemodule_MOD_modulevar

    $ gdb main
    GNU gdb (Ubuntu 9.1-0ubuntu1) 9.1
    Copyright (C) 2020 Free Software Foundation, Inc.
    (...)
    Reading symbols from main...
    (gdb) b 9
    Breakpoint 1 at 0x1238: file main.f90, line 9.
    (gdb) r
    Starting program: /home/me/a_dev/fortran_gdb_common/main
     moduleVar=123

    Breakpoint 1, myprogram () at main.f90:9
    9       end program
    (gdb) call moduleVar
    $1 = 123

It seems like gfortran emits symbols of the form __<module>_MOD_<variable> if
the common statement is omitted. Could the compiler emit something like this
for common statements, like __<common block>_MOD_<variable>?

In my test program's case, could it emit __othermodule_MOD_moduleVar? Would
there be issues with name collisions if there actually was a module named
othermodule (in addition to common)?

To be fair, I have not read the DWARF spec in depth, and "this is undefined
behavior" is a fair answer.
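As a rough illustration of the C-side arrangement described in this comment, a struct mirroring the common block might look like the sketch below. Everything except the names othermodule and moduleVar is an assumption: the struct tag othermodule_layout, the second member secondVar, and the accessor get_module_var are hypothetical stand-ins, since the real variable list is not shown in the thread. Fixed-width types and compile-time checks are used here instead of "int" to make the layout assumptions explicit.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical mirror of common /othermodule/. Only moduleVar comes from
 * the report; secondVar stands in for the "many more variables". */
struct othermodule_layout {
    int64_t moduleVar;  /* integer*8 on the Fortran side */
    int32_t secondVar;  /* assumed integer*4, for illustration only */
};

/* Fail the build if the compiler places the members somewhere other than
 * where this sketch assumes they are. */
static_assert(sizeof(int64_t) == 8, "int64_t must be 8 bytes");
static_assert(offsetof(struct othermodule_layout, moduleVar) == 0,
              "moduleVar must start the block");
static_assert(offsetof(struct othermodule_layout, secondVar) == 8,
              "secondVar must follow moduleVar directly");

/* With bind(C, name="othermodule") on the Fortran side, the storage would
 * be supplied at link time by the Fortran object file. */
extern struct othermodule_layout othermodule;

/* Hypothetical accessor; takes a pointer so it can be tried on any block. */
static int64_t get_module_var(const struct othermodule_layout *blk)
{
    return blk->moduleVar;
}
```

The static_assert lines do not make the interface portable by themselves; they only turn a silent layout mismatch into a compile error on the C side.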
[Bug fortran/96158] Debug symbols not emitted for module common variables
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=96158

--- Comment #11 from AJM ---
Thanks for all your suggestions, they're very helpful!

>'-falign-commons'
> By default, 'gfortran' enforces **proper** alignment of all variables
> in a 'COMMON' block by padding them as needed...

I am not sure what "proper" means in this context: proper for which language,
which architecture? All of them? What's "proper" for AMD64 isn't proper for
i386, and what's proper for Fortran might not be proper for C, right? What
about byte endianness?

> My guess would be that a common block
> is simply considered to be a chunk of memory under
> DWARF5.
> It provides a base address and one needs to
> know how to compute offsets into that chunk (including
> dealing with any padding).

My tests seem to show that this is the case so far too.

It seems a bit weird to me because GCC is able to handle things like individual
int "extern" variables in C, which would seem to have a very similar issue
(memory stored in a different object file, in an ad-hoc manner with no
prescribed structure or packing as an interface).

The only thing that I can think of that's different is the fact that gfortran
generates a .mod file, which I'm guessing (?) is some kind of binary interface
definition (like a precompiled header?). I'm really not sure what it does, but
there is no .mod file generated for "othermodule", only for somemodule (which
matches the file name).

At one point I was just wondering if all I needed to do was generate a .mod
file for othermodule somehow (the common 'module'), but I noticed that the
symbols were stored directly in the .o file, and I started to wonder if the
.mod file served any purpose at all.

> the Fortran processor (in this case gfortran)
> and its companion C processor (gcc) have the same conventions regarding
> layout of variables etc.

I was hoping this was the case, thanks for confirming that.
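To make the "base address plus offsets, including padding" picture quoted above concrete, here is a small C sketch that computes member offsets under one common rule: each member starts at the next multiple of its own alignment. That rule is an assumption made for illustration (it is the usual C struct convention on x86-64); whether it matches what -falign-commons does for a particular common block is not confirmed in this thread. The names align_up and layout_offsets are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* Round `offset` up to the next multiple of `align`.
 * `align` must be a non-zero power of two. */
static size_t align_up(size_t offset, size_t align)
{
    assert(align != 0 && (align & (align - 1)) == 0);
    return (offset + align - 1) & ~(align - 1);
}

/* Given the size and alignment of each member in declaration order,
 * store each member's starting offset in `offsets` and return the
 * number of bytes used (trailing padding is not included). */
static size_t layout_offsets(const size_t *sizes, const size_t *aligns,
                             size_t count, size_t *offsets)
{
    size_t cursor = 0;

    for (size_t i = 0; i < count; ++i) {
        cursor = align_up(cursor, aligns[i]); /* insert padding if needed */
        offsets[i] = cursor;
        cursor += sizes[i];
    }
    return cursor;
}
```

For example, a 4-byte member followed by an 8-byte member lands the second one at offset 8 rather than 4 under this rule, so a debugger or C struct that assumed tight packing would read the wrong bytes.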

> Maybe, if you show the code above to whoever's in charge, that might
> cause that person to reconsider :-)

We've discussed it at great length, these sorts of things have to happen in
stages I think. I have informed them of the risks and downsides of not doing
this now (I have been detailed and clear in my findings, including possible
portability / reliability problems and UB problems, additional time spent doing
maintenance, and now also debugging issues).

They have decided to accept the risks and inefficiencies. Given the
circumstances a rewrite is not on the table; the decision is final and my job
now is to support them as best I can. It's hard to allocate limited resources,
there is no lack of work here. That's about all I can say on that I guess.

This is not the first time I have entered a job with a codebase that had UB, in
fact, I would say I have never NOT had a problem with UB at any place I've ever
worked. Personally, I work hard not to introduce more, fix it when I can, and
try and help coders avoid adding more.