[Bug fortran/96158] New: Symbols not emitted for module common variables

2020-07-10 Thread amelvill at umich dot edu
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=96158

            Bug ID: 96158
           Summary: Symbols not emitted for module common variables
           Product: gcc
           Version: 9.3.0
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: fortran
          Assignee: unassigned at gcc dot gnu.org
          Reporter: amelvill at umich dot edu
  Target Milestone: ---

Created attachment 48859
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=48859&action=edit
A tarball archive demonstrating the problem

I am trying to debug a Fortran program that uses common statements like this:

module somemodule

integer*8 moduleVar

!Comment out the line below to make "call moduleVar" start working again
common /othermodule/  moduleVar

end module

... but it seems like gfortran does not emit debugging symbols that GDB can
use.

See this github repo for a minimal example: 
https://github.com/amelvill-umich/Fortran_GDB_Common

I have also attached an archive of the sources (with a README that goes into
more detail, including GDB commands and output), if you prefer.

- With the common statement, I can access the variable in Fortran code, but GDB
is not able to read its value.
- Manually inspecting somemodule.o with a hex editor, I notice that the
.debug_str section seems to contain information that might identify the common
variable, but it doesn't appear to be used.

I almost filed a bug report with GDB about this, until I noticed that gfortran
emits only othermodule_ as a symbol for the othermodule common block, which
didn't seem right. It doesn't seem like GDB would be able to provide very
useful information with only that as a symbol.

If this is the expected behavior, please let me know if you have any
suggestions for debugging variables like this (or if you think this is a bug
or limitation in GDB).

Additional information:

$ gfortran -v
Using built-in specs.
COLLECT_GCC=gfortran
COLLECT_LTO_WRAPPER=/usr/lib/gcc/x86_64-linux-gnu/9/lto-wrapper
OFFLOAD_TARGET_NAMES=nvptx-none:hsa
OFFLOAD_TARGET_DEFAULT=1
Target: x86_64-linux-gnu
Configured with: ../src/configure -v --with-pkgversion='Ubuntu 9.3.0-10ubuntu2'
--with-bugurl=file:///usr/share/doc/gcc-9/README.Bugs
--enable-languages=c,ada,c++,go,brig,d,fortran,objc,obj-c++,gm2 --prefix=/usr
--with-gcc-major-version-only --program-suffix=-9
--program-prefix=x86_64-linux-gnu- --enable-shared --enable-linker-build-id
--libexecdir=/usr/lib --without-included-gettext --enable-threads=posix
--libdir=/usr/lib --enable-nls --enable-clocale=gnu --enable-libstdcxx-debug
--enable-libstdcxx-time=yes --with-default-libstdcxx-abi=new
--enable-gnu-unique-object --disable-vtable-verify --enable-plugin
--enable-default-pie --with-system-zlib --with-target-system-zlib=auto
--enable-objc-gc=auto --enable-multiarch --disable-werror --with-arch-32=i686
--with-abi=m64 --with-multilib-list=m32,m64,mx32 --enable-multilib
--with-tune=generic --enable-offload-targets=nvptx-none,hsa
--without-cuda-driver --enable-checking=release --build=x86_64-linux-gnu
--host=x86_64-linux-gnu --target=x86_64-linux-gnu
Thread model: posix
gcc version 9.3.0 (Ubuntu 9.3.0-10ubuntu2) 

The Makefile in the tarball/repository did not have -Wall -Wextra, but adding
these did not produce any additional information (gfortran reported no errors
or warnings at all):
$ make
gfortran -c somemodule.f90 -g -Wall -Wextra
gfortran -c main.f90 -g -Wall -Wextra
gfortran -o main main.o somemodule.o -g -Wall -Wextra

I'm not aware of a way to reproduce this bug with a single file.

I mentioned this on the gfortran mailing list and somebody recommended that I
post this here.

Thanks,

- ajm

[Bug fortran/96158] Symbols not emitted for module common variables

2020-07-13 Thread amelvill at umich dot edu
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=96158

--- Comment #4 from AJM  ---
>> I won't comment on the questionable programming idiom of placing
>> a common block in a module, which kind of defeats the niceties of
>> a module.
> If somebody wants to transition your code from using common blocks to
> modules, that is a good way to proceed.   When all the direct usage
> of the common block have been removed, you can then remove the
> COMMON statement from the module.

This is the case, more or less. I didn't write the code that did this.

I would be quite happy to see the common blocks get moved to a module, but to
make things a bit more dangerous on that side, these variables are bound to a C
variable.

If I could find a way to move the common statements to a module that was
guaranteed to have no issues whatsoever (UB, data alignment, etc., including on
the C side), I would be happy to do it. 

However, I haven't found enough information on this to feel confident about it
(there are significant gaps in the documentation I've found), so it seems
irresponsible to risk introducing bugs for the sake of getting GDB working.

[Bug fortran/96158] Symbols not emitted for module common variables

2020-07-13 Thread amelvill at umich dot edu
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=96158

--- Comment #5 from AJM  ---
Also, in case it wasn't clear,

> Breakpoint 2, boo () at a.f90:9
> 9  write(*, '(A, I3)') "moduleVar=", n
> (gdb) p n
> $2 = 123
> (gdb) p moduleVar
> No symbol "moduleVar" in current context.
> (gdb) p (integer*8)othermodule_
> $3 = 123

I did find this as a workaround (it's in the repo README.md 
https://github.com/amelvill-umich/Fortran_GDB_Common , if you haven't seen it),
but this is not ideal, because there are hundreds of common statement variables
in the module in question, of varying size and type. They're all reduced to a
single othermodule_ variable.

You're right that I could hypothetically find the address of each variable,
apply an offset, and cast. I could also make some kind of memory-map type to
cast that variable to, but that would be a giant amount of duplicated code and
an extra thing to keep in sync whenever something is added.

As far as workarounds go, if it came to that I'd rather just make a dummy
"debug" function that stored these common variables as a local variable.

[Bug fortran/96158] Debug symbols not emitted for module common variables

2020-07-13 Thread amelvill at umich dot edu
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=96158

--- Comment #8 from AJM  ---
> > >> I won't comment on the questionable programming idiom of placing
> > >> a common block in a module, which kind of defeats the niceties of
> > >> a module.
> > > If somebody wants to transition your code from using common blocks to
> > > modules, that is a good way to proceed.   When all the direct usage
> > > of the common block have been removed, you can then remove the
> > > COMMON statement from the module.
> > 
> > This is the case, more or less. I didn't write the code that did this.
> > 
> > I would be quite happy to see the common blocks get moved to a module,
> > but to make things a bit more dangerous on that side, these variables
> > are bound to a C variable.
> Do you mean the variable is declared with BIND(C)?
> If so, you should be getting an error.

The bind statement would look like this, directly under the "common" statement,
inside somemodule.f90:

module somemodule

integer*8 moduleVar !then many more variables

common /othermodule/  moduleVar !then many more variables
bind(C, name="othermodule") :: /othermodule/

end module

The program, with the bind statement added, compiles without errors. 

If you really need to know, on the C side there is a struct with fields that
match the order and size of the variables in the common statement / module
declaration. I am almost certain that this is not the "right way" to do this,
and that there is some UB there in struct packing and in compiler decisions on
the size of the variables (neither the C side nor the Fortran side has packing
pragmas, and worse, the C side uses compiler-defined integer types like "int").
Without a doubt this is not an ideal setup, and a rewrite is probably in order
to ensure portability. We don't have the time or resources for that rewrite
right now, though (and that's not my call, either).
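
As a minimal sketch (not the actual project code) of how the C side could at
least pin down the layout it assumes, the struct can use fixed-width types and
compile-time checks. Only moduleVar comes from the example above; the names
othermodule_block, second_var and read_module_var are hypothetical, and the
expected offsets are an assumption about what the Fortran side does.

```c
#include <assert.h>   /* static_assert (C11) */
#include <stddef.h>   /* offsetof */
#include <stdint.h>   /* int64_t, int32_t */

/* Hypothetical C mirror of the Fortran common block /othermodule/.
 * Only module_var (integer*8 moduleVar) comes from the example;
 * second_var is an invented stand-in for "many more variables". */
struct othermodule_block {
    int64_t module_var;   /* integer*8 moduleVar */
    int32_t second_var;   /* hypothetical integer*4 variable */
};

/* Stop the build if the compiler lays the struct out differently
 * from the offsets the Fortran side is assumed to use. */
static_assert(sizeof(int64_t) == 8, "int64_t must be 8 bytes");
static_assert(offsetof(struct othermodule_block, module_var) == 0,
              "module_var must start the block");
static_assert(offsetof(struct othermodule_block, second_var) == 8,
              "second_var must directly follow module_var");

/* In the real program this storage would come from the Fortran object
 * file (and be declared extern here); it is defined in C only so that
 * the sketch links on its own. */
struct othermodule_block othermodule;

/* Read the shared variable through the mirrored layout. */
int64_t read_module_var(void) {
    return othermodule.module_var;
}
```

Replacing "int" with fixed-width types removes one source of guesswork, and the
static_assert lines turn a silent layout mismatch into a build failure. They do
not prove that the Fortran side uses the same offsets.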

While I was making a minimal example I originally thought bind was the root
cause, but I eliminated it as the source of the problem (I found that I could
reproduce the issue with only common). I removed it to try and make the most
minimal example possible. 

This C side explanation is, in my opinion, not relevant at all to the bug at
hand and is our team's responsibility, not gfortran's. It's just to give you
context as to what I'm working with here and why "just make a module" is not
really an option. I would consider the C interfacing very precarious, and I
personally only have the expertise to make it portable from the C side, not the
Fortran side (though I'm trying to get up to speed on the latter).

Though, if anyone has any suggestions for documentation I can read about how to
avoid problems between g++ and gfortran that might come about from moving this
to a module, I would appreciate it. I'm hoping that eventually I can come up
with a path towards making this a module that would exactly replicate the
original's intended behavior in a standards compliant way without breaking
anything. That path is not clear, at the moment.

> A quick scan of the DWARF5 standard does not show
> a DW_TAG that applies to a variable declared in 
> a module.

Referring quickly to the GitHub repo
(https://github.com/amelvill-umich/Fortran_GDB_Common#with-that-line-commented-out),
with the common statement commented out:

$ nm somemodule.o
 B __somemodule_MOD_modulevar

$ gdb main
GNU gdb (Ubuntu 9.1-0ubuntu1) 9.1
Copyright (C) 2020 Free Software Foundation, Inc.
(...)
Reading symbols from main...
(gdb) b 9
Breakpoint 1 at 0x1238: file main.f90, line 9.
(gdb) r
Starting program: /home/me/a_dev/fortran_gdb_common/main 
moduleVar=123

Breakpoint 1, myprogram () at main.f90:9
9   end program
(gdb) call moduleVar
$1 = 123

It seems like gfortran emits symbols of the form __<module>_MOD_<variable>
if the common statement is omitted. Could the compiler emit something like this
for common statements, such as __<common block>_MOD_<variable>?

In my test program's case, could it emit __othermodule_MOD_moduleVar? Would
there be issues with name collisions if there actually was a module named
othermodule (in addition to common)?

To be fair, I have not read the DWARF spec in depth, and "this is undefined
behavior" is a fair answer.

[Bug fortran/96158] Debug symbols not emitted for module common variables

2020-07-13 Thread amelvill at umich dot edu
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=96158

--- Comment #11 from AJM  ---
Thanks for all your suggestions, they're very helpful!

>'-falign-commons'
> By default, 'gfortran' enforces **proper** alignment of all variables
> in a 'COMMON' block by padding them as needed...

I am not sure what "proper" means in this context. Proper for which language
and which architecture? All of them? What's "proper" for AMD64 isn't proper for
i386, and what's proper for Fortran might not be proper for C, right? What
about byte endianness?

> My guess would be that a common block
> is simply considered to be a chunk of memory under
> DWARF5.
> It provides a base address and one needs to
> know how to compute offsets into that chunk (including
> dealing with any padding).

My tests seem to show that this is the case so far too. It seems a bit weird to
me because GCC is able to handle things like individual int "extern" variables
in C, which would seem to have a very similar issue (memory stored in a
different object file, in an ad-hoc manner with no prescribed structure or
packing as an interface).
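
The offset arithmetic described in the quoted comment can be sketched in C.
This assumes natural alignment (each variable placed at a multiple of its own
size, with sizes that are powers of two). That is an assumption for
illustration only, not a statement of what gfortran guarantees. The helper
names align_up and layout_offsets are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* Round offset up to the next multiple of alignment.
 * alignment must be a non-zero power of two. */
size_t align_up(size_t offset, size_t alignment) {
    assert(alignment != 0 && (alignment & (alignment - 1)) == 0);
    return (offset + alignment - 1) & ~(alignment - 1);
}

/* Compute the byte offset of each variable in a block, assuming every
 * variable is aligned to its own size. sizes[i] is the size in bytes of
 * variable i; offsets[i] receives its offset from the block's base
 * address. Returns the offset just past the last variable. */
size_t layout_offsets(const size_t *sizes, size_t count, size_t *offsets) {
    size_t cursor = 0;
    for (size_t i = 0; i < count; i++) {
        cursor = align_up(cursor, sizes[i]);  /* insert padding if needed */
        offsets[i] = cursor;
        cursor += sizes[i];
    }
    return cursor;
}
```

For variables of 4, 8 and 2 bytes, in that order, this gives offsets 0, 8 and
16: four bytes of padding are inserted before the 8-byte variable. Each offset
would then be added to the block's base address before casting in the debugger.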

The only thing that I can think of that's different is the fact that gfortran
generates a .mod file, which I'm guessing (?) is some kind of binary interface
definition (like a precompiled header?); I'm really not sure what it does, but
there is no .mod file generated for "othermodule", only for somemodule (which
matches the file name).

At one point I was just wondering if all I needed to do was generate a .mod
file for othermodule somehow (the common 'module'), but I noticed that the
symbols were stored directly in the .o file, and I started to wonder if the
.mod file served any purpose at all.

>  the Fortran processor (in this case gfortran)
> and its companion C processor (gcc) have the same conventions regarding
> layout of variables etc.

I was hoping this was the case, thanks for confirming that.

> Maybe, if you show the code above to whoever's in charge, that might
> cause that person to reconsider :-)

We've discussed it at great length. These sorts of things have to happen in
stages, I think.

I have informed them of the risks and downsides of not doing this now (I have
been detailed and clear in my findings, including possible portability /
reliability problems and UB problems, additional time spent doing maintenance,
and now also debugging issues). They have decided to accept the risks and
inefficiencies.

Given the circumstances, a rewrite is not on the table; the decision is final
and my job now is to support them as best I can. It's hard to allocate limited
resources, and there is no lack of work here. That's about all I can say on
that, I guess. This is not the first time I have entered a job with a codebase
that had UB; in fact, I would say I have never NOT had a problem with UB at any
place I've ever worked. Personally, I work hard not to introduce more, fix it
when I can, and try to help coders avoid adding more.