On Tue, Nov 03, 2020 at 01:09:43PM -0800, H.J. Lu via Gcc-patches wrote:
> On Tue, Nov 3, 2020 at 1:00 PM H.J. Lu <hjl.to...@gmail.com> wrote:
> >
> > On Tue, Nov 3, 2020 at 12:46 PM Jozef Lawrynowicz
> > <joze...@mittosystems.com> wrote:
> > >
> > > On Tue, Nov 03, 2020 at 11:58:04AM -0800, H.J. Lu via Gcc-patches wrote:
> > > > On Tue, Nov 3, 2020 at 10:22 AM Jozef Lawrynowicz
> > > > <joze...@mittosystems.com> wrote:
> > > > >
> > > > > On Tue, Nov 03, 2020 at 09:57:58AM -0800, H.J. Lu via Gcc-patches 
> > > > > wrote:
> > > > > > On Tue, Nov 3, 2020 at 9:41 AM Jozef Lawrynowicz
> > > > > > <joze...@mittosystems.com> wrote:
> > > > > > >
> > > > > > > The attached patch implements TARGET_ASM_MARK_DECL_PRESERVED for 
> > > > > > > ELF GNU
> > > > > > > OSABI targets, so that declarations that have the "used" attribute
> > > > > > > applied will be saved from linker garbage collection.
> > > > > > >
> > > > > > > TARGET_ASM_MARK_DECL_PRESERVED will emit an assembler ".retain"
> > > > > >
> > > > > > Can you use the "R" flag instead?
> > > > > >
> > > > >
> > > > > For the benefit of this mailing list, I have copied my response from 
> > > > > the
> > > > > Binutils mailing list regarding this.
> > > > > The "comm_section" example I gave is actually innacurate, but you can
> > > > > see the examples of the variety of sections that would need to be
> > > > > handled by doing
> > > > >
> > > > > $ git grep -A2 "define.*SECTION_ASM_OP" gcc/ | grep "\".*\."
> > > > >
> > > > > > ... snip ...
> > > > > > Secondly, for seamless integration with the "used" attribute, we 
> > > > > > must be
> > > > > > able to to mark the symbol with the used attribute applied as 
> > > > > > "retained"
> > > > > > without changing its section name. For GCC "named" sections, this is
> > > > > > straightforward, but for "unnamed" sections it is a giant mess.
> > > > > >
> > > > > > The section name for a GCC "unnamed" section is not readily 
> > > > > > available,
> > > > > > instead a string which contains the full assembly code to switch to 
> > > > > > one
> > > > > > of these text/data/bss/rodata/comm etc. sections is encoded in the
> > > > > > structure.
> > > > > >
> > > > > > Backends define the assembly code to switch to these sections (some
> > > > > > "*ASM_OP*" macro) in a variety of ways. For example, the unnamed 
> > > > > > section
> > > > > > "comm_section", might correspond to a .bss section, or emit a .comm
> > > > > > directive. I even looked at trying to parse them to extract what the
> > > > > > name of a section will be, but it would be very messy and not 
> > > > > > robust.
> > > > > >
> > > > > > Meanwhile, having a .retain <symbol_name> directive is a very 
> > > > > > simmple
> > > > > > solution, and keeps the GCC implementation really concise (patch
> > > > > > attached). The assembler will know for sure what the section 
> > > > > > containing
> > > > > > the symbol will be, and can apply the SHF_GNU_RETAIN flag directly.
> > > > > >
> > > >
> > > > Please take a look at
> > > >
> > > > https://gitlab.com/x86-gcc/gcc/-/commits/users/hjl/elf/shf_retain
> > > >
> > > > which is built in top of
> > > >
> > > > https://gcc.gnu.org/pipermail/gcc-patches/2020-February/539963.html
> > > >
> > > > I think SECTION2_RETAIN matches SHF_GNU_RETAIN well.  If you
> > > > want, you extract my flags2 change and use it for SHF_GNU_RETAIN.
> > >
> > > In your patch you have to make the assumption that data_section, always
> > > corresponds to a section named .data. For just this example, c6x (which
> > > supports the GNU ELF OSABI) does not fit the rule:
> > >
> > > > c6x/elf-common.h:#define DATA_SECTION_ASM_OP 
> > > > "\t.section\t\".fardata\",\"aw\""
> > >
> > > data_section for c6x corresponds to .fardata, not .data. So the use of
> > > "used" on a data declaration would place it in a different section, that
> > > if the "used" attribute was not applied.
> > >
> > > For c6x and mips, readonly_data_section does not correspond to .rodata,
> > > so that assumption cannot be made either:
> > > > c6x/elf-common.h:#define READONLY_DATA_SECTION_ASM_OP 
> > > > "\t.section\t\".const\",\"a\",@progbits"
> > > > mips/mips.h:#define READONLY_DATA_SECTION_ASM_OP        "\t.rdata"      
> > > > /* read-only data */
> > >
> > > The same can be said for bss_section for c6x as well.
> >
> > Just add and use named_xxx_section.
> >

I guess new macros for targets that use non-standard names in unnamed
sections could work.

  c6x/elf-common.h:
    #define READONLY_DATA_SECTION_NAME ".const"
    #define READONLY_DATA_SECTION_ASM_OP 
"\t.section\t\".const\",\"a\",@progbits"

> > > Furthermore, this is only considering the examples in
> > > default_elf_select_section - the less standard unnamed section are used
> > > in many backend's implementation of select_section, and we would need to
> > > work out what section name they correspond to to properly support
> > > SHF_GNU_RETAIN.
> > >
> > > For every unnamed section, you either have to assume what the
> > > corresponding section name is, or parse the associated assembly output
> > > string for the section.
> >
> > My change is just  an example to show how it can be done, not a complete 
> > one.
> >
> > > Given these edge cases which must be handled for GCC to robustly emit
> > > the "R" flag for sections containing "used" symbols, surely it is
> > > preferable to leverage the existing TARGET_ASM_MARK_DECL_PRESERVED and
> > > emit a .retain <symname> directive, which is extremely simple and
> > > doesn't require any handling of these edge cases and non-standard
> > > backend implementations.
> >
> > It is used to update the symbol table.  Other usage is abuse.
> >

We can't update the symbol table because there is no room left in ELF
for new symbol flags. But the least we can do is convey the requirement
for a *symbol* to be retained from the compiler to the assembler. How
the assembler communicates to the linker that a symbol or section must
be retained is between those two programs and the object file format.

In this case we must use the section flag SHF_GNU_RETAIN. It is not
ideal, but this appears to be the best vehicle for communicating the
requirement to retain a symbol from the assembler to the linker.

Theoretically if we could set an ELF symbol flag for a symbol being
"retained", then we would use a ".retain" directive. The fact we
can't actually communicate the requirement for a symbol to be retained
from the assembler to the linker, because of limitations of the object
file format, is not related to the output of the compiler (IMO).

The following also seems like abuse to me, we can't change set/unset
other flags between .section directives for the same section, but we can
for SHF_GNU_RETAIN?

    .text
  main:
    ...
    .section .text,"axR"
  retained_fn:
    ...
    .section .text,"ax"
  do_stuff:
    ...

There would be further debate about whether the assembler should create
separate sections for the about .text, or merge them into one with
logical OR of the flags. I never did make my mind up about which is
best. Merging is closest to what would happen if we could communicate
"retain" as a property of the symbol.

> 
> For
> 
> [hjl@gnu-cfl-2 gcc]$ cat /tmp/x.c
> static int xyzzy __attribute__((__used__)) = 1;
> int foo[0x40000];
> [hjl@gnu-cfl-2 gcc]$
> 
> foo should be removed by ld --gc-sections if it is unreferenced.  But your
> patch makes it impossible.

The same could be said if:
  static int xyzzy = 1;
  int foo[0x40000];

  int
  main (void)
  {
    while (xxyzzy);
  }

foo would be unnecessarily retained. If the user wants optimal operation
of garbage collection, they should use -ffunction/data-sections, the
"section" attribute on xyyzzy, or place foo in another file.

On a related note, in relocatable output from the linker, unique section
names .text.XXXX appear to normally be unhandled in linker scripts for
ld -r, so they will not get merged into an output section, and can still
be properly garbage collected when linking the final executable file.

Thanks,
Jozef

> 
> -- 
> H.J.

Reply via email to