Looks like we could give it a try and see if it works and won't affect
current projects.  I will commit it before this weekend if there are no
objections.

Thanks
Nelson

On Tue, Feb 11, 2025 at 1:53 AM Palmer Dabbelt <pal...@rivosinc.com> wrote:

> On Sat, 08 Feb 2025 00:33:37 PST (-0800), Nelson Chu wrote:
> > I got an request about the undefined behaviors, considering the
> following case,
> >
> > $ cat test.c
> > void main ()
> > {
> >   foo();
> > }
> > $ cat lib.h
> > void foo(void);
> > $ riscv64-unknown-linux-gnu-gcc test.c
> > riscv64-unknown-linux-gnu/bin/ld: /tmp/ccRO8fJl.o: in function `main':
> > test.c:(.text+0x8): undefined reference to `foo'
> > collect2: error: ld returned 1 exit status
> > $ riscv64-unknown-linux-gnu-gcc test.c
> -Wl,--unresolved-symbols=ignore-in-object-files
> > $ qemu-riscv64 a.out
> > Segmentation fault (core dumped)
> >
> > Testing with x86 and aarch64, they won't get the segfault since they go
> plt
> > for the undefined foo symbol.  So, after applying this patch, I can get
> the
> > following too,
> >
> > $ qemu-riscv64 a.out
> > a.out: symbol lookup error: a.out: undefined symbol: foo
> >
> > The change of this patch should only affect the call behavior, which
> refer
> > to an undefined (weak) symbol, when building an dynamic executable.  I
> think
> > the pic/pie behavior won't be affected as usual.  I am not sure if the
> change
> > will cause trouble or not for other projects, so please feels free to cc
> people
> > that you think they will be affected, thanks.
>
> Thanks for doing this.  For some more context, there's a handful of Go
> plugins that seem to want `-Wl,--export-dynamic
> -Wl,--unresolved-symbols=ignore-in-object-files` to result in
> executables that have PLT-indirect calls to these undefined symbols,
> which they'll then later LD_PRELOAD or dlopen() to resolve.  Here's an
> example
> <
> https://github.com/NVIDIA/nvidia-container-toolkit/blob/b19f5d8f7d76f8ec05534aadfc0c3641bd281d55/internal/dxcore/dxcore.go#L20
> >.
>
> IMO that's pretty far into the realm of undefined behavior, and from
> poking around a bit it looks like that's the case even on x86 --
> basically if my symbol gets linked before something has triggered a PLT
> to be created, then I end up with a direct reference that isn't
> dynamically resolvable.
>
> It also looks like arm64 generates NOPs (rather than calls to absolute
> 0) on these undefined symbols, so it's possible some instances of these
> are just crashing lazily.  THere might be some context floating around
> in 7cd2917227 ("aarch64: Return an error on conditional branch to an
> undefined symbol"), it's a bit hard to follow so I'm not sure if that's
> an intentional side-effect or just the easiest arbitrary thing to
> generate for this flavor of undefined behavior.
>
> So I'm kind of split on what we should do here: in general I like to
> have undefined behavior crash eagerly, as otherwise we're just making
> these issues harder to debug.  That said, this has blown up internally
> and making it crash lazily will make the fire go out, and it'd be really
> nice to start a Monday morning with more fires going out than
> starting...
>
> Maybe there's some more context floating around in someone's brain about
> this?
>
> > ---
> >  bfd/elfnn-riscv.c | 84 +++++++++++++++++++++++++----------------------
> >  1 file changed, 44 insertions(+), 40 deletions(-)
> >
> > diff --git a/bfd/elfnn-riscv.c b/bfd/elfnn-riscv.c
> > index 57ced95fdb3..bca3a585f56 100644
> > --- a/bfd/elfnn-riscv.c
> > +++ b/bfd/elfnn-riscv.c
> > @@ -2263,6 +2263,7 @@ riscv_elf_relocate_section (bfd *output_bfd,
> >        reloc_howto_type *howto = riscv_elf_rtype_to_howto (input_bfd,
> r_type);
> >        const char *msg = NULL;
> >        bool resolved_to_zero;
> > +      bool via_plt = false;
> >
> >        if (howto == NULL)
> >       continue;
> > @@ -2565,6 +2566,12 @@ riscv_elf_relocate_section (bfd *output_bfd,
> >        resolved_to_zero = (h != NULL
> >                         && UNDEFWEAK_NO_DYNAMIC_RELOC (info, h));
> >
> > +      /* Refer to the PLT entry.  This check has to match the check in
> > +      _bfd_riscv_relax_section.  */
> > +      via_plt = (htab->elf.splt != NULL
> > +              && h != NULL
> > +              && h->plt.offset != MINUS_ONE);
> > +
> >        switch (r_type)
> >       {
> >       case R_RISCV_NONE:
> > @@ -2776,8 +2783,7 @@ riscv_elf_relocate_section (bfd *output_bfd,
> >       case R_RISCV_CALL_PLT:
> >         /* Handle a call to an undefined weak function.  This won't be
> >            relaxed, so we have to handle it here.  */
> > -       if (h != NULL && h->root.type == bfd_link_hash_undefweak
> > -           && (!bfd_link_pic (info) || h->plt.offset == MINUS_ONE))
> > +       if (h->root.type == bfd_link_hash_undefweak && !via_plt)
> >           {
> >             /* We can use x0 as the base register.  */
> >             bfd_vma insn = bfd_getl32 (contents + rel->r_offset + 4);
> > @@ -2791,42 +2797,40 @@ riscv_elf_relocate_section (bfd *output_bfd,
> >
> >       case R_RISCV_JAL:
> >       case R_RISCV_RVC_JUMP:
> > -       if (bfd_link_pic (info) && h != NULL)
> > +       if (via_plt)
> >           {
> > -           if (h->plt.offset != MINUS_ONE)
> > -             {
> > -               /* Refer to the PLT entry.  This check has to match the
> > -                  check in _bfd_riscv_relax_section.  */
> > -               relocation = sec_addr (htab->elf.splt) + h->plt.offset;
> > -               unresolved_reloc = false;
> > -             }
> > -           else if (!SYMBOL_REFERENCES_LOCAL (info, h)
> > -                    && (input_section->flags & SEC_ALLOC) != 0
> > -                    && (input_section->flags & SEC_READONLY) != 0
> > -                    && ELF_ST_VISIBILITY (h->other) == STV_DEFAULT)
> > -             {
> > -               /* PR 28509, when generating the shared object, these
> > -                  referenced symbols may bind externally, which means
> > -                  they will be exported to the dynamic symbol table,
> > -                  and are preemptible by default.  These symbols cannot
> > -                  be referenced by the non-pic relocations, like
> > -                  R_RISCV_JAL and R_RISCV_RVC_JUMP relocations.
> > -
> > -                  However, consider that linker may relax the
> R_RISCV_CALL
> > -                  relocations to R_RISCV_JAL or R_RISCV_RVC_JUMP, if
> > -                  these relocations are relocated to the plt entries,
> > -                  then we won't report error for them.
> > -
> > -                  Perhaps we also need the similar checks for the
> > -                  R_RISCV_BRANCH and R_RISCV_RVC_BRANCH relocations.  */
> > -               msg = bfd_asprintf (_("%%X%%P: relocation %s against
> `%s'"
> > -                                     " which may bind externally"
> > -                                     " can not be used"
> > -                                     " when making a shared object;"
> > -                                     " recompile with -fPIC\n"),
> > -                                   howto->name, h->root.root.string);
> > -               r = bfd_reloc_notsupported;
> > -             }
> > +           relocation = sec_addr (htab->elf.splt) + h->plt.offset;
> > +           unresolved_reloc = false;
> > +         }
> > +       else if (bfd_link_pic (info)
> > +                && h != NULL
> > +                && h->plt.offset == MINUS_ONE
> > +                && !SYMBOL_REFERENCES_LOCAL (info, h)
> > +                && (input_section->flags & SEC_ALLOC) != 0
> > +                && (input_section->flags & SEC_READONLY) != 0
> > +                && ELF_ST_VISIBILITY (h->other) == STV_DEFAULT)
> > +         {
> > +           /* PR 28509, when generating the shared object, these
> > +              referenced symbols may bind externally, which means
> > +              they will be exported to the dynamic symbol table,
> > +              and are preemptible by default.  These symbols cannot
> > +              be referenced by the non-pic relocations, like
> > +              R_RISCV_JAL and R_RISCV_RVC_JUMP relocations.
> > +
> > +              However, consider that linker may relax the R_RISCV_CALL
> > +              relocations to R_RISCV_JAL or R_RISCV_RVC_JUMP, if
> > +              these relocations are relocated to the plt entries,
> > +              then we won't report error for them.
> > +
> > +              Perhaps we also need the similar checks for the
> > +              R_RISCV_BRANCH and R_RISCV_RVC_BRANCH relocations.  */
> > +           msg = bfd_asprintf (_("%%X%%P: relocation %s against `%s'"
> > +                                 " which may bind externally"
> > +                                 " can not be used"
> > +                                 " when making a shared object;"
> > +                                 " recompile with -fPIC\n"),
> > +                               howto->name, h->root.root.string);
> > +           r = bfd_reloc_notsupported;
> >           }
> >         break;
> >
> > @@ -5365,9 +5369,9 @@ _bfd_riscv_relax_section (bfd *abfd, asection *sec,
> >             undefined_weak = true;
> >           }
> >
> > -       /* This line has to match the check in riscv_elf_relocate_section
> > -          in the R_RISCV_CALL[_PLT] case.  */
> > -       if (bfd_link_pic (info) && h->plt.offset != MINUS_ONE)
> > +       /* This line has to match the via_pltcheck in
> > +          riscv_elf_relocate_section in the R_RISCV_CALL[_PLT] case.  */
> > +       if (h->plt.offset != MINUS_ONE)
> >           {
> >             sym_sec = htab->elf.splt;
> >             symval = h->plt.offset;
>

Reply via email to