On Sat, 08 Feb 2025 00:33:37 PST (-0800), Nelson Chu wrote:
I got an request about the undefined behaviors, considering the following case,

$ cat test.c
void main ()
{
  foo();
}
$ cat lib.h
void foo(void);
$ riscv64-unknown-linux-gnu-gcc test.c
riscv64-unknown-linux-gnu/bin/ld: /tmp/ccRO8fJl.o: in function `main':
test.c:(.text+0x8): undefined reference to `foo'
collect2: error: ld returned 1 exit status
$ riscv64-unknown-linux-gnu-gcc test.c 
-Wl,--unresolved-symbols=ignore-in-object-files
$ qemu-riscv64 a.out
Segmentation fault (core dumped)

Testing with x86 and aarch64, they won't get the segfault since they go plt
for the undefined foo symbol.  So, after applying this patch, I can get the
following too,

$ qemu-riscv64 a.out
a.out: symbol lookup error: a.out: undefined symbol: foo

The change of this patch should only affect the call behavior, which refer
to an undefined (weak) symbol, when building an dynamic executable.  I think
the pic/pie behavior won't be affected as usual.  I am not sure if the change
will cause trouble or not for other projects, so please feels free to cc people
that you think they will be affected, thanks.

Thanks for doing this. For some more context, there's a handful of Go plugins that seem to want `-Wl,--export-dynamic -Wl,--unresolved-symbols=ignore-in-object-files` to result in executables that have PLT-indirect calls to these undefined symbols, which they'll then later LD_PRELOAD or dlopen() to resolve. Here's an example <https://github.com/NVIDIA/nvidia-container-toolkit/blob/b19f5d8f7d76f8ec05534aadfc0c3641bd281d55/internal/dxcore/dxcore.go#L20>.

IMO that's pretty far into the realm of undefined behavior, and from poking around a bit it looks like that's the case even on x86 -- basically if my symbol gets linked before something has triggered a PLT to be created, then I end up with a direct reference that isn't dynamically resolvable.

It also looks like arm64 generates NOPs (rather than calls to absolute 0) on these undefined symbols, so it's possible some instances of these are just crashing lazily. THere might be some context floating around in 7cd2917227 ("aarch64: Return an error on conditional branch to an undefined symbol"), it's a bit hard to follow so I'm not sure if that's an intentional side-effect or just the easiest arbitrary thing to generate for this flavor of undefined behavior.

So I'm kind of split on what we should do here: in general I like to have undefined behavior crash eagerly, as otherwise we're just making these issues harder to debug. That said, this has blown up internally and making it crash lazily will make the fire go out, and it'd be really nice to start a Monday morning with more fires going out than starting...

Maybe there's some more context floating around in someone's brain about this?

---
 bfd/elfnn-riscv.c | 84 +++++++++++++++++++++++++----------------------
 1 file changed, 44 insertions(+), 40 deletions(-)

diff --git a/bfd/elfnn-riscv.c b/bfd/elfnn-riscv.c
index 57ced95fdb3..bca3a585f56 100644
--- a/bfd/elfnn-riscv.c
+++ b/bfd/elfnn-riscv.c
@@ -2263,6 +2263,7 @@ riscv_elf_relocate_section (bfd *output_bfd,
       reloc_howto_type *howto = riscv_elf_rtype_to_howto (input_bfd, r_type);
       const char *msg = NULL;
       bool resolved_to_zero;
+      bool via_plt = false;

       if (howto == NULL)
        continue;
@@ -2565,6 +2566,12 @@ riscv_elf_relocate_section (bfd *output_bfd,
       resolved_to_zero = (h != NULL
                          && UNDEFWEAK_NO_DYNAMIC_RELOC (info, h));

+      /* Refer to the PLT entry.  This check has to match the check in
+        _bfd_riscv_relax_section.  */
+      via_plt = (htab->elf.splt != NULL
+                && h != NULL
+                && h->plt.offset != MINUS_ONE);
+
       switch (r_type)
        {
        case R_RISCV_NONE:
@@ -2776,8 +2783,7 @@ riscv_elf_relocate_section (bfd *output_bfd,
        case R_RISCV_CALL_PLT:
          /* Handle a call to an undefined weak function.  This won't be
             relaxed, so we have to handle it here.  */
-         if (h != NULL && h->root.type == bfd_link_hash_undefweak
-             && (!bfd_link_pic (info) || h->plt.offset == MINUS_ONE))
+         if (h->root.type == bfd_link_hash_undefweak && !via_plt)
            {
              /* We can use x0 as the base register.  */
              bfd_vma insn = bfd_getl32 (contents + rel->r_offset + 4);
@@ -2791,42 +2797,40 @@ riscv_elf_relocate_section (bfd *output_bfd,

        case R_RISCV_JAL:
        case R_RISCV_RVC_JUMP:
-         if (bfd_link_pic (info) && h != NULL)
+         if (via_plt)
            {
-             if (h->plt.offset != MINUS_ONE)
-               {
-                 /* Refer to the PLT entry.  This check has to match the
-                    check in _bfd_riscv_relax_section.  */
-                 relocation = sec_addr (htab->elf.splt) + h->plt.offset;
-                 unresolved_reloc = false;
-               }
-             else if (!SYMBOL_REFERENCES_LOCAL (info, h)
-                      && (input_section->flags & SEC_ALLOC) != 0
-                      && (input_section->flags & SEC_READONLY) != 0
-                      && ELF_ST_VISIBILITY (h->other) == STV_DEFAULT)
-               {
-                 /* PR 28509, when generating the shared object, these
-                    referenced symbols may bind externally, which means
-                    they will be exported to the dynamic symbol table,
-                    and are preemptible by default.  These symbols cannot
-                    be referenced by the non-pic relocations, like
-                    R_RISCV_JAL and R_RISCV_RVC_JUMP relocations.
-
-                    However, consider that linker may relax the R_RISCV_CALL
-                    relocations to R_RISCV_JAL or R_RISCV_RVC_JUMP, if
-                    these relocations are relocated to the plt entries,
-                    then we won't report error for them.
-
-                    Perhaps we also need the similar checks for the
-                    R_RISCV_BRANCH and R_RISCV_RVC_BRANCH relocations.  */
-                 msg = bfd_asprintf (_("%%X%%P: relocation %s against `%s'"
-                                       " which may bind externally"
-                                       " can not be used"
-                                       " when making a shared object;"
-                                       " recompile with -fPIC\n"),
-                                     howto->name, h->root.root.string);
-                 r = bfd_reloc_notsupported;
-               }
+             relocation = sec_addr (htab->elf.splt) + h->plt.offset;
+             unresolved_reloc = false;
+           }
+         else if (bfd_link_pic (info)
+                  && h != NULL
+                  && h->plt.offset == MINUS_ONE
+                  && !SYMBOL_REFERENCES_LOCAL (info, h)
+                  && (input_section->flags & SEC_ALLOC) != 0
+                  && (input_section->flags & SEC_READONLY) != 0
+                  && ELF_ST_VISIBILITY (h->other) == STV_DEFAULT)
+           {
+             /* PR 28509, when generating the shared object, these
+                referenced symbols may bind externally, which means
+                they will be exported to the dynamic symbol table,
+                and are preemptible by default.  These symbols cannot
+                be referenced by the non-pic relocations, like
+                R_RISCV_JAL and R_RISCV_RVC_JUMP relocations.
+
+                However, consider that linker may relax the R_RISCV_CALL
+                relocations to R_RISCV_JAL or R_RISCV_RVC_JUMP, if
+                these relocations are relocated to the plt entries,
+                then we won't report error for them.
+
+                Perhaps we also need the similar checks for the
+                R_RISCV_BRANCH and R_RISCV_RVC_BRANCH relocations.  */
+             msg = bfd_asprintf (_("%%X%%P: relocation %s against `%s'"
+                                   " which may bind externally"
+                                   " can not be used"
+                                   " when making a shared object;"
+                                   " recompile with -fPIC\n"),
+                                 howto->name, h->root.root.string);
+             r = bfd_reloc_notsupported;
            }
          break;

@@ -5365,9 +5369,9 @@ _bfd_riscv_relax_section (bfd *abfd, asection *sec,
              undefined_weak = true;
            }

-         /* This line has to match the check in riscv_elf_relocate_section
-            in the R_RISCV_CALL[_PLT] case.  */
-         if (bfd_link_pic (info) && h->plt.offset != MINUS_ONE)
+         /* This line has to match the via_pltcheck in
+            riscv_elf_relocate_section in the R_RISCV_CALL[_PLT] case.  */
+         if (h->plt.offset != MINUS_ONE)
            {
              sym_sec = htab->elf.splt;
              symval = h->plt.offset;

Reply via email to