Ahmad Khalifa <ah...@khalifa.ws> writes:
> On 28/01/2025 22:28, Sam Hartman wrote:
>>>>>>> "Russ" == Russ Allbery <r...@debian.org> writes:

>>      Russ> recollection (it's been a *lot* of years so I'm hoping I'm
>>      Russ> getting this right) is that this interfered with proper
>>      Russ> symbol versioning and could cause the symbols to be resolved
>>      Russ> weirdly in a way that could cause serious bugs.

>> Yeah, and so the symbol versions would not be present, and so I think
>> we're back to needing to know what symbols are from libc6. A better
>> error if we knew how to generate it efficiently would be using a libc
>> symbol without a symbol version.

> dlopen() itself is versioned to libc on the calling object, but there is
> no record of what is passed to it aside from a floating string somewhere
> in the ro section. Unless you parse the source or disassemble the .so,
> you can't find out what is being called.

> In other words, lintian can't find out if the library loads printf() for
> example. But you can test it to see if it works correctly :)

I think we may still not be on quite the same page about what I understand
the problem to be, and what the Lintian check is trying to detect. The
Lintian check is not aimed at the binary doing the dlopen() call. It's
aimed at the *.so file that is the target of the dlopen() call.

Warning: lots of simplification and hand-waving here. This is not fully
accurate to how things work, but is hopefully close enough to get the idea
across.

When a shared object is loaded through dlopen(), its declared library
dependencies are loaded and its symbols are resolved against them, much as
happens for ordinary shared libraries at program startup. Any symbols that
remain unresolved after that are resolved from the binary that called
dlopen().

Historically we have relied on this behavior in, for example, Perl XS
modules which have unresolved references to the Perl interpreter symbols,
which can be resolved either from /usr/bin/perl (in the normal case) or
from libperl.so.5.40 (in the embedded Perl case). In other words, those
symbols are left *intentionally* unresolved. We do not want the module to
depend on libperl, because that would load libperl into /usr/bin/perl; we
want the module symbols to be resolved against /usr/bin/perl itself.
(There are various reasons for this that mostly come down to performance.)

The problem is that once you start linking modules using the flags that
allow them to leave the Perl symbols unresolved, you run the risk of also
leaving *other* shared library symbols unresolved, including,
specifically, libc.

Now, you may think that this is not really a problem, because the main
binary has a dependency on libc and thus those symbols will be resolved at
dlopen() time just like the Perl symbols. The problem is that, at least as
I remember this, the symbol version is chosen at link time. A module only
knows that it needs strerror@GLIBC_2.2.5 because it was linked with libc.
If it's *not* linked with libc, the undefined symbol will just be strerror
without the version. That in turn means, I believe, that it
resolves to the latest version of strerror found in the process space,
which can be the wrong version of strerror if the ABI has changed. (It
hasn't for strerror, but it has for some other things.)

You can detect this from readelf on the module *.so file, but you can't
detect this by looking for the @GLIBC symbol versions because, when the
module was not linked against libc, no symbol version is recorded. You have
to look for bare undefined strerror references with no symbol version,
which would indeed represent a bug. But you *don't* want to alert on the
unresolved Perl symbols, because those are intentional. So that's how we
get back to having to know which symbols are found in libc and therefore
should have attached symbol versions.
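
To make that concrete, here is a rough Python sketch of such a check. It is
not how Lintian implements it. It assumes the input is the text printed by
`readelf --dyn-syms -W` on the module, and the LIBC_SYMBOLS set, the
function name, and the boot_Example symbol in the sample are all made up
for illustration. A real check would need the complete list of libc
symbols from somewhere.

```python
# Hypothetical, deliberately tiny stand-in for "symbols found in libc".
LIBC_SYMBOLS = {"strerror", "printf", "malloc", "free", "memcpy"}


def unversioned_libc_refs(dynsym_text, libc_symbols=LIBC_SYMBOLS):
    """Return undefined libc symbols that carry no symbol version.

    dynsym_text is assumed to be `readelf --dyn-syms -W module.so` output.
    """
    hits = []
    for line in dynsym_text.splitlines():
        fields = line.split()
        # Symbol rows look like:
        #   Num: Value Size Type Bind Vis Ndx Name [(version index)]
        if len(fields) < 8 or not fields[0].rstrip(":").isdigit():
            continue  # header, blank line, or the nameless entry 0
        ndx, name = fields[6], fields[7]
        if ndx != "UND":
            continue  # defined in this object, not an unresolved reference
        if "@" in name:
            continue  # already versioned, e.g. malloc@GLIBC_2.2.5
        if name in libc_symbols:
            hits.append(name)  # bare libc reference: the suspicious case
    return hits


# Sample input in the shape readelf prints; the values are invented.
SAMPLE = """\
Symbol table '.dynsym' contains 5 entries:
   Num:    Value          Size Type    Bind   Vis      Ndx Name
     0: 0000000000000000     0 NOTYPE  LOCAL  DEFAULT  UND
     1: 0000000000000000     0 FUNC    GLOBAL DEFAULT  UND malloc@GLIBC_2.2.5 (2)
     2: 0000000000000000     0 FUNC    GLOBAL DEFAULT  UND strerror
     3: 0000000000000000     0 FUNC    GLOBAL DEFAULT  UND Perl_sv_setiv
     4: 0000000000001139    42 FUNC    GLOBAL DEFAULT   14 boot_Example
"""

print(unversioned_libc_refs(SAMPLE))
```

On that sample the versioned malloc reference and the intentionally
unresolved Perl symbol are both skipped, and only the bare strerror
reference is reported.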

-- 
Russ Allbery (r...@debian.org)              <https://www.eyrie.org/~eagle/>
