On 2024/03/03 14:29, Stuart Henderson wrote:
> On 2024/03/03 13:19, Lucas Gabriel Vuotto wrote:
> > On Sun, Mar 03, 2024 at 11:58:51AM +0000, Stuart Henderson wrote:
> > > On 2024/03/02 14:46, Theo de Raadt wrote:
> > > > Is this a situation where two libc's are being loaded into the address
> > > > space?  And the 2nd one is refused for pinsyscalls & msyscall, etc etc.
> > > 
> > > It seems the most likely cause. Console output from running with
> > > LD_DEBUG set in the environment would probably confirm (and would be
> > > more useful than kdump).
> > 
> > See end of this mail.
> > 
> > > I can't replicate it here on a system with new libc (I only tried
> > > starting gajim and poking in the UI, not connecting to any servers).
> > 
> > ftr, I don't even get to the UI.
> 
> Ah, I can replicate if I ldconfig -R.
> 
> > > I'm a bit surprised why a mixture of libs would happen there at all
> > > (unless something had been rebuilt locally) but don't see another reason
> > > to hit the msyscall error.
> > 
> > Nothing has been locally rebuilt.
> > 
> > LD_DEBUG indeed shows that libc.so.98.0 is loaded and libc.so.99.0 is
> > attempted to load.
> 
> <snip>
> > dlsym: gtk_get_minor_version in /usr/local/lib/libgtk-3.so.2201.0: 0x17287b9f300
> > dlsym: gtk_get_micro_version in /usr/local/lib/libgtk-3.so.2201.0: 0x17287b9f330
> > dlsym: pango_version_string in /usr/local/lib/libpango-1.0.so.3801.4: 0x172ed038d60
> > dlopen: loading: libc.so.99.0
> > msyscall 1732a806000 a8000 error
> 
> Coming from ...
> 
> Breakpoint 1.1, dlopen (libname=0x98b61cf06e0 "libc.so.99.0", flags=2) at /usr/src/libexec/ld.so/dlfcn.c:64
> 64            if (flags & ~OK_FLAGS) {
> (gdb) bt
> #0  dlopen (libname=0x98b61cf06e0 "libc.so.99.0", flags=2) at /usr/src/libexec/ld.so/dlfcn.c:64
> #1  0x0000098b93dc7d01 in py_dl_open () from /usr/local/lib/python3.10/lib-dynload/_ctypes.cpython-310.so
> #2  0x0000098bb0dc1bc1 in cfunction_call () from /usr/local/lib/libpython3.10.so.0.0
> #3  0x0000098bb0d6a132 in _PyObject_MakeTpCall () from /usr/local/lib/libpython3.10.so.0.0
> <snip>
> 
> so something is doing dlopen("libc.so.99.0", RTLD_NOW) ...
> 
> (gdb) py-bt
> Traceback (most recent call first):
>   <built-in method dlopen of module object at remote 0xce92ca2bab0>
>   File "/usr/local/lib/python3.10/ctypes/__init__.py", line 374, in __init__
>     self._handle = _dlopen(self._name, mode)
>   File "/usr/local/lib/python3.10/site-packages/gajim/main.py", line 147, in _set_proc_title
>     libc = CDLL(find_library('c'))
>   File "/usr/local/lib/python3.10/site-packages/gajim/main.py", line 168, in run
>     _set_proc_title()
>   File "/usr/local/bin/gajim", line 8, in <module>
>     sys.exit(run())
> 
> aha: gajim is calling setproctitle via ctypes, which dlopen()'s libc.so
> (without a specific version number). ld.so is picking the latest and
> loading it, but libc.so.98.0 was already loaded, so we hit msyscall
> error.

oh, it's not ld.so that is picking the latest version, it's python's
ctypes code (find_library), which parses the output of "ldconfig -r"
to decide.

I don't think there's anything we can sanely do in ld.so to work
around this.
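
For illustration, a rough sketch of how a caller could avoid asking for
a versioned libc at all. It assumes that CDLL(None), i.e. dlopen(NULL),
resolves symbols from the libc already mapped into the process; I have
not verified this against gajim, and the helper name set_proc_title is
made up for the example.

```python
import ctypes
import os

# Assumption: passing None makes ctypes call dlopen(NULL), which returns a
# handle for the running program, so lookups resolve against the libc that
# is already loaded rather than one chosen from "ldconfig -r" output.
libc = ctypes.CDLL(None)


def set_proc_title(title: str) -> bool:
    """Hypothetical helper: set the process title if libc offers setproctitle."""
    # ctypes raises AttributeError for a missing symbol, so getattr with a
    # default turns "not available on this platform" into None.
    func = getattr(libc, "setproctitle", None)
    if func is None:
        return False
    # setproctitle(3) takes a printf-style format; pass the title as a "%s"
    # argument so any '%' characters in it are not interpreted.
    func(b"%s", title.encode())
    return True


# Sanity check that the handle really refers to the loaded libc.
same_pid = libc.getpid() == os.getpid()
```

Because no library name is passed, no second libc can be requested, so
the msyscall error above should not be reachable through this path.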
