On Tue, 17 May 2005, [EMAIL PROTECTED] yowled:
> On Saturday 14 May 2005 14:00, Nix wrote:
>> At any rate, specifying the paths explicitly won't do any harm
>> anywhere. :)
> On 2.6 we just removed -L/usr/lib from the linker command line  because of 
> some warnings.

EXPN? (Are they serious? What are they?)

>                So, we could conditionalize that (maybe -D_SEARCH_PATH=). I'm 
> not going to merge it in this form, so send me the cleaned up version.

If I could work out what to conditionalize it on, sure.

We could search only /lib, and skip /usr/lib: I think libutil is there
on every Linux platform, at least. (Do we care about portability to
non-Linux hosts? I don't *think* so...)

> Also, I'd like to ask some questions to you, my dear compiler guy.

I can try to answer, though I'm a tyro compared to the *real* compiler
gods. (Most of these questions seem binutils/glibc-related anyhow.)

> 0) When is RH going to drop LinuxThreads support (as opposed to NPTL)?

Um, ask RH? All I know is that Ulrich Drepper has said that the next
major release of glibc will not have LinuxThreads in it. Daniel
Jacobowitz has indicated a willingness to keep it alive separately.

I don't expect the distributors to drop it for a while: they care about
backward compatibility. (My systems at home have all dropped it, except
for those that run UML: nothing else I run seems to have cared.)

> 1) UML does not compile with TT mode enabled when the host has a NPTL-only 
> glibc installed.

Um, oo-er? (Good thing I've compiled in skas mode.)

>                  This showed up on Gentoo, and will show up on newer RedHat 
> since I heard they're dropping LinuxThreads away.

Sooner or later, I expect them to. I just can't say *when*. Only they
can do that. :)

But fundamentally, LinuxThreads isn't being actively maintained by
anyone, and NPTL is, and NPTL is better in just about every way (except
that it places more extreme demands on the kernel and binutils/ABI, so
some platforms don't support it yet). So eventually I expect it to die.

> Now, the problem we're currently having is due to this:
> 
> arch/um/kernel/uml.lds.S:
> 
>   .thread_private : {
>     __start_thread_private = .;
>     errno = .;
>     . += 4;
>     arch/um/kernel/tt/unmap_fin.o (.data)
>     __end_thread_private = .;
>   }
> 
> That gives a "overflow of program headers, allocated #n needed #m > n" (more 
> or less). Would that be fixable at the linker script level? What it does is 
> to provide a errno definition for that segment of code since it remaps the 
> code while UML is executing.

eee-ick. That'll really collide nastily with glibc's assumption that
errno is located in thread-local storage, I fear.

In fact, glibc's assumptions are even worse. I spent entirely too long
working this stuff out early this year, and because misery loves company
I'll spread the word as best I understand it. (I think only Roland,
Ulrich and possibly a few other deities of that order understand it
fully.)

- inside the dynamic linker is a hidden symbol rtld_errno
  (see glibc/include/errno.h). This is what ld-linux.so.2 uses for errno.
  Nobody else can see it: I mention it only for completeness.

- outside libc, errno expands to a call to the function __errno_location()
  (see e.g. glibc/sysdeps/unix/sysv/linux/bits/errno.h)

- in libc proper, __errno_location returns, um, the address of the errno
  variable. This seems like a waste of time, but it means that the errno
  variable can be *hidden* from outside glibc, and its address can
  change. And it does. (see glibc/sysdeps/generic/errno-loc.c).

- under LinuxThreads, in single-threaded programs, `errno' is a real
  variable, and in multi-threaded programs on non-TLS platforms or those
  not supporting __thread, __errno_location() is overridden by libpthread
  to return a value appropriate to the thread using LinuxThreads's nasty
  and inefficient approximation to TLS, which requires
  (see glibc/linuxthreads/errno.c.)

- Under NPTL, and under LinuxThreads on a __thread, TLS-supporting
  platform (i.e., on glibc compiled for i686 and above, with GCC 3.4+),
  glibc exports a GLIBC_PRIVATE thread_local symbol named errno,
  which __errno_location() picks up. This symbol is exported in the
  GLIBC_2.0 symbol version set if TLS is not in use: if it is, then
  that symbol has no sensible value, and it's not exported at all.

Note that the actual storage of errno under NPTL is a matter of
cooperation between GCC, glibc and GNU ld: glibc declares errno with the
__thread storage class specifier, and GCC proceeds to emit a stereotyped
instruction sequence with suitable relocations (see
gcc-3.4.x/gcc/config/i386/i386.md, search for `Thread-local') which
CPU-specific code in the libbfd library used by GNU ld carefully relaxes
such that each reference to a given TLS symbol gets only one reference
across the entire binary or shared object (see, e.g.,
binutils/bfd/elf32-i386.c: elf32_i386_check_relocs() is most
interesting).  Once that's done, the dynamic linker does further black
magic I haven't looked into to make sure that all those TLS relocations
resolve to a single symbol across the entire app (in much the same way,
I assume, as it does to any other symbol).


Messing with this intricate dance is... dangerous, and merely working on
one platform doesn't mean it won't fail on others (perhaps glibc was
built with LinuxThreads on one machine, NPTL on that one over there,
configured with --without-__thread on this box, and not on that: it's
enough to make your head spin).

Plus, of course, you can expect the usual degree of assistance from
Ulrich if you do something zany and the dance breaks
down... (i.e. `don't do that, then' if you're lucky)

> I tried time ago, but failed, so I was curious.
> 
> Actually, I recently learned another way to do this, but I was still curious:
> 
> it was to change arch/um/kernel/tt/unmap.c to inline the syscalls and avoid 
> using errno at all (it's used in the SKAS test program I posted).

This seems much safer: glibc's assumptions about errno are very
complicated these days (see above!), and playing with errno in any way
outside those the standard allows is liable to be broken by Ulrich
without warning. (Even some of the documented ways to build glibc are
broken due to errno problems...)
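
(For what it's worth, here is a rough sketch of what "inline the syscalls"
can look like. It is not the code from unmap.c: raw_syscall3() and the
helpers are hypothetical names, and only x86-64 and AArch64 Linux are
covered. The point is that the kernel's return value, a negative errno code
on failure, comes straight back to the caller and glibc's errno is never
touched.)

```c
#include <assert.h>
#include <errno.h>        /* EBADF, and errno itself for the check below */
#include <sys/syscall.h>  /* SYS_close, SYS_getpid */

/* Hypothetical helper: issue a three-argument system call directly.
   Returns the kernel's raw result: >= 0 on success, -errno on failure. */
static long raw_syscall3(long nr, long a1, long a2, long a3)
{
#if defined(__x86_64__)
    long ret;
    __asm__ volatile ("syscall"
                      : "=a"(ret)
                      : "a"(nr), "D"(a1), "S"(a2), "d"(a3)
                      : "rcx", "r11", "memory");
    return ret;
#elif defined(__aarch64__)
    register long x8 __asm__("x8") = nr;
    register long x0 __asm__("x0") = a1;
    register long x1 __asm__("x1") = a2;
    register long x2 __asm__("x2") = a3;
    __asm__ volatile ("svc 0"
                      : "+r"(x0)
                      : "r"(x8), "r"(x1), "r"(x2)
                      : "memory");
    return x0;
#else
#error "this sketch only covers x86-64 and AArch64 Linux"
#endif
}

/* getpid() without going through libc. */
long raw_getpid(void)
{
    return raw_syscall3(SYS_getpid, 0, 0, 0);
}

/* Closing an invalid descriptor fails with -EBADF in the return value,
   while errno keeps whatever it held before the call. */
void check_raw_close(void)
{
    errno = 0;
    long r = raw_syscall3(SYS_close, -1, 0, 0);
    assert(r == -EBADF);
    assert(errno == 0);
}
```

Code that must run while its surroundings are being remapped can test the
return value directly and never needs an errno definition of its own.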

> 2) /usr/lib/libpcap.so ships with a global symbol named "vmap" which overlaps 
> with the kernel vmap() function.

Joy. Thankfully you have many ways around this: ELF is flexible. Alias
symbols may be the simplest (alias vmap() to something else and refer to
the something else). Or use symbol versions to declare vmap() as private
to this executable, or even, ick, #define it to something else. Or use the
approach outlined at the end of this mail, which is arguably even
uglier.
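
(A toy sketch of the #define and alias options, assuming GCC or Clang on an
ELF target. uml_vmap, map_two_pages and kernel_vmap_alias are invented names,
the function body is a stand-in, and the global `vmap' here merely plays the
part of libpcap's symbol.)

```c
#include <assert.h>

/* Rename by preprocessor: every later `vmap` token becomes `uml_vmap`. */
#define vmap uml_vmap

/* Stand-in for the kernel function; this actually defines uml_vmap(). */
long vmap(long npages)
{
    return npages * 4096;
}

/* Existing callers keep writing vmap() and end up calling uml_vmap(). */
long map_two_pages(void)
{
    return vmap(2);
}

#undef vmap

/* Stand-in for the library's global: it no longer clashes, because the
   function above was emitted under the name uml_vmap. */
int vmap = 7;

/* Alias: a second ELF symbol for the same function (GCC/Clang extension). */
long kernel_vmap_alias(long npages) __attribute__((alias("uml_vmap")));
```

With the rename in place the object file defines uml_vmap and
kernel_vmap_alias as functions, so a library's own `vmap' can coexist
with them at link time.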

(Um, why is UML linking against libpcap? Or don't I want to know?)

> The questions are:
>  - shouldn't common symbols in libraries be forbidden

Since GCC generates uninitialized non-static globals that way (because it
saves space in the executable), er, no. Normally you *want* globals to
be, er, global. :)

>  - I've verified that that symbol could be static (it's a var. of a type 
> declared in the object and there are no relocation pointing to it, and the 
> linker is not allowed to resolve the relocations there). Isn't that stupid? 

Yes :/ Feel free to patch libpcap's optimize.c: it's the definer and the
only consumer of vmap, so it should be static. (It doesn't even need to
be hidden!)

> Would this be fixable at distro's level?

Yes. It's arguably a (small) libpcap bug.

> (Note that I already played dirty linker tricks as in 
> arch/um/kernel/tt/Makefile).

You can get in the way of that symbol resolution by creating a shared
library (libpcapture.so?) with the -Wl,--auxiliary flag, specifying
libpcap in there and linking against libpcapture: symbols defined in
libpcapture will be used in preference to those in libpcap, so you
could capture vmap() in there and call the original kernel function,
though that means exporting that function from the main executable
and it may get more annoying than it's worth.

(I can send you a trivial example of this sort of weird trickery if you
want, but not in this mail as it's past midnight and I need some
sleep. Alarm goes off at 6:30am. Working for banks sucks sometimes... :/ )

-- 
`End users are just test loads for verifying that the system works, kind of
 like resistors in an electrical circuit.' - Kaz Kylheku in c.o.l.d.s


_______________________________________________
User-mode-linux-user mailing list
User-mode-linux-user@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/user-mode-linux-user
