"Olivier Meyer" <[EMAIL PROTECTED]> writes:

> Most of what you see is the libc setting up default signal stuff.
> After the ELF is loaded mprotect is used to make the area executable,
> so when EIP is set to the starting point, the program does not SEGV.

Erm. No. Sorry, not correct at all. Wouldn't it be nice to label
speculation with "I'm speculating here, but I think ..." instead of
misleading people?

The stuff that the kernel does to start the program ends on the line
that says:

> >   9465 prog     RET   execve 0

Everything after this runs in userland, first in ld.so and then in libc.

> > Then comes stuff I don't really understand -
> >
> >   9465 prog     CALL  mquery(0,0x82000,0x5,0,0x3,0,0,0)
> >   9465 prog     RET   mquery 217501696/0xcf6d000
> >   9465 prog     CALL  mquery(0x2cf6d000,0xd000,0x1,0x10,0xffffffff,0,0,0)
> >   9465 prog     RET   mquery 754372608/0x2cf6d000
> >   9465 prog     CALL  mquery(0x2cf7a000,0x3000,0x3,0x10,0xffffffff,0,0,0)
> >   9465 prog     RET   mquery 754425856/0x2cf7a000
> >   9465 prog     CALL  mquery(0x2cf7d000,0x2000,0x3,0x10,0xffffffff,0,0,0)
> >   9465 prog     RET   mquery 754438144/0x2cf7d000
> >   9465 prog     CALL  mquery(0x2cf7f000,0x1000,0x3,0x10,0xffffffff,0,0,0)
> >   9465 prog     RET   mquery 754446336/0x2cf7f000
> >   9465 prog     CALL  mquery(0x2cf80000,0x1e000,0x3,0x10,0xffffffff,0,0,0)
> >   9465 prog     RET   mquery 754450432/0x2cf80000
> >   9465 prog     CALL  mmap(0xcf6d000,0x82000,0x5,0x12,0x3,0,0,0)
> >   9465 prog     RET   mmap 217501696/0xcf6d000
> >   9465 prog     CALL  mmap(0x2cf6d000,0xd000,0x1,0x12,0x3,0,0x82000,0)
> >   9465 prog     RET   mmap 754372608/0x2cf6d000
> >   9465 prog     CALL  mmap(0x2cf7a000,0x3000,0x3,0x12,0x3,0,0x8f000,0)
> >   9465 prog     RET   mmap 754425856/0x2cf7a000
> >   9465 prog     CALL  mmap(0x2cf7d000,0x2000,0x3,0x12,0x3,0,0x91000,0)
> >   9465 prog     RET   mmap 754438144/0x2cf7d000
> >   9465 prog     CALL  mmap(0x2cf7f000,0x1000,0x3,0x12,0x3,0,0x92000,0)
> >   9465 prog     RET   mmap 754446336/0x2cf7f000
> >   9465 prog     CALL  mmap(0x2cf80000,0x1e000,0x3,0x1012,0xffffffff,0,0,0)
> >   9465 prog     RET   mmap 754450432/0x2cf80000
> >   9465 prog     CALL  close(0x3)
> >   9465 prog     RET   close 0

This is where a library (in this case libc) is mapped into memory by
ld.so. It looks this complicated because instead of allocating one
chunk of memory and mapping the whole library at once, we map each
section separately to satisfy its memory protection needs. This
matters especially on i386 (which you seem to be running), where there
is a 512MB gap between the text and data sections in the library
because of how execute protection is handled. If you're really
interested, you can read libexec/ld.so/library_mquery.c; the function
is _dl_tryload_shlib().
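
To read the protection argument in the trace above: the third argument
to mquery, mmap and mprotect is a bit mask where 0x1 is read, 0x2 is
write and 0x4 is execute. A small decoder as a sketch; prot_to_string()
and prot_is() are names I made up for illustration, nothing in ld.so is
called that.

```c
#include <assert.h>
#include <string.h>
#include <sys/mman.h>

/* Hypothetical helper: render a PROT_* mask as "rwx"-style text.
 * out must have room for 4 bytes (three flags plus the terminator). */
void prot_to_string(int prot, char out[4])
{
    assert(out != NULL);
    out[0] = (prot & PROT_READ)  ? 'r' : '-';
    out[1] = (prot & PROT_WRITE) ? 'w' : '-';
    out[2] = (prot & PROT_EXEC)  ? 'x' : '-';
    out[3] = '\0';
}

/* Hypothetical convenience wrapper: returns 1 if prot renders as expect. */
int prot_is(int prot, const char *expect)
{
    char buf[4];

    prot_to_string(prot, buf);
    return strcmp(buf, expect) == 0;
}
```

So the 0x5 mapping is r-x (the text), 0x1 is r-- and 0x3 is rw-, which
is why one library ends up as several mappings.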

> >   9465 prog     CALL  mprotect(0xcf6d000,0x81d56,0x7)
> >   9465 prog     RET   mprotect 0
> >   9465 prog     CALL  mprotect(0x2cf6d000,0xc3a1,0x3)
> >   9465 prog     RET   mprotect 0
> >   9465 prog     CALL  mprotect(0xcf6d000,0x81d56,0x5)
> >   9465 prog     RET   mprotect 0
> >   9465 prog     CALL  mprotect(0x2cf6d000,0xc3a1,0x1)
> >   9465 prog     RET   mprotect 0
> >   9465 prog     CALL  mprotect(0xcf6d000,0x81d56,0x7)
> >   9465 prog     RET   mprotect 0
> >   9465 prog     CALL  mprotect(0x2cf6d000,0xc3a1,0x3)
> >   9465 prog     RET   mprotect 0
> >   9465 prog     CALL  mprotect(0xcf6d000,0x81d56,0x5)
> >   9465 prog     RET   mprotect 0
> >   9465 prog     CALL  mprotect(0x2cf6d000,0xc3a1,0x1)
> >   9465 prog     RET   mprotect 0
> >   9465 prog     CALL  mprotect(0x2cf7d000,0x2000,0x1)
> >   9465 prog     RET   mprotect 0

After the library is mapped, it needs to be relocated. The default
mappings have write permission disabled, but some relocations need to
write into the sections that hold relocation information, so writes
must be temporarily enabled for those mappings and disabled again once
relocation is done. Why this is done twice looks like magic to me.
There is probably a reason I don't remember or was never aware of, or
there is some bug here.

> > What puzles me most is the subsequent storm of sigprocmask():
> > what are these really for? Who is really doing this - my prog
> > doesn't really chagnge its sigset.

This is caused by lazy binding in dynamic relocation. Symbol lookup
takes a lot of work, so the dynamic linker doesn't resolve most of the
symbols it needs at startup, since that would slow down program
startup a lot. Instead the cost is amortized over the whole runtime of
the program: every call to a not yet resolved function jumps into
ld.so, which does the binding.
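
If you want a picture of lazy binding without reading ld.so, here is a
toy version using a plain function pointer. It is only a sketch of the
idea: the real thing goes through the PLT and GOT, and every name below
(twice_slot, lookup_symbol() and so on) is invented for the example.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

typedef int (*fn_t)(int);

static int resolve_count;               /* how many lookups were done */

static int real_twice(int x) { return 2 * x; }

/* Stands in for the expensive symbol lookup done by the linker. */
static fn_t lookup_symbol(const char *name)
{
    resolve_count++;
    return strcmp(name, "twice") == 0 ? real_twice : NULL;
}

static int twice_stub(int x);

/* The slot every caller goes through. It starts out pointing at the
 * stub and is overwritten with the real function on first use. */
static fn_t twice_slot = twice_stub;

static int twice_stub(int x)
{
    twice_slot = lookup_symbol("twice"); /* bind once */
    assert(twice_slot != NULL);          /* the toy symbol always exists */
    return twice_slot(x);                /* then finish the original call */
}

int call_twice(int x) { return twice_slot(x); }
int lookups_done(void) { return resolve_count; }
```

The first call_twice() pays for the lookup, later calls go straight to
real_twice(). Note that binding means writing to twice_slot.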

There are two problems with this. One is that the memory is protected
against writing; that's why there are calls to mprotect all over the
place. The other is that dynamic binding isn't reentrant, so we don't
want signals (whose handlers might perform their own lazy bindings)
delivered while doing this. That's why there are calls to sigprocmask.
Lazy binding is slow and some applications speed up substantially when
it's disabled, but many "modern" applications have come to rely on this
behavior for doing their own tricks with dlopen(), so it can't really
be disabled.

The relocations in the last part of the ktrace cover everything that
libc does on startup, then a relocation of main(), and after main
returns, some relocations needed for cleanup and a relocation of
exit(). Then comes the final syscall:

> >   9465 prog     CALL  exit(0)

If you ktrace a statically linked program, you'll see everything that
libc actually does to set up and tear down your program. If you look
carefully at the output of your dynamically linked program, you'll
find all of this behind the noise from ld.so.

$ cat > foo.c
int main() { return 0; }
$ cc -static -o foo foo.c
$ ktrace ./foo
$ kdump
  2153 ktrace   RET   ktrace 0
  2153 ktrace   CALL  execve(0x7f7fffff910f,0x7f7fffff8c78,0x7f7fffff8c88)
  2153 ktrace   NAMI  "./foo"
  2153 foo      EMUL  "native"
  2153 foo      RET   execve 0

Userland execution starts here.

  2153 foo      CALL  __sysctl(0.0,0x801360,0x7f7ffffe62b0,0,0)
  2153 foo      RET   __sysctl 0

Here the program fetches a random number to set up the canary for
the stack protector.

  2153 foo      CALL  mmap(0,0x1000,0x3,0x1002,0xffffffff,0,0)
  2153 foo      RET   mmap 1192062976/0x470d7000

Here a page is allocated for atexit function pointers...

  2153 foo      CALL  mprotect(0x470d7000,0x1000,0x1)
  2153 foo      RET   mprotect 0

...and then this page is made read-only to guard against attacks that
change atexit function pointers.

At this point, where ktrace logs no syscalls, main is called. It then
returns, so exit() is called. exit() processes all the atexit hooks
and then unmaps the atexit page and exits the program.

  2153 foo      CALL  munmap(0x470d7000,0x1000)
  2153 foo      RET   munmap 0
  2153 foo      CALL  exit(0)
$ 

//art

ps. Yes, it's a slow day at work, so I have time to talk too much.
