Re: [PATCH] Allow elf header to be used when __BEGIN_DECLS and __END_DECLS aren't defined.

2020-10-27 Thread Florian Weimer via Elfutils-devel
* Érico Nogueira via Libc-alpha:

> This is mostly an initial PoC, and an additional comment for why this
> is needed could be added to the file.
>
> I accidentally sent the wrong patch a while ago, sorry!

extern "C++" does not have any effect for this header because it does
not declare anything that has language linkage.

I think you need to remove  to make the file self-contained.

Thanks,
Florian
-- 
Red Hat GmbH, https://de.redhat.com/ , Registered seat: Grasbrunn,
Commercial register: Amtsgericht Muenchen, HRB 153243,
Managing Directors: Charles Cachera, Brian Klemm, Laurie Krebs, Michael O'Neill



Re: [musl] Re: [QUESTION] Which fnmatch() functionality does elfutils depend on?

2020-10-27 Thread Florian Weimer via Elfutils-devel
* Rich Felker:

> As I stated in my other reply, I'm opposed to that because it does not
> admit implementation with the same (very desirable) big-O properties,
> and the "extmatch" syntax is not widely known or widely used.

The syntax comes from ksh and is used in shell scripts.  (bash requires
the extglob option to enable it, which makes it easy to search for
instances.)

Thanks,
Florian
-- 
Red Hat GmbH, https://de.redhat.com/ , Registered seat: Grasbrunn,
Commercial register: Amtsgericht Muenchen, HRB 153243,
Managing Directors: Charles Cachera, Brian Klemm, Laurie Krebs, Michael O'Neill



Re: Writing core files to contain buildids

2020-10-27 Thread Mark Wielaard
Hi Paul,

On Mon, 2020-10-26 at 03:06 -0400, Paul Smith wrote:
> We use this for various reasons rather than relying on the Linux
> kernel
> coredumper and it's worked well for us for many years now.  However, I
> would like to start using build IDs in our shared libs and binaries and
> to take best advantage of that I need to ensure that the build ID
> values appear in the generated core files, which, at the moment, they
> do not unfortunately.
> 
> I'm hoping someone can point me to some information about what extra
> content I need to make sure is preserved in the generated coredump so
> that I can retrieve build IDs from it.

The basic idea behind getting buildids into core files is that they
(the GNU ELF notes) are at the start of the file in the first page that
is dumped (together with the phdrs) in the core file so when core file
consumers can walk the program headers and find the buildid note
(almost immediately following those headers).

A few more technical details can be found at:
https://fedoraproject.org/wiki/Releases/FeatureBuildId#Include_build_IDs_in_core_dumps

Hope that helps,

Mark


Re: [PATCH v2] Stop depending on strndupa.

2020-10-27 Thread Mark Wielaard
Hi Érico,

On Mon, 2020-10-26 at 15:33 -0300, Érico Nogueira via Elfutils-devel
wrote:
> From: Érico Rolim 
> 
> This functon has inherent safety issues, since a long enough path can
> lead to memory clobbering. Instead, limit the path length to
> PATH_MAX.

I like getting rid of strndupa, because it can accidentally blow up the
stack. But replacing it with an on stack array of PATH_MAX also carries
a risk of blowing up the stack because the function calls itself
recursively and now always uses a huge stack allocation for every
iteration. Can we simply use strndup and free (dir) when we are done
instead?

> Also add braces around while loop.

That is better, but note that GNU style is
while (...)
  {
...
  }
> Signed-off-by: Érico Rolim 

Thanks. Note that it is also missing a ChangeLog entry.

Cheers,

Mark


Re: [musl] Re: [QUESTION] Which fnmatch() functionality does elfutils depend on?

2020-10-27 Thread Mark Wielaard
On Tue, 2020-10-27 at 10:19 +0100, Florian Weimer via Elfutils-devel
wrote:
> * Rich Felker:
> 
> > As I stated in my other reply, I'm opposed to that because it does not
> > admit implementation with the same (very desirable) big-O properties,
> > and the "extmatch" syntax is not widely known or widely used.
> 
> The syntax comes from ksh and is used in shell scripts.  (bash requires
> the extglob option to enable it, which makes it easy to search for
> instances.)

Right, it is also adopted by zsh and some other shells. The big-O
properties don't really matter in this case because fnmatch is used on
small input strings like file names (or in this case section names).

Cheers,

Mark


Re: [PATCH] Fix leb128 reading

2020-10-27 Thread Mark Wielaard
Hi Tom,

On Mon, 2020-10-26 at 16:54 -0600, Tom Tromey wrote:
> > > In this patch, I chose to try to handle weird leb128 encodings by
> > > preserving their values when possible; or returning the maximum value
> > > on overflow.  It isn't clear to me that this is necessarily the best
> > > choice.
> 
> Mark> I think it is a good thing to accept "padded" leb128 numbers, but
> Mark> probably up to a point. The padding will probably be either mod-4 or
> Mark> mod-8 bytes. So lets at most read up to 16 bytes total.
> 
> Now I think all this is just overkill.  Actually-existing leb128
> relocations (see commit 7d81bc93 in binutils) seem to shrink the
> representation.  There was some talk in the Rust leb128 decoder issues
> about this topic, but that was a wish-list thing possibly for wasm -- so
> not exactly a need.

OK. Lets drop it then. It wasn't really supported earlier (not for more
than 10 bytes anyway). So it shouldn't regress anything.

> Mark> Is the really just (1u << (8 * 8 - 7 * 9)) - 1 = (1u << 1) - 1 = 1
> Mark> So we are really only interested in the lowest bit?
> 
> Yeah.  We could hard-code that if you prefer, but I wrote it this way to
> pretend that maybe the types in the code would be changed (widened) some
> day.

I would like at least a comment. It took me a while to see it really
was just 1, and it might not be immediately clear why that is the
correct value.

> > > +  if (unlikely (value == UINT64_MAX))
> > > +return INT64_MAX;
> 
> Mark> Here you need to help me out a bit, wouldn't UINT64_MAX also be the
> Mark> representation of -1?
> 
> Yes.  Normally in sleb128 I guess -1 would be written {0x7f}.  However
> one could write it with any number of 0xff prefix bytes first.  This
> weirdness is one reason I changed my mind on handling over-long leb128
> sequences.
> 
> I guess the only solution here is to write 2 copies of each function.

That is what I would do, simply because I get headaches when trying to
reason about these kind of signed/unsigned conversions. Of course you
would still have some of that in the sleb128 variant because you'll
have to use unsigned arithmetic and cast to signed at the end.

> Mark> But what I don't fully understand is how this works with a
> "padded"
> Mark> leb128, won't we possibly forget to negate in that case?
> 
> It isn't clear to me what to do when a sleb128 specifies more than 64
> bits.  Another reason perhaps to avoid this area.

There is not much we can do, except flag it as overflow. Any constant
value in DWARF that represents a > 64bit value is problematic (see for
example the DWARF5 DW_FROM_data16, which the spec claims is a
"constant", but we have to treat it as block form).

Cheers,

Mark


Re: [musl] Re: [QUESTION] Which fnmatch() functionality does elfutils depend on?

2020-10-27 Thread Rich Felker
On Tue, Oct 27, 2020 at 04:04:44PM +0100, Mark Wielaard wrote:
> On Tue, 2020-10-27 at 10:19 +0100, Florian Weimer via Elfutils-devel
> wrote:
> > * Rich Felker:
> > 
> > > As I stated in my other reply, I'm opposed to that because it does not
> > > admit implementation with the same (very desirable) big-O properties,
> > > and the "extmatch" syntax is not widely known or widely used.
> > 
> > The syntax comes from ksh and is used in shell scripts.  (bash requires
> > the extglob option to enable it, which makes it easy to search for
> > instances.)
> 
> Right, it is also adopted by zsh and some other shells. The big-O
> properties don't really matter in this case because fnmatch is used on
> small input strings like file names (or in this case section names).

They do because they're also in space, unless you want
exponential-time which is huge even on small inputs, and greater than
O(1) space requirement means the interface can't satisfy its contract
to return a conclusive result for valid inputs.

Rich


[PATCH v3] unstrip: Stop using strndupa.

2020-10-27 Thread Érico Nogueira via Elfutils-devel
From: Érico Rolim 

This functon has inherent safety issues, since a long enough path can
lead to memory clobbering. Due to the recursive nature of
make_directories(), multiple calls could also stack overflow. Instead,
the string can be allocated in the heap.

As a bonus, this improves musl compatibility, since musl doesn't include
the strndupa macro for now.

Also add braces around while loop.

Signed-off-by: Érico Rolim 
---
 ChangeLog |  4 
 src/unstrip.c | 16 +++-
 2 files changed, 15 insertions(+), 5 deletions(-)

diff --git a/ChangeLog b/ChangeLog
index 72e8397c..f82010f8 100644
--- a/ChangeLog
+++ b/ChangeLog
@@ -1,3 +1,7 @@
+2020-10-27  Érico N. Rolim  
+
+   * unstrip: Stop using strndupa.
+
 2020-10-01  Frank Ch. Eigler  
 
PR25461
diff --git a/src/unstrip.c b/src/unstrip.c
index a855038a..0257d9cc 100644
--- a/src/unstrip.c
+++ b/src/unstrip.c
@@ -311,12 +311,18 @@ make_directories (const char *path)
   if (lastslash == path)
 return;
 
-  char *dir = strndupa (path, lastslash - path);
+  char *dir = strndup (path, lastslash - path);
+  if (dir == NULL)
+error(EXIT_FAILURE, errno, _("memory exhausted"));
+
   while (mkdir (dir, 0777) < 0 && errno != EEXIST)
-if (errno == ENOENT)
-  make_directories (dir);
-else
-  error (EXIT_FAILURE, errno, _("cannot create directory '%s'"), dir);
+{
+  if (errno == ENOENT)
+make_directories (dir);
+  else
+error (EXIT_FAILURE, errno, _("cannot create directory '%s'"), dir);
+}
+  free (dir);
 }
 
 /* Keep track of new section data we are creating, so we can free it
-- 
2.29.0



Re: [PATCH v2] Stop depending on strndupa.

2020-10-27 Thread Érico Nogueira via Elfutils-devel
On Tue Oct 27, 2020 at 1:01 PM -03, Mark Wielaard wrote:
> Hi Érico,
>
> On Mon, 2020-10-26 at 15:33 -0300, Érico Nogueira via Elfutils-devel
> wrote:
> > From: Érico Rolim 
> > 
> > This functon has inherent safety issues, since a long enough path can
> > lead to memory clobbering. Instead, limit the path length to
> > PATH_MAX.
>
> I like getting rid of strndupa, because it can accidentally blow up the
> stack. But replacing it with an on stack array of PATH_MAX also carries
> a risk of blowing up the stack because the function calls itself
> recursively and now always uses a huge stack allocation for every
> iteration. Can we simply use strndup and free (dir) when we are done
> instead?

Good point, I hadn't paid attention to it being recursive. New patch is
using the heap.

>
> > Also add braces around while loop.
>
> That is better, but note that GNU style is
> while (...)
> {
> ...
> }

Fixed.

> > Signed-off-by: Érico Rolim 
>
> Thanks. Note that it is also missing a ChangeLog entry.
>
> Cheers,
>
> Mark

Thanks,
Érico


Re: Writing core files to contain buildids

2020-10-27 Thread Paul Smith
Thanks Mark,

On Tue, 2020-10-27 at 15:39 +0100, Mark Wielaard wrote:
> The basic idea behind getting buildids into core files is that they
> (the GNU ELF notes) are at the start of the file in the first page
> that is dumped (together with the phdrs) in the core file so when
> core file consumers can walk the program headers and find the buildid
> note (almost immediately following those headers).

Something is missing; if I check the binary I see a build ID in it.  I
can control whether the userspace coredumper or the kernel coredumper
is used via an environment variable: the kernel-generated core contains
a full set of build ID values including vdso.so, the binary, and all
normal .so's.

But when using the userspace coredumper I get a completely valid
coredump that GDB is happy with, for example, but there is only the
vdso.so build ID present: I don't see any build IDs from the binary or
shared libraries.

I guess another possibility is that they are there but eu-unstrip
--core can't find them?

The userspace coredumper uses the output of /proc/self/maps and
/proc/self/smaps to determine what memory to dump; the code is a bit
hairy but I will investigate what's going on.

The parsing of the maps starts here (I don't really expect anyone to
read this but if you're curious):

https://github.com/madscientist/google-coredumper/blob/1b64fbc282ba08654853f34c2bda00ffe5e26855/src/elfcore.c#L760

and the write-out starts here:

https://github.com/madscientist/google-coredumper/blob/1b64fbc282ba08654853f34c2bda00ffe5e26855/src/elfcore.c#L1019

> A few more technical details can be found at:
> https://fedoraproject.org/wiki/Releases/FeatureBuildId#Include_build_IDs_in_core_dumps

Thanks I did see that but I will read it harder.



Re: [PATCH v3] unstrip: Stop using strndupa.

2020-10-27 Thread Mark Wielaard
Hi Érico,

On Tue, Oct 27, 2020 at 04:19:58PM -0300, Érico Nogueira via Elfutils-devel 
wrote:
> This functon has inherent safety issues, since a long enough path can
> lead to memory clobbering. Due to the recursive nature of
> make_directories(), multiple calls could also stack overflow. Instead,
> the string can be allocated in the heap.
> 
> As a bonus, this improves musl compatibility, since musl doesn't include
> the strndupa macro for now.
> 
> Also add braces around while loop.

Looks good. Pushed.
I did move and fixup the ChangeLog entry.
(each subdir has its own ChangeLog and they should tell what exactly change)

Thanks,

Mark


Re: [musl] Re: [QUESTION] Which fnmatch() functionality does elfutils depend on?

2020-10-27 Thread Mark Wielaard
Hi Rich,

On Tue, Oct 27, 2020 at 01:08:17PM -0400, Rich Felker wrote:
> On Tue, Oct 27, 2020 at 04:04:44PM +0100, Mark Wielaard wrote:
> > Right, it is also adopted by zsh and some other shells. The big-O
> > properties don't really matter in this case because fnmatch is used on
> > small input strings like file names (or in this case section names).
> 
> They do because they're also in space, unless you want
> exponential-time which is huge even on small inputs, and greater than
> O(1) space requirement means the interface can't satisfy its contract
> to return a conclusive result for valid inputs.

But that isn't the contract if fnmatch. fnmatch returns 0 for a match
and non-zero for either a non-match or some error. So if your
algorithm hits some error case, like out of memory, returning a
non-zero result is fine.

I believe the extended wildcard pattern are widely supported and
useful. If you don't want to implement them because they aren't in any
standardized enough yet we can ask the Austin Group to add them to
fnmatch. They have adopted other GNU flags for fnmatch in the past.

Cheers,

Mark


Re: Writing core files to contain buildids

2020-10-27 Thread Mark Wielaard
Hi Paul,

On Tue, Oct 27, 2020 at 04:20:51PM -0400, Paul Smith wrote:
> On Tue, 2020-10-27 at 15:39 +0100, Mark Wielaard wrote:
> > The basic idea behind getting buildids into core files is that they
> > (the GNU ELF notes) are at the start of the file in the first page
> > that is dumped (together with the phdrs) in the core file so when
> > core file consumers can walk the program headers and find the buildid
> > note (almost immediately following those headers).
> 
> Something is missing; if I check the binary I see a build ID in it.  I
> can control whether the userspace coredumper or the kernel coredumper
> is used via an environment variable: the kernel-generated core contains
> a full set of build ID values including vdso.so, the binary, and all
> normal .so's.
> 
> But when using the userspace coredumper I get a completely valid
> coredump that GDB is happy with, for example, but there is only the
> vdso.so build ID present: I don't see any build IDs from the binary or
> shared libraries.
> 
> I guess another possibility is that they are there but eu-unstrip
> --core can't find them?

Do you have the generated core files somehwere so others can look at
them? How exactly are you testing the build-id notes are there?

Cheers,

Mark


Re: [musl] Re: [QUESTION] Which fnmatch() functionality does elfutils depend on?

2020-10-27 Thread Rich Felker
On Tue, Oct 27, 2020 at 11:19:11PM +0100, Mark Wielaard wrote:
> Hi Rich,
> 
> On Tue, Oct 27, 2020 at 01:08:17PM -0400, Rich Felker wrote:
> > On Tue, Oct 27, 2020 at 04:04:44PM +0100, Mark Wielaard wrote:
> > > Right, it is also adopted by zsh and some other shells. The big-O
> > > properties don't really matter in this case because fnmatch is used on
> > > small input strings like file names (or in this case section names).
> > 
> > They do because they're also in space, unless you want
> > exponential-time which is huge even on small inputs, and greater than
> > O(1) space requirement means the interface can't satisfy its contract
> > to return a conclusive result for valid inputs.
> 
> But that isn't the contract if fnmatch. fnmatch returns 0 for a match
> and non-zero for either a non-match or some error. So if your
> algorithm hits some error case, like out of memory, returning a
> non-zero result is fine.
> 
> I believe the extended wildcard pattern are widely supported and
> useful. If you don't want to implement them because they aren't in any
> standardized enough yet we can ask the Austin Group to add them to
> fnmatch. They have adopted other GNU flags for fnmatch in the past.

And I can ask them not to. Your hostility is really unwelcome here.

Rich


Re: Writing core files to contain buildids

2020-10-27 Thread Paul Smith
On Tue, 2020-10-27 at 23:23 +0100, Mark Wielaard wrote:
> Do you have the generated core files somehwere so others can look at
> them? How exactly are you testing the build-id notes are there?

I don't have one to publish but I can create one.  That's a good idea
anyway as it will be simpler to debug than the actual system which is
much more complex.

I'm using eu-unstrip -n --core core.xxx to check for build IDs.