Re: PATCH: Report libpq version and configuration

Craig Ringer Mon, 26 Oct 2020 21:50:03 -0700

On Tue, Oct 27, 2020 at 12:56 AM Tom Lane <t...@sss.pgh.pa.us> wrote:
>
> Alvaro Herrera <alvhe...@alvh.no-ip.org> writes:
> > On 2020-Oct-26, Craig Ringer wrote:
> >> also adds PQlibInfoPrint() which dumps PQlibInfo() keys/values to stdout.
>
> > Sounds useful. I'd have PQlibInfoPrint(FILE *) instead, so you can pass
> > stdout or whichever fd you want.
>
> +1.  Are we concerned about translatability of these strings?  I think
> I'd vote against, as it would complicate applications, but it's worth
> thinking about it now not later.



The key names must not be translated; they are identifiers, not
descriptive text. I don't object to having translations as well, but
the translation teams have quite enough to do already with user-facing
text that gets seen regularly. So while it'd be potentially
interesting to expose translated versions too, I'm not entirely
convinced. It's a bit like translating macro names. You could, but ...
why?

> >> Patch 0002 exposes LIBPQ_VERSION_STR, LIBPQ_VERSION_NUM and
> >> LIBPQ_CONFIGURE_ARGS symbols in the dynamic symbol table. These can be
> >> accessed by a debugger even when the library cannot be loaded or executed,
> >> and unlike macros are available even in a stripped executable. So they can
> >> be used to identify a libpq binary found in the wild. Their storage is
> >> shared with PQlibInfo()'s static data, so they only cost three symbol table
> >> entries.
>
> > Interesting.  Is this real-world useful?
>
> -1, I think this is making way too many assumptions about the content
> and format of a shlib.


I'm not sure I understand what assumptions you're concerned about or
their consequences. On any ELF it should be just fine, and Mach-O
should be too. I do need to check that MSVC generates direct symbols
for WIN32 PE, not indirect thunked data symbols.

It doesn't help that I failed to supply the final revision of this
patch, which does this:

-const char * const LIBPQ_VERSION_STR = PG_VERSION_STR;
+const char LIBPQ_VERSION_STR[] = PG_VERSION_STR;

-const char * const LIBPQ_CONFIGURE_ARGS = CONFIGURE_ARGS;
+const char LIBPQ_CONFIGURE_ARGS[] = CONFIGURE_ARGS;

... to properly ensure the string symbols go into the read-only data section:

$ eu-nm --defined-only -D $LIBPQ | grep LIBPQ_
LIBPQ_CONFIGURE_ARGS           |0000000000028640|GLOBAL|OBJECT |00000000000000e4|  libpq-version.c:74|.rodata
LIBPQ_VERSION_NUM              |0000000000028620|GLOBAL|OBJECT |0000000000000004|  libpq-version.c:75|.rodata
LIBPQ_VERSION_STR              |0000000000028740|GLOBAL|OBJECT |000000000000006c|  libpq-version.c:73|.rodata
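
As a sanity check on those sizes, here is a small Python sketch using
the version string from the gdb example further down. With the array
form, the symbol's size should be the string length plus the
terminating NUL; the pointer form would have been a pointer-sized
object instead. The variable names are only for illustration.

```python
# Version string as reported by the build shown in this mail; other builds differ.
version_str = (
    "PostgreSQL 14devel on x86_64-pc-linux-gnu, compiled by gcc (GCC) "
    "10.2.1 20200723 (Red Hat 10.2.1-1), 64-bit"
)

# Size column that eu-nm printed for LIBPQ_VERSION_STR.
symbol_size = 0x6C

# A char array's symbol covers the text itself plus the trailing NUL byte.
expected_size = len(version_str.encode("ascii")) + 1

print(expected_size, symbol_size)
```

Both come out at 108 bytes, so the symbol is the string data itself
rather than a pointer to it. LIBPQ_VERSION_NUM is 4 bytes, a plain int.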

I don't propose these as a replacement for information functions or
macros; I'm suggesting we add them as an aid to tooling and debugging.
I've faced a mystery libpq often enough, and it's not always practical
in a given target environment to just compile a tool to print the
version.

In addition to easy binary identification, having symbolic references
to the version info is useful for dynamic tracing tools like perf and
systemtap - they cannot execute functions directly in the target
address space, but they can read data symbols. I actually want to
expose matching symbols in postgres itself, for the use of dynamic
tracing utilities, so they can autodetect the target postgres at
runtime even without -ggdb3 level debuginfo with macros, and correctly
adapt to version specifics of the target postgres.
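
To illustrate the "adapt to version specifics" part: once a tracing
script has read LIBPQ_VERSION_NUM, it can branch on the decoded
version. A minimal Python sketch, assuming the symbol uses the same
encoding as PG_VERSION_NUM (the 140000 shown below for 14devel is
consistent with that). The function name is hypothetical and not part
of the patch.

```python
def decode_version_num(version_num):
    """Split a PG_VERSION_NUM-style integer into its components.

    PostgreSQL 10 and later encode major * 10000 + minor; earlier
    releases use a three-part scheme, e.g. 90602 for 9.6.2.
    """
    if version_num >= 100000:
        return (version_num // 10000, version_num % 10000)
    return (
        version_num // 10000,
        (version_num // 100) % 100,
        version_num % 100,
    )


print(decode_version_num(140000))
print(decode_version_num(90602))
```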

In terms of standard tooling here are some different ways you can get
this information symbolically.

$ LIBPQ=/path/to/libpq.so

$ gdb -batch -ex 'p (int) LIBPQ_VERSION_NUM' \
      -ex 'p (const char *) LIBPQ_VERSION_STR' $LIBPQ
$1 = 140000
$2 = "PostgreSQL 14devel on x86_64-pc-linux-gnu, compiled by gcc (GCC) 10.2.1 20200723 (Red Hat 10.2.1-1), 64-bit"

$ perl getpqver.pl $LIBPQ
LIBPQ_VERSION_NUM=140000
LIBPQ_VERSION_STR=PostgreSQL 14devel on x86_64-pc-linux-gnu, compiled by gcc (GCC) 10.2.1 20200723 (Red Hat 10.2.1-1), 64-bit

I've attached getpqver.pl. It uses eu-nm from elfutils to get symbol
offset and length, which is pretty standard stuff. And it's quite
simple to adapt it to use legacy binutils "nm" by invoking

    nm --dynamic --defined-only -S $LIBPQ

and tweaking the reader.
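
As a rough idea of what such a reader could look like (a Python sketch,
not the attached script), assuming nm's default output format where -S
prints "address size type name" per line. The function names are
hypothetical, and the sample input is constructed from the eu-nm values
shown earlier rather than captured from a real nm run; its last line is
a made-up unsized entry.

```python
def parse_nm_line(line):
    """Parse one line of `nm -S` output in the default format.

    Returns (name, address, size), or None for lines that do not
    describe a sized symbol.
    """
    fields = line.split()
    if len(fields) != 4:
        return None  # unsized symbols print only "address type name"
    address, size, _symbol_type, name = fields
    try:
        return name, int(address, 16), int(size, 16)
    except ValueError:
        return None  # not a symbol line


def find_libpq_symbols(nm_output):
    """Map each LIBPQ_* symbol name to its (address, size) pair."""
    symbols = {}
    for line in nm_output.splitlines():
        parsed = parse_nm_line(line)
        if parsed and parsed[0].startswith("LIBPQ_"):
            name, address, size = parsed
            symbols[name] = (address, size)
    return symbols


# Illustrative input only; see the note above.
sample = """\
0000000000028640 00000000000000e4 R LIBPQ_CONFIGURE_ARGS
0000000000028620 0000000000000004 R LIBPQ_VERSION_NUM
0000000000028740 000000000000006c R LIBPQ_VERSION_STR
000000000000d1f0 T PQconnectdb
"""

for name, (address, size) in sorted(find_libpq_symbols(sample).items()):
    print(f"{name}: address=0x{address:x} size={size}")
```

Turning the symbol address into a file offset still needs the section
or program headers, which this sketch leaves out.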

If you really want something strings-able, I'm sure that's reasonably
feasible, but I don't think it's particularly unreasonable to expect
to be able to inspect the symbol table using appropriate platform
tools or a simple debugger command.

> Again, I'm not exactly excited about this.  I do not one bit like
> patches that assume that x64 linux is the universe, or at least
> all of it that need be catered to.  Reminds me of people who thought
> Windows was the universe, not too many years ago.

Yeah. I figured you'd say that, and don't disagree. It's why I split
this patch out - it's kind of a sacrificial patch.

I actually wrote this part first.

Then I wrote PQlibInfo() when I realised that there was no sensible
pre-existing way to get the information I wanted to dump from libpq at
the API level, and adapted the executable .so output to call it.

> I'd rather try to set this up so that some fairly standard tooling
> like "strings" + "grep" can be used to pull out the info.  Sure,
> it would be less convenient, but honestly how often is this really
> going to be necessary?


eu-readelf and objdump are pretty standard tooling. But I really don't
much care if the executable .so hack gets in, it's mostly a fun PoC.
If you can execute libpq then the dynamic linker must be able to load
it and resolve its symbols, in which case you can probably just as
easily do this:

    python -c "import sys, ctypes; ctypes.cdll.LoadLibrary(sys.argv[1]).PQlibInfoPrint()" \
        build/src/interfaces/libpq/libpq.so

or compile and run a trivial C one-liner.
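
Spelled out a little more, the same ctypes approach can read the
proposed data symbol directly and fall back to the existing
PQlibVersion() call when the library was built without the patch. This
is only a sketch: the helper names are made up, and LIBPQ_VERSION_NUM
is assumed to be a plain int, as the 4-byte size above suggests.

```python
import ctypes
import sys


def read_int_symbol(lib, name):
    """Return the int stored at data symbol `name`, or None if absent."""
    try:
        return ctypes.c_int.in_dll(lib, name).value
    except ValueError:
        return None  # ctypes raises ValueError for an undefined symbol


def libpq_version(path):
    """Load libpq from `path` and report its numeric version."""
    lib = ctypes.CDLL(path)
    # LIBPQ_VERSION_NUM exists only when patch 0002 is applied.
    from_symbol = read_int_symbol(lib, "LIBPQ_VERSION_NUM")
    if from_symbol is not None:
        return from_symbol
    # PQlibVersion() is the existing libpq function; it returns an int.
    lib.PQlibVersion.restype = ctypes.c_int
    lib.PQlibVersion.argtypes = []
    return lib.PQlibVersion()


if __name__ == "__main__" and len(sys.argv) > 1:
    print(libpq_version(sys.argv[1]))
```

Run it with the path to libpq.so as the only argument.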

As much as anything I thought it was a good way to stimulate
discussion and give you something easy to reject ;)

getpqver.pl
Description: Perl program
