On Tue, Oct 27, 2020 at 12:56 AM Tom Lane <t...@sss.pgh.pa.us> wrote:
>
> Alvaro Herrera <alvhe...@alvh.no-ip.org> writes:
> > On 2020-Oct-26, Craig Ringer wrote:
> >> also adds PQlibInfoPrint() which dumps PQlibInfo() keys/values to stdout.
>
> > Sounds useful. I'd have PQlibInfoPrint(FILE *) instead, so you can pass
> > stdout or whichever fd you want.
>
> +1. Are we concerned about translatability of these strings? I think
> I'd vote against, as it would complicate applications, but it's worth
> thinking about it now not later.

It's necessary not to translate the key names, they are identifiers not
descriptive text. I don't object to having translations too, but the
translation teams have quite enough to do already with user-facing text that
will get regularly seen. So while it'd be potentially interesting to expose
translated versions too, I'm not entirely convinced. It's a bit like
translating macro names. You could, but ... why?

> >> Patch 0002 exposes LIBPQ_VERSION_STR, LIBPQ_VERSION_NUM and
> >> LIBPQ_CONFIGURE_ARGS symbols in the dynamic symbol table. These can be
> >> accessed by a debugger even when the library cannot be loaded or executed,
> >> and unlike macros are available even in a stripped executable. So they can
> >> be used to identify a libpq binary found in the wild. Their storage is
> >> shared with PQlibInfo()'s static data, so they only cost three symbol table
> >> entries.
>
> > Interesting. Is this real-world useful?
>
> -1, I think this is making way too many assumptions about the content
> and format of a shlib.

I'm not sure I understand what assumptions you're concerned about or their
consequences. On any ELF it should be just fine, and Mach-O should be too. I
do need to check that MSVC generates direct symbols for WIN32 PE, not
indirect thunked data symbols.

It doesn't help that I failed to supply the final revision of this patch,
which does this:

-const char * const LIBPQ_VERSION_STR = PG_VERSION_STR;
+const char LIBPQ_VERSION_STR[] = PG_VERSION_STR;

-const char * const LIBPQ_CONFIGURE_ARGS = CONFIGURE_ARGS;
+const char LIBPQ_CONFIGURE_ARGS[] = CONFIGURE_ARGS;

...
to properly ensure the string symbols go into the read-only data section:

$ eu-nm --defined-only -D $LIBPQ | grep LIBPQ_
LIBPQ_CONFIGURE_ARGS |0000000000028640|GLOBAL|OBJECT |00000000000000e4| libpq-version.c:74|.rodata
LIBPQ_VERSION_NUM |0000000000028620|GLOBAL|OBJECT |0000000000000004| libpq-version.c:75|.rodata
LIBPQ_VERSION_STR |0000000000028740|GLOBAL|OBJECT |000000000000006c| libpq-version.c:73|.rodata

I don't propose these to replace information functions or macros, I'm
suggesting we add them as an aid to tooling and for debugging. I have had
quite enough times when I've faced a mystery libpq, and it's not always
practical in a given target environment to just compile a tool to print the
version.

In addition to easy binary identification, having symbolic references to the
version info is useful for dynamic tracing tools like perf and systemtap -
they cannot execute functions directly in the target address space, but they
can read data symbols. I actually want to expose matching symbols in postgres
itself, for the use of dynamic tracing utilities, so they can autodetect the
target postgres at runtime even without -ggdb3 level debuginfo with macros,
and correctly adapt to version specifics of the target postgres.

In terms of standard tooling here are some different ways you can get this
information symbolically.

$ LIBPQ=/path/to/libpq.so

$ gdb -batch -ex 'p (int) LIBPQ_VERSION_NUM' -ex 'p (const char *) LIBPQ_VERSION_STR' $LIBPQ
$1 = 140000
$2 = "PostgreSQL 14devel on x86_64-pc-linux-gnu, compiled by gcc (GCC) 10.2.1 20200723 (Red Hat 10.2.1-1), 64-bit"

$ perl getpqver.pl $LIBPQ
LIBPQ_VERSION_NUM=140000
LIBPQ_VERSION_STR=PostgreSQL 14devel on x86_64-pc-linux-gnu, compiled by gcc (GCC) 10.2.1 20200723 (Red Hat 10.2.1-1), 64-bit

I've attached getpqver.pl. It uses eu-nm from elfutils to get symbol offset
and length, which is pretty standard stuff.
And it's quite simple to adapt it to use legacy binutils "nm" by invoking

nm --dynamic --defined -S $LIBPQ

and tweaking the reader.

If you really want something strings-able, I'm sure that's reasonably
feasible, but I don't think it's particularly unreasonable to expect to be
able to inspect the symbol table using appropriate platform tools or a simple
debugger command.

> Again, I'm not exactly excited about this. I do not one bit like
> patches that assume that x64 linux is the universe, or at least
> all of it that need be catered to. Reminds me of people who thought
> Windows was the universe, not too many years ago.

Yeah. I figured you'd say that, and don't disagree. It's why I split this
patch out - it's kind of a sacrificial patch.

I actually wrote this part first. Then I wrote PQlibInfo() when I realised
that there was no sensible pre-existing way to get the information I wanted
to dump from libpq at the API level, and adapted the executable .so output to
call it.

> I'd rather try to set this up so that some fairly standard tooling
> like "strings" + "grep" can be used to pull out the info. Sure,
> it would be less convenient, but honestly how often is this really
> going to be necessary?

eu-readelf and objdump are pretty standard tooling. But I really don't much
care if the executable .so hack gets in, it's mostly a fun PoC. If you can
execute libpq then the dynamic linker must be able to load it and resolve its
symbols, in which case you can probably just as easily do this:

python -c "import sys, ctypes; ctypes.cdll.LoadLibrary(sys.argv[1]).PQlibInfoPrint()" build/src/interfaces/libpq/libpq.so

or compile and run a trivial C one-liner.

As much as anything I thought it was a good way to stimulate discussion and
give you something easy to reject ;)
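
PS: to make the binutils "nm" variant mentioned above a bit more concrete,
here is a rough Python sketch of the same idea. This is not the attached
getpqver.pl, only an illustration. The function names are made up for this
example. It assumes GNU nm's default BSD-style output with -S ("address size
type name"), spells the option out as --defined-only, assumes the symbol
address equals the file offset (commonly true for .rodata in a shared
library's first load segment, but not guaranteed), and assumes the library
has the same byte order as the host.

```python
import re
import subprocess
import sys

# GNU nm -S in BSD format prints: <address> <size> <type> <name>
NM_LINE = re.compile(r"^([0-9a-fA-F]+)\s+([0-9a-fA-F]+)\s+(\S)\s+(\S+)$")


def parse_nm_output(text, prefix="LIBPQ_"):
    """Return {name: (address, size)} for symbols whose name starts with prefix."""
    symbols = {}
    for line in text.splitlines():
        m = NM_LINE.match(line.strip())
        if not m:
            continue  # undefined symbols and other lines carry no size
        # Newer binutils may append "@@VERSION" to dynamic symbol names.
        name = m.group(4).split("@", 1)[0]
        if name.startswith(prefix):
            symbols[name] = (int(m.group(1), 16), int(m.group(2), 16))
    return symbols


def read_symbol(path, address, size):
    """Read size bytes from path, treating address as a file offset.

    ASSUMPTION: virtual address == file offset. A robust tool would
    translate the address through the ELF program headers instead.
    """
    with open(path, "rb") as f:
        f.seek(address)
        return f.read(size)


def main(path):
    out = subprocess.run(
        ["nm", "--dynamic", "--defined-only", "-S", path],
        check=True, capture_output=True, text=True,
    ).stdout
    for name, (addr, size) in sorted(parse_nm_output(out).items()):
        raw = read_symbol(path, addr, size)
        if name.endswith("_NUM"):
            # ASSUMPTION: target byte order matches the host's.
            value = int.from_bytes(raw, sys.byteorder)
        else:
            value = raw.rstrip(b"\0").decode("utf-8", "replace")
        print(f"{name}={value}")


if __name__ == "__main__":
    if len(sys.argv) == 2:
        main(sys.argv[1])
```

Run it with the path to the library as its only argument. It only finds
anything on a libpq built with the proposed LIBPQ_* data symbols.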
getpqver.pl
Description: Perl program