I wrote:
> So this is all pretty much of a mess.  If we annotate the elog functions
> differently from printf's annotation then we risk getting these complaints
> in elog.c, but if we don't do that then we can't really describe their
> semantics correctly.  We could possibly mark the replacement snprintf
> functions with gnu_printf, but that's a lie with respect to the very
> point at issue about %m.  Unless we were to teach snprintf.c about %m
> ... which probably wouldn't be hard, but I'm not sure I want to go there.
Actually ... the more I think about this, the less insane that idea seems.
Consider the following approach:

1. Teach src/port/snprintf.c about %m.  While I've not written a patch
for this, it looks pretty trivial.

2. Teach configure to test for %m and, if it's not there, use the
replacement snprintf.  (Note: we're already forcing snprintf replacement
in cross-compiles, so the added run-time test isn't losing anything.)

3. Get rid of elog.c's hand-made substitution of %m strings, and instead
just let it pass the correct errno value down.  (We'd likely need to do
some fooling in appendStringInfoVA and related functions to preserve
call-time errno, but that's not complicated, nor expensive.)

4. (Optional) Get rid of strerror(errno) calls in favor of %m, even in
frontend code.

Once we've done this, we have uniform printf semantics across all
platforms, which is kind of nice from a testing standpoint, as well as
being less of a cognitive load for developers.  And we can stick with the
existing approach of using the gnu_printf archetype across the board;
that's no longer a lie for the snprintf.c code.

One objection to this is the possible performance advantage of the native
printf functions over snprintf.c.
I did a bit of investigation of that using the attached testbed, and found
that the quality of implementation of the native functions seems pretty
variable:

RHEL6's glibc on x86_64 (this is just a comparison point, since we'd not
be replacing glibc's printf anyway):
snprintf time = 756.795 ms total, 0.000756795 ms per iteration
pg_snprintf time = 824.643 ms total, 0.000824643 ms per iteration

macOS High Sierra on x86_64:
snprintf time = 264.071 ms total, 0.000264071 ms per iteration
pg_snprintf time = 348.41 ms total, 0.00034841 ms per iteration

FreeBSD 11.0 on x86_64:
snprintf time = 628.873 ms total, 0.000628873 ms per iteration
pg_snprintf time = 606.684 ms total, 0.000606684 ms per iteration

OpenBSD 6.0 on x86_64 (same hardware as FreeBSD test):
snprintf time = 331.245 ms total, 0.000331245 ms per iteration
pg_snprintf time = 539.849 ms total, 0.000539849 ms per iteration

NetBSD 8.99 on armv6:
snprintf time = 2423.39 ms total, 0.00242339 ms per iteration
pg_snprintf time = 3769.16 ms total, 0.00376916 ms per iteration

So we would be taking a hit on most platforms, but I've not really seen
sprintf as a major component of very many profiles.  Moreover, at least
for the elog/ereport use-case, we'd be buying back some nontrivial part of
that hit by getting rid of expand_fmt_string().  Also worth noting is that
we've never made any effort at all to micro-optimize snprintf.c, so maybe
there's some gold to be mined there to reduce the penalty.

A different objection, possibly more serious than the performance one, is
that if we get in the habit of using %m in frontend code then at some
point we'd inevitably back-patch such a usage.  (Worse, it'd pass testing
on glibc platforms, only to fail elsewhere.)  I don't see a bulletproof
answer to that except to back-patch this set of changes, which might be a
bridge too far.

Aside from the back-patching angle, though, this seems pretty promising.
Thoughts?

			regards, tom lane

PS: here's the testbed I used for the above numbers.
Feel free to try other platforms or other test-case formats.  Compile this
with something like

gcc -Wall -O2 -I pgsql/src/include -I pgsql/src/port timeprintf.c

(point the -I switches into a configured PG source tree); you might need
to add "-lrt" or some such to get clock_gettime().  Then run with
"./a.out 1000000" or so.
#include "postgres_fe.h"

#include "portability/instr_time.h"

#include "snprintf.c"

int
main(int argc, char **argv)
{
	int			count = 0;
	char		buffer[1000];
	instr_time	start;
	instr_time	stop;
	double		elapsed;
	int			i;

	if (argc > 1)
		count = atoi(argv[1]);
	if (count <= 0)
		count = 1000000;

	INSTR_TIME_SET_CURRENT(start);

	for (i = 0; i < count; i++)
	{
		snprintf(buffer, sizeof(buffer), "%d %g %s",
				 42, 42.2,
				 "01234567890012345678900123456789001234567890012345678900123456789001234567890012345678900123456789001234567890");
	}

	INSTR_TIME_SET_CURRENT(stop);
	INSTR_TIME_SUBTRACT(stop, start);
	elapsed = INSTR_TIME_GET_MILLISEC(stop);

	printf("snprintf time = %g ms total, %g ms per iteration\n",
		   elapsed, elapsed / count);

	INSTR_TIME_SET_CURRENT(start);

	for (i = 0; i < count; i++)
	{
		pg_snprintf(buffer, sizeof(buffer), "%d %g %s",
					42, 42.2,
					"01234567890012345678900123456789001234567890012345678900123456789001234567890012345678900123456789001234567890");
	}

	INSTR_TIME_SET_CURRENT(stop);
	INSTR_TIME_SUBTRACT(stop, start);
	elapsed = INSTR_TIME_GET_MILLISEC(stop);

	printf("pg_snprintf time = %g ms total, %g ms per iteration\n",
		   elapsed, elapsed / count);

	return 0;
}