On Sat, May 08, 2010 at 09:24:45PM -0400, Tom Lane wrote: > hgonza...@gmail.com writes: > > http://sources.redhat.com/bugzilla/show_bug.cgi?id=649 > > > The last explains why they do not consider it a bug: > > > ISO C99 requires for %.*s to only write complete characters that fit below > > the > > precision number of bytes. If you are using say UTF-8 locale, but ISO-8859-1 > > characters as shown in the input file you provided, some of the strings are > > not valid UTF-8 strings, therefore sprintf fails with -1 because of the > > encoding error. That's not a bug in glibc. > > Yeah, that was about the position I thought they'd take.
GNU libc eventually revisited that conclusion and fixed the bug through commit 715a900c9085907fa749589bf738b192b1a2bda5. RHEL 7.1 is fixed, but RHEL 6.6 and RHEL 5.11 are still affected; the bug will be relevant for another 8+ years. > So the bottom line here is that we're best off to avoid %.*s because > it may fail if the string contains data that isn't validly encoded > according to libc's idea of the prevailing encoding. Yep. Immediate precisions like %.10s trigger the bug as effectively as %.*s, so tarCreateHeader() [_tarWriteHeader() in 9.2 and earlier] is also affected. Switching to strlcpy(), as attached, fixes the bug while simplifying the code. The bug symptom is error 'pg_basebackup: unrecognized link indicator "0"' when the name of a file in the data directory is not a valid multibyte string. Commit 6dd9584 introduced a new use of .*s, to pg_upgrade. It works reliably for now, because it always runs in the C locale. pg_upgrade never calls set_pglocale_pgservice() or otherwise sets its permanent locale. It would be natural for us to fix that someday, at which point non-ASCII database names would perturb this status output. It would be good to purge the code of precisions on "s" conversion specifiers, then Assert(!pointflag) in fmtstr() to catch new introductions. I won't plan to do it myself, but it would be a nice little defensive change.
diff --git a/src/bin/pg_basebackup/t/010_pg_basebackup.pl b/src/bin/pg_basebackup/t/010_pg_basebackup.pl index 0e4bd12..f46c5fc 100644 --- a/src/bin/pg_basebackup/t/010_pg_basebackup.pl +++ b/src/bin/pg_basebackup/t/010_pg_basebackup.pl @@ -17,6 +17,12 @@ command_fails( [ 'pg_basebackup', '-D', "$tempdir/backup" ], 'pg_basebackup fails because of hba'); +# Some Windows ANSI code pages may reject this filename, in which case we +# quietly proceed without this bit of test coverage. +open BADCHARS, ">>$tempdir/pgdata/FOO\xe0\xe0\xe0BAR"; +print BADCHARS "test backup of file with non-UTF8 name\n"; +close BADCHARS; + open HBA, ">>$tempdir/pgdata/pg_hba.conf"; print HBA "local replication all trust\n"; print HBA "host replication all 127.0.0.1/32 trust\n"; diff --git a/src/port/tar.c b/src/port/tar.c index 4721df3..72fd4e1 100644 --- a/src/port/tar.c +++ b/src/port/tar.c @@ -68,7 +68,7 @@ tarCreateHeader(char *h, const char *filename, const char *linktarget, memset(h, 0, 512); /* assume tar header size */ /* Name 100 */ - sprintf(&h[0], "%.99s", filename); + strlcpy(&h[0], filename, 100); if (linktarget != NULL || S_ISDIR(mode)) { /* @@ -110,7 +110,7 @@ tarCreateHeader(char *h, const char *filename, const char *linktarget, /* Type - Symbolic link */ sprintf(&h[156], "2"); /* Link Name 100 */ - sprintf(&h[157], "%.99s", linktarget); + strlcpy(&h[157], linktarget, 100); } else if (S_ISDIR(mode)) /* Type - directory */ @@ -127,11 +127,11 @@ tarCreateHeader(char *h, const char *filename, const char *linktarget, /* User 32 */ /* XXX: Do we need to care about setting correct username? */ - sprintf(&h[265], "%.31s", "postgres"); + strlcpy(&h[265], "postgres", 32); /* Group 32 */ /* XXX: Do we need to care about setting correct group name? */ - sprintf(&h[297], "%.31s", "postgres"); + strlcpy(&h[297], "postgres", 32); /* Major Dev 8 */ sprintf(&h[329], "%07o ", 0);
-- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers