On Wed, Dec 25, 2024 at 06:55:51PM +0300, Vladlen Popolitov wrote:
> This UTF-8 feature leads to annoying test failure
> (010_dump_connstr).

It's not merely an annoying test failure. On Windows configured with a
multibyte system locale, anyone with CREATEDB privilege can name a database
such that pg_dumpall can't restore it.

> Option 1
> Skip this test for Windows in UTF-8 mode.
>
> Option 2.
> Exclude all 8-bit characters for Windows in UTF-8 mode. Now only " excluded
> for Windows.
>
> Option 3.
> Test with some limited list of correct UTF-8 symbols - just in case, that
> they also works.
> It could be 64 2-bytes UTF-8 characters.

Those are ways to suppress the test failure. But we have that test because
pg_dumpall and pg_upgrade rely on the ability to send all possible rolname and
datname on the command line. In a cluster that uses a single-byte encoding,
that requires the ability to pass every sequence of bytes [0x01,0xFF]. It's
not much of a win to make the test stop failing if real use of pg_dump and
pg_upgrade would still fail.

Message postgr.es/m/20241215023221.4d.nmi...@google.com (original post of this
thread) gave PGSERVICEFILE as a way to make the real usage work. That works
by removing the requirement to pass arbitrary bytes in command lines. The
command line would contain an ASCII-only service name, and the arbitrary bytes
would appear inside the service file.

Another way might be to create the objects with placeholder ASCII names. As
the last step of the restore, rename the placeholder ASCII names to the source
cluster's names.

Once we can assume Windows 11 or later, another way is
<activeCodePage>en-US</activeCodePage> in a fusion manifest, per
https://learn.microsoft.com/en-us/windows/win32/sbscs/application-manifests#activeCodePage.
Any single-byte encoding choice might suffice. That makes PostgreSQL
independent of the system locale.
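
To make the PGSERVICEFILE idea concrete, here is a rough Python sketch. It is
an illustration only, not what pg_dumpall or pg_upgrade do today. The helper
names (write_service_file, build_pg_dump_argv) and the service name
"restore_target" are hypothetical. The sketch refuses names containing
newlines or NUL, and I have not verified that libpq's service-file parser
preserves every remaining byte sequence.

```python
import os
import tempfile


def write_service_file(path, service, dbname):
    """Write a one-entry libpq service file; dbname is raw bytes."""
    section = service.encode("ascii")  # raises if the service name is not ASCII
    if b"\n" in dbname or b"\r" in dbname or b"\0" in dbname:
        # Assumption: the line-oriented file format cannot carry these bytes.
        raise ValueError("sketch does not handle newlines or NUL in names")
    with open(path, "wb") as f:
        f.write(b"[" + section + b"]\n")
        f.write(b"dbname=" + dbname + b"\n")


def build_pg_dump_argv(service):
    """Return an ASCII-only argv; the real name lives in the service file."""
    return ["pg_dump", "--dbname=service=" + service]


if __name__ == "__main__":
    workdir = tempfile.mkdtemp()
    service_path = os.path.join(workdir, "pg_service.conf")
    # A database name with high-bit bytes that may not survive a Windows
    # command line under a multibyte system locale.
    write_service_file(service_path, "restore_target", b"db\xe9\xff")
    env = dict(os.environ, PGSERVICEFILE=service_path)
    argv = build_pg_dump_argv("restore_target")
    # A caller would now run argv with env, e.g. via subprocess.run.
    print(argv, env["PGSERVICEFILE"])
```

The point of the split is that argv stays pure ASCII regardless of the
database name, so the command line no longer depends on the system code page;
only the file contents carry the arbitrary bytes.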