Thomas Munro <thomas.mu...@gmail.com> writes:
> Problem #1: You can have two databases with different encodings, and
> they both pretend that pg_database, pg_authid, pg_db_role_setting etc
> are in the local database encoding. That doesn't work too well:
> non-ASCII text can be reinterpreted in the wrong encoding.

> There's no problem if you only use one encoding everywhere (probably
> UTF8). There's also no problem if you use multiple database
> encodings, but put only ASCII in the shared catalogues (because ASCII
> is a subset of every supported server encoding). This patch is about
> formalising and enforcing those two working arrangements, hopefully
> invisibly to most users. There's still an escape hatch mode if you
> need it, e.g. for a non-conforming pg_upgrade'd system.

Over in the discussion of bug #18735, I've come to the realization
that these problems apply equally to the filesystem path names that
the server deals with: not only the data directory path, but the
path to the installation files [1]. Can we apply the same sort of
restrictions to those? I'm envisioning that initdb would check
either encoding-validity or all-ASCII-ness of those path names
depending on which mode it's setting the server up in.

> The patch invents a new setting CLUSTER CATALOG ENCODING, which can be
> inspected with SHOW and changed with ALTER SYSTEM.

Changing the catalog encoding would also have to re-verify the
suitability of the paths. Of course this isn't 100% bulletproof
since someone could rename those directories later. But I think
that's in "if you break it you get to keep both pieces" territory.

			regards, tom lane

[1] https://www.postgresql.org/message-id/2840430.1733510664%40sss.pgh.pa.us
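
To make the proposed path check concrete, here is a minimal sketch in Python of what "encoding-validity or all-ASCII-ness" could mean for a path name. It is not initdb code and not part of the patch: the function name path_is_suitable, the mode strings "ascii" and "encoding", and the use of Python codec names in place of PostgreSQL encoding names are all assumptions made for illustration.

```python
def path_is_suitable(path: bytes, mode: str, encoding: str = "utf-8") -> bool:
    """Hypothetical check of a raw filesystem path against a cluster mode.

    mode == "ascii":    every byte of the path must be 7-bit ASCII.
    mode == "encoding": the path must decode cleanly in the given encoding
                        (a Python codec name, standing in for the cluster's
                        catalog encoding).
    """
    if mode == "ascii":
        # ASCII is a subset of every supported server encoding, so an
        # all-ASCII path is safe regardless of database encoding.
        return path.isascii()
    if mode == "encoding":
        try:
            path.decode(encoding, errors="strict")
        except UnicodeDecodeError:
            return False
        return True
    raise ValueError(f"unknown mode: {mode!r}")


# Example: the same directory name, stored as UTF-8 and as Latin-1 bytes.
utf8_path = "/srv/données".encode("utf-8")
latin1_path = "/srv/données".encode("latin-1")

print(path_is_suitable(b"/usr/local/pgsql/data", "ascii"))
print(path_is_suitable(utf8_path, "ascii"))
print(path_is_suitable(utf8_path, "encoding", "utf-8"))
print(path_is_suitable(latin1_path, "encoding", "utf-8"))
```

The same function would need to be re-run on the data directory and installation paths whenever the catalog encoding is changed, which is the re-verification step mentioned above. As noted, it cannot guard against directories being renamed afterwards.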