Re: making the backend's json parser work in frontend code

David Steele Fri, 24 Jan 2020 08:37:28 -0800

On 1/24/20 9:27 AM, Tom Lane wrote:

Peter Eisentraut <[email protected]> writes:

On 2020-01-23 18:04, Robert Haas wrote:

Now, you might say "well, why don't we just do an encoding
conversion?", but we can't. When the filesystem tells us what the file
names are, it does not tell us what encoding the person who created
those files had in mind. We don't know that they had *any* encoding in
mind. IIUC, a file in the data directory can have a name that consists
of any sequence of bytes whatsoever, so long as it doesn't contain
prohibited characters like a path separator or \0 byte. But only some
of those possible octet sequences can be stored in a manifest that has
to be valid UTF-8.

I think it wouldn't be unreasonable to require that file names in the
database directory be consistently encoded (as defined by pg_control,
probably).  After all, this information is sometimes also shown in
system views, so it's already difficult to process total junk.  In
practice, this shouldn't be an onerous requirement.


I don't entirely follow why we're discussing this at all, if the
requirement is backing up a PG data directory.  There are not, and
are never likely to be, any legitimate files with non-ASCII names
in that context.  Why can't we just skip any such files?

It's not uncommon in my experience for users to drop odd files into PGDATA (usually versioned copies of postgresql.conf, etc.), but I agree that it should be discouraged. Even so, I don't recall ever seeing any non-ASCII filenames.

Skipping files sounds scary, I'd prefer an error or a warning (and then base64 encode the filename).
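
To make the warn-and-encode idea concrete, here is a rough sketch in Python rather than anything from the actual patch. The function name manifest_name and the "path" / "encoded-path" keys are invented for illustration; the only assumption is that file names arrive as raw bytes, as the filesystem reports them.

```python
import base64
import warnings


def manifest_name(raw_name: bytes) -> dict:
    """Build a (hypothetical) manifest entry for one file name.

    ASCII names are stored as-is. Anything else triggers a warning and is
    stored base64-encoded, so the manifest itself stays valid UTF-8 and no
    file is silently skipped.
    """
    try:
        return {"path": raw_name.decode("ascii")}
    except UnicodeDecodeError:
        warnings.warn(f"non-ASCII file name in data directory: {raw_name!r}")
        # base64 output is pure ASCII, whatever bytes the name contains.
        return {"encoded-path": base64.b64encode(raw_name).decode("ascii")}
```

Passing a bytes path to os.listdir() (for example os.listdir(os.fsencode(pgdata))) yields names as bytes, which is the form this sketch expects. Decoding the "encoded-path" value with base64.b64decode() recovers the original byte sequence exactly.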


Regards,
--
-David
[email protected]
