On 2020-Jan-24, Tom Lane wrote:

> Alvaro Herrera <alvhe...@2ndquadrant.com> writes:
> > On 2020-Jan-24, Mark Dilger wrote:
> >> I would expect, therefore, that we only back up files which match our
> >> expected file name pattern and ignore (perhaps with a warning)
> >> everything else.
>
> > That risks missing files placed in the datadir by extensions;
>
> I agree that assuming we know everything that will appear in the
> data directory is a pretty unsafe assumption.  But no rational
> extension is going to use a non-ASCII file name, either, if only
> because it can't predict what the filesystem encoding will be.

I see two different arguments.  One is about the file name encoding.
Files with non-ASCII names are rare and would be placed by the user
manually.  We can fix that by encoding the name.  We can have a debug
mode that encodes all names that way, just to ensure the tools are
prepared for it.

The other is Mark's point about "expected file pattern", which seems a
slippery slope to me.  If the pattern is /^[a-zA-Z0-9_.]*$/ then I'm
okay with it (maybe add a few other punctuation chars); as you say, no
sane extension would use names much weirder than that.  But we should
not be stricter, such as counting the number of periods/underscores
allowed or where alpha chars are expected (except maybe disallowing a
period at the start of the filename), or anything too specific like that.

-- 
Álvaro Herrera                https://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
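
To make the two ideas above concrete, here is a minimal sketch in Python. It is an illustration only, not a proposed patch: the function names, the hex encoding, and the (encoded, text) pair are all assumptions made for this example, since the message does not say how a name would be encoded or recorded. The pattern is the lenient one quoted above, with the optional "no period at start" rule applied.

```python
import re
from typing import Tuple

# Lenient pattern from the message: letters, digits, underscore, period.
PLAIN_NAME = re.compile(r"[a-zA-Z0-9_.]*")


def is_plain_name(name: str) -> bool:
    """True if the name fits the lenient pattern (hypothetical helper).

    Assumption: empty names and names starting with a period are
    rejected, following the "except maybe" remark in the message.
    """
    if not name or name.startswith("."):
        return False
    return PLAIN_NAME.fullmatch(name) is not None


def manifest_entry(raw: bytes, encode_all: bool = False) -> Tuple[bool, str]:
    """Return (encoded, text) for a raw file name (hypothetical format).

    Plain names are stored as-is. Anything else is hex-encoded, so the
    result does not depend on the filesystem encoding. encode_all=True
    mimics the suggested debug mode that encodes every name.
    """
    try:
        text = raw.decode("ascii")
    except UnicodeDecodeError:
        text = None
    if text is not None and not encode_all and is_plain_name(text):
        return (False, text)
    return (True, raw.hex())


def decode_entry(encoded: bool, text: str) -> bytes:
    """Recover the raw file name bytes from a manifest entry."""
    return bytes.fromhex(text) if encoded else text.encode("ascii")
```

Working on raw bytes rather than decoded strings sidesteps the question of what the filesystem encoding is, and running with encode_all=True exercises the decode path for every name, which is the point of the debug mode.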