> On Oct 13, 2021, at 8:43 AM, Robert Haas <robertmh...@gmail.com> wrote:
> 
> On Wed, Oct 13, 2021 at 10:40 AM Mark Dilger
> <mark.dil...@enterprisedb.com> wrote:
>> 3a is a bit strange, when considered in the context of patterns.  If db1, 
>> db2, and db3 all exist and each have a table foo.bar, and psql is connected 
>> to db1, how should the command \d db?.foo.bar behave?  We have no problem 
>> with db1.foo.bar, but we do have problems with the other two.  If the answer 
>> is to complain about the databases that are unconnected, consider what 
>> happens if the user writes this in a script when only db1 exists, and later 
>> the script stops working because somebody created database db2.  Maybe 
>> that's not completely horrible, but surely it is less than ideal.
>> 
>> 3b is what pg_amcheck does.  It accepts database.schema.relname, and it will 
>> complain if no matching database/schema/relation can be found (unless 
>> --no-strict-names was given.)
> 
> Well, like I said, we can't treat a part that's purportedly a DB name
> as a pattern, so when connected to db1, I presume the command \d
> db?.foo.bar would have to behave just like \d
> dskjlglsghdksgdjkshg.foo.bar. I suppose technically I'm wrong: db?
> could be matched against the list of database names as a pattern, and
> then we could complain only if it doesn't match exactly and only the
> current DB. But I don't like adding a bunch of extra code to
> accomplish nothing useful, so if we're going to match it all I think
> it should just strcmp().
> 
> But I'm still not sure what the best thing to do overall is here.

The issue of name parsing impacts pg_dump and pg_dumpall, also.  Consider what 
happens with:

pg_dump -t production.critical.secrets > secrets.dump
dropdb production

In v13, if your default database is "testing", and database "testing" has the 
same schemas and tables (but not data) as production, you are unhappy.  You 
just dumped a copy of your test data and blew away the production data.

You could end up unhappy in v14, if database "testing" has a schema named 
"production" and a table that matches the pattern /^critical.secrets$/, but 
otherwise, you'll get an error from pg_dump, "pg_dump: error: no matching 
tables were found".  Neither behavior seems correct.

The function where the processing occurs is processSQLNamePattern, which is 
called by pg_dump, pg_dumpall, and psql.  All three callers expect 
processSQLNamePattern to append where-clauses to a buffer, not to execute any 
sql of its own.  I propose that processSQLNamePattern return an error code if 
the pattern contains more than three parts, but otherwise insert the database 
portion into the buffer as a "pg_catalog.current_database() 
OPERATOR(pg_catalog.=) <database>", where <database> is a properly escaped 
representation of the database portion.  Maybe someday we can change that to 
OPERATOR(pg_catalog.~), but for now we lack the sufficient logic for handling 
multiple matching database names.  (The situation is different for pg_dumpall, 
as it's using the normal logic for matching a relation name, not for matching a 
database, and we'd still be fine matching that against a pattern.)

For psql and pg_dump, I'm tempted to restrict the database portion (if not 
quoted) to neither contain shell glob characters nor POSIX regex characters, 
and return an error code if any are found, so that the clients can raise an 
appropriate error to the user.

In psql, this proposal would result in no tables matching \d 
wrongdb.schema.table, which would differ from v13's behavior.  You wouldn't get 
an error about having specified the wrong database.  You'd just get no matching 
relations.  \d ??db??.schema.table would complain about the db portion being a 
pattern.  \d "??db??".schema.table would work, assuming you're connected to a 
database literally named ??db??

In pg_dumpall, --exclude-database=more.than.one.part would give an error about 
too many dotted parts rather than simply trying to exclude the last "part" and 
silently ignoring the prefix, which I think is what v13's pg_dumpall would do.  
--exclude-database=db?? would work to exclude four character database names 
beginning in "db".

In pg_dump, the -t wrongdb.schema.table would match nothing and give the 
familiar error "pg_dump: error: no matching tables were found".  pg_dump -t 
too.many.dotted.names would give a different error about too many parts.  
pg_dump -t db??.foo.bar would give an error about the database needing to be a 
literal name rather than a pattern.

I don't like your proposal to use a strcmp() rather than a pg_catalog.= match, 
because it diverges from how the rest of the pattern is treated, including in 
how encoding settings might interact with the name, needing to be executed on 
the client side rather than in the server where the rest of the name resolution 
is happening. 

Does this sound like a workable proposal?

—
Mark Dilger
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company





Reply via email to