> On Oct 13, 2021, at 8:43 AM, Robert Haas <robertmh...@gmail.com> wrote:
>
> On Wed, Oct 13, 2021 at 10:40 AM Mark Dilger
> <mark.dil...@enterprisedb.com> wrote:
>> 3a is a bit strange, when considered in the context of patterns. If db1,
>> db2, and db3 all exist and each have a table foo.bar, and psql is connected
>> to db1, how should the command \d db?.foo.bar behave? We have no problem
>> with db1.foo.bar, but we do have problems with the other two. If the answer
>> is to complain about the databases that are unconnected, consider what
>> happens if the user writes this in a script when only db1 exists, and later
>> the script stops working because somebody created database db2. Maybe
>> that's not completely horrible, but surely it is less than ideal.
>>
>> 3b is what pg_amcheck does. It accepts database.schema.relname, and it will
>> complain if no matching database/schema/relation can be found (unless
>> --no-strict-names was given.)
>
> Well, like I said, we can't treat a part that's purportedly a DB name
> as a pattern, so when connected to db1, I presume the command \d
> db?.foo.bar would have to behave just like \d
> dskjlglsghdksgdjkshg.foo.bar. I suppose technically I'm wrong: db?
> could be matched against the list of database names as a pattern, and
> then we could complain only if it doesn't match exactly and only the
> current DB. But I don't like adding a bunch of extra code to
> accomplish nothing useful, so if we're going to match it all I think
> it should just strcmp().
>
> But I'm still not sure what the best thing to do overall is here.
The issue of name parsing impacts pg_dump and pg_dumpall, also. Consider what
happens with:
pg_dump -t production.critical.secrets > secrets.dump
dropdb production
In v13, if your default database is "testing", and database "testing" has the
same schemas and tables (but not data) as production, you are unhappy. You
just dumped a copy of your test data and blew away the production data.
You could end up unhappy in v14, if database "testing" has a schema named
"production" and a table that matches the pattern /^critical.secrets$/, but
otherwise, you'll get an error from pg_dump, "pg_dump: error: no matching
tables were found". Neither behavior seems correct.
The function where the processing occurs is processSQLNamePattern, which is
called by pg_dump, pg_dumpall, and psql. All three callers expect
processSQLNamePattern to append where-clauses to a buffer, not to execute any
sql of its own. I propose that processSQLNamePattern return an error code if
the pattern contains more than three parts, but otherwise insert the database
portion into the buffer as a "pg_catalog.current_database()
OPERATOR(pg_catalog.=) <database>", where <database> is a properly escaped
representation of the database portion. Maybe someday we can change that to
OPERATOR(pg_catalog.~), but for now we lack the sufficient logic for handling
multiple matching database names. (The situation is different for pg_dumpall,
as it's using the normal logic for matching a relation name, not for matching a
database, and we'd still be fine matching that against a pattern.)
For psql and pg_dump, I'm tempted to restrict the database portion (if not
quoted) to neither contain shell glob characters nor POSIX regex characters,
and return an error code if any are found, so that the clients can raise an
appropriate error to the user.
In psql, this proposal would result in no tables matching \d
wrongdb.schema.table, which would differ from v13's behavior. You wouldn't get
an error about having specified the wrong database. You'd just get no matching
relations. \d ??db??.schema.table would complain about the db portion being a
pattern. \d "??db??".schema.table would work, assuming you're connected to a
database literally named ??db??
In pg_dumpall, --exclude-database=more.than.one.part would give an error about
too many dotted parts rather than simply trying to exclude the last "part" and
silently ignoring the prefix, which I think is what v13's pg_dumpall would do.
--exclude-database=db?? would work to exclude four character database names
beginning in "db".
In pg_dump, the -t wrongdb.schema.table would match nothing and give the
familiar error "pg_dump: error: no matching tables were found". pg_dump -t
too.many.dotted.names would give a different error about too many parts.
pg_dump -t db??.foo.bar would give an error about the database needing to be a
literal name rather than a pattern.
I don't like your proposal to use a strcmp() rather than a pg_catalog.= match,
because it diverges from how the rest of the pattern is treated, including in
how encoding settings might interact with the name, needing to be executed on
the client side rather than in the server where the rest of the name resolution
is happening.
Does this sound like a workable proposal?
—
Mark Dilger
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company