On 3/20/23 10:43, Tom Lane wrote:
> I'd be more willing to consider the proposed patch if it weren't such
> a hack --- as you say, it doesn't fix the problem when the table has
> policies, so it's hardly a general-purpose solution. I fear that it's
> also fairly expensive: adding sub-selects to the query we must do
> before we can lock any tables is not appetizing, because making that
> window wider adds to the risk of deadlocks, dump failures, etc.
(moving to -hackers and July CF)
= Recap for Potential Reviewers =
The timescaledb extension has an internal table that's owned by the
superuser. It's not dumpable, and other users can only access its
contents through a filtered view. For our cloud deployments, we
additionally have that common trope where the most privileged users
aren't actually superusers, but we still want them to be able to perform
a subset of maintenance tasks, including pg_dumping their data.
When cloud users try to dump that data, pg_dump sees that this internal
table is an extension member and plans to dump ACLs, security labels,
and RLS policies for it. (This behavior cannot be overridden with
--exclude-table. pg_dump ignores that flag for extension members.)
Dumping policies requires the use of pg_get_expr() on the backend; this,
in turn, requires a lock on the table with ACCESS SHARE.
So pg_dump tries to lock a table, with no policies, that it's not going
to dump the schema or data for anyway, and it fails because our users
don't have (and shouldn't need) SELECT access to it. For an example of
this in action, I've attached a test case in v2-0001.
= Proposal =
Since this is affecting users on released Postgres versions, my end goal
is to find a fix that's backportable.
This situation looks very similar to [1], where non-superusers couldn't
perform a dump because we were eagerly grabbing table locks to read
(non-existent) ACLs. But that was solved with the realization that ACLs
don't need locks anyway, which is unfortunately not applicable to policies.
My initial patch to -bugs was a riff on a related performance fix [2],
which figured out which tables had interesting ACLs and skipped that
part if nothing was found. I added the same kind of subselect for RLS
policies as well, but that had nasty corner cases where it would perform
terribly, as Tom alluded to above. (In a cluster of 200k tables, where
one single table had 10M policies, the query ground to a halt.)
So v2-0002 is instead inspired by Tom's rewrite of that ACL dump logic
[3]. It scans pg_policy separately, stores the tables it finds into the
catalog map on the client side, and then again skips the policy dump
(and therefore the lock) if no policies exist. The performance hit now
scales with the size of pg_policy alone.
This is a bit more invasive than the subselect, but hopefully still
straightforward enough to be applicable to the back branches' old
catalog map strategy. It's still not a general-purpose fix, as Tom
pointed out above, but that was true of the discussion in [1] as well,
so I'm optimistic.
WDYT?
--Jacob
[1]
https://postgr.es/m/CAGPqQf3Uzo-yU1suYyoZR83h6QTxXxkGTtEyeMV7EAVBqn%3DPcQ%40mail.gmail.com
[2] https://git.postgresql.org/cgit/postgresql.git/commit/?id=5d589993
[3] https://git.postgresql.org/cgit/postgresql.git/commit/?id=0c9d8442
From 61a06625a8e6f4937e6311be32edb677d37b6a33 Mon Sep 17 00:00:00 2001
From: Jacob Champion <jchamp...@timescale.com>
Date: Thu, 22 Jun 2023 16:21:41 -0700
Subject: [PATCH v2 1/2] Add failing test for undumped extension table
Currently, SELECT permission is required for extension tables even
if they're internal (i.e. undumpable) and have no RLS policies. Add a
failing test for this situation.
---
src/test/modules/test_pg_dump/t/001_base.pl | 27 +++++++++++++++++++
.../test_pg_dump/test_pg_dump--1.0.sql | 2 ++
2 files changed, 29 insertions(+)
diff --git a/src/test/modules/test_pg_dump/t/001_base.pl b/src/test/modules/test_pg_dump/t/001_base.pl
index d00c3544e9..68a767d2f5 100644
--- a/src/test/modules/test_pg_dump/t/001_base.pl
+++ b/src/test/modules/test_pg_dump/t/001_base.pl
@@ -175,6 +175,19 @@ my %pgdump_runs = (
'postgres',
],
},
+
+ # regress_dump_login_role shouldn't need SELECT rights on internal
+ # (undumped) extension tables
+ privileged_internals => {
+ dump_cmd => [
+ 'pg_dump', '--no-sync', "--file=$tempdir/privileged_internals.sql",
+ # these two tables are irrelevant to the test case
+ '--exclude-table=regress_pg_dump_schema.external_tab',
+ '--exclude-table=regress_pg_dump_schema.extdependtab',
+ '--username=regress_dump_login_role', 'postgres',
+ ],
+ },
+
schema_only => {
dump_cmd => [
'pg_dump', '--no-sync', "--file=$tempdir/schema_only.sql",
@@ -284,6 +297,7 @@ my %full_runs = (
exclude_table => 1,
no_privs => 1,
no_owner => 1,
+ privileged_internals => 1,
with_extension => 1,
without_extension => 1);
@@ -321,6 +335,16 @@ my %tests = (
like => { pg_dumpall_globals => 1, },
},
+ 'CREATE ROLE regress_dump_login_role' => {
+ create_order => 1,
+ create_sql => 'CREATE ROLE regress_dump_login_role LOGIN;',
+ regexp => qr/^
+ \QCREATE ROLE regress_dump_login_role;\E
+ \n\QALTER ROLE regress_dump_login_role WITH \E.*\Q LOGIN \E.*;
+ \n/xm,
+ like => { pg_dumpall_globals => 1, },
+ },
+
'GRANT ALTER SYSTEM ON PARAMETER full_page_writes TO regress_dump_test_role'
=> {
create_order => 2,
@@ -704,6 +728,7 @@ my %tests = (
data_only => 1,
extension_schema => 1,
pg_dumpall_globals => 1,
+ privileged_internals => 1,
section_data => 1,
section_pre_data => 1,
# Excludes this schema as extension is not listed.
@@ -720,6 +745,7 @@ my %tests = (
data_only => 1,
extension_schema => 1,
pg_dumpall_globals => 1,
+ privileged_internals => 1,
section_data => 1,
section_pre_data => 1,
# Excludes this schema as extension is not listed.
@@ -743,6 +769,7 @@ my %tests = (
# Excludes the extension and keeps the schema's data.
without_extension_internal_schema => 1,
},
+ unlike => { privileged_internals => 1 },
},);
#########################################
diff --git a/src/test/modules/test_pg_dump/test_pg_dump--1.0.sql b/src/test/modules/test_pg_dump/test_pg_dump--1.0.sql
index 110f7eef66..1c68e146d9 100644
--- a/src/test/modules/test_pg_dump/test_pg_dump--1.0.sql
+++ b/src/test/modules/test_pg_dump/test_pg_dump--1.0.sql
@@ -12,11 +12,13 @@ CREATE SEQUENCE regress_pg_dump_seq;
CREATE SEQUENCE regress_seq_dumpable;
SELECT pg_catalog.pg_extension_config_dump('regress_seq_dumpable', '');
+GRANT SELECT ON SEQUENCE regress_seq_dumpable TO public;
CREATE TABLE regress_table_dumpable (
col1 int check (col1 > 0)
);
SELECT pg_catalog.pg_extension_config_dump('regress_table_dumpable', '');
+GRANT SELECT ON regress_table_dumpable TO public;
CREATE SCHEMA regress_pg_dump_schema;
--
2.25.1
From a70aba2d0f331042010f7e33395760375968ea17 Mon Sep 17 00:00:00 2001
From: Jacob Champion <jchamp...@timescale.com>
Date: Thu, 16 Mar 2023 11:46:08 -0700
Subject: [PATCH v2 2/2] pg_dump: skip lock for extension tables without
policies
If a user without SELECT permissions on an internal extension table
tries to dump the extension, the dump will fail while trying to lock the
table with ACCESS SHARE, even though the user doesn't want or need to
dump the table in question. (The lock is taken to allow later
pg_get_expr() calls on pg_policy to remain consistent in the face of
concurrent schema changes.)
It'd be ideal not to require SELECT permissions on a table to be able to
dump its policies, but I don't have a great idea for how to implement
that without races. As a workaround, skip the policy queries entirely if
we can determine that no policies exist for a table at the time of
getTables().
Fixes the previous commit's failing test.
---
src/bin/pg_dump/common.c | 54 ++++++++++++++++++++++++++++++++++++
src/bin/pg_dump/pg_dump.c | 58 +++++++++++++++++++++++++++++++++++++++
src/bin/pg_dump/pg_dump.h | 4 +++
3 files changed, 116 insertions(+)
diff --git a/src/bin/pg_dump/common.c b/src/bin/pg_dump/common.c
index 5d988986ed..69df3567f9 100644
--- a/src/bin/pg_dump/common.c
+++ b/src/bin/pg_dump/common.c
@@ -59,6 +59,7 @@ typedef struct _catalogIdMapEntry
uint32 hashval; /* hash code for the CatalogId */
DumpableObject *dobj; /* the associated DumpableObject, if any */
ExtensionInfo *ext; /* owning extension, if any */
+ bool has_policies; /* referenced by pg_policy? */
} CatalogIdMapEntry;
#define SH_PREFIX catalogid
@@ -135,6 +136,13 @@ getSchemaData(Archive *fout, int *numTablesPtr)
pg_log_info("identifying extension members");
getExtensionMembership(fout, extinfo, numExtensions);
+ /*
+ * Similarly, the existence of RLS policies influences whether some tables
+ * need to be locked.
+ */
+ pg_log_info("checking for row-level security policies");
+ getTablesWithPolicies(fout);
+
pg_log_info("reading schemas");
(void) getNamespaces(fout, &numNamespaces);
@@ -686,6 +694,7 @@ AssignDumpId(DumpableObject *dobj)
{
entry->dobj = NULL;
entry->ext = NULL;
+ entry->has_policies = false;
}
Assert(entry->dobj == NULL);
entry->dobj = dobj;
@@ -995,6 +1004,7 @@ recordExtensionMembership(CatalogId catId, ExtensionInfo *ext)
{
entry->dobj = NULL;
entry->ext = NULL;
+ entry->has_policies = false;
}
Assert(entry->ext == NULL);
entry->ext = ext;
@@ -1019,6 +1029,50 @@ findOwningExtension(CatalogId catalogId)
}
+/*
+ * recordPoliciesExist
+ * Record that the object identified by the given catalog ID has RLS policies
+ */
+void
+recordPoliciesExist(CatalogId catId)
+{
+ CatalogIdMapEntry *entry;
+ bool found;
+
+ /* Initialize CatalogId hash table if not done yet */
+ if (catalogIdHash == NULL)
+ catalogIdHash = catalogid_create(CATALOGIDHASH_INITIAL_SIZE, NULL);
+
+ /* Add reference to CatalogId hash */
+ entry = catalogid_insert(catalogIdHash, catId, &found);
+ if (!found)
+ {
+ entry->dobj = NULL;
+ entry->ext = NULL;
+ entry->has_policies = false;
+ }
+ entry->has_policies = true;
+}
+
+/*
+ * hasPolicies
+ * return whether the specified catalog ID has RLS policies
+ */
+bool
+hasPolicies(CatalogId catId)
+{
+ CatalogIdMapEntry *entry;
+
+ if (catalogIdHash == NULL)
+ return false; /* no objects exist yet */
+
+ entry = catalogid_lookup(catalogIdHash, catId);
+ if (entry == NULL)
+ return false;
+ return entry->has_policies;
+}
+
+
/*
* parseOidArray
* parse a string of numbers delimited by spaces into a character array
diff --git a/src/bin/pg_dump/pg_dump.c b/src/bin/pg_dump/pg_dump.c
index 5dab1ba9ea..2f3c20a584 100644
--- a/src/bin/pg_dump/pg_dump.c
+++ b/src/bin/pg_dump/pg_dump.c
@@ -3742,6 +3742,53 @@ dumpLOs(Archive *fout, const void *arg)
return 1;
}
+/*
+ * getTablesWithPolicies TODO
+ */
+void
+getTablesWithPolicies(Archive *fout)
+{
+ PQExpBuffer query;
+ PGresult *res;
+ int i_classid;
+ int i_polrelid;
+ int i,
+ ntups;
+
+ /* No policies before 9.5 */
+ if (fout->remoteVersion < 90500)
+ return;
+
+ query = createPQExpBuffer();
+
+ /* Figure out which tables have RLS policies. */
+ printfPQExpBuffer(query,
+ "SELECT DISTINCT 'pg_class'::regclass::oid AS classid, "
+ " polrelid "
+ "FROM pg_catalog.pg_policy");
+
+ res = ExecuteSqlQuery(fout, query->data, PGRES_TUPLES_OK);
+
+ ntups = PQntuples(res);
+
+ i_classid = PQfnumber(res, "classid");
+ i_polrelid = PQfnumber(res, "polrelid");
+
+ for (i = 0; i < ntups; i++)
+ {
+ CatalogId objId;
+
+ objId.tableoid = atooid(PQgetvalue(res, i, i_classid));
+ objId.oid = atooid(PQgetvalue(res, i, i_polrelid));
+
+ recordPoliciesExist(objId);
+ }
+
+ PQclear(res);
+
+ destroyPQExpBuffer(query);
+}
+
/*
* getPolicies
* get information about all RLS policies on dumpable tables.
@@ -6658,6 +6705,17 @@ getTables(Archive *fout, int *numTables)
else
selectDumpableTable(&tblinfo[i], fout);
+ /*
+ * If the table has no policies, we don't need to worry about those.
+ *
+ * For tables internal to an extension, this may mean we don't need to
+ * take an ACCESS SHARE lock, which in turn allows less privileged users
+ * to successfully perform a dump if they don't have SELECT access to
+ * those tables (which they weren't trying to dump in the first place).
+ */
+ if (!hasPolicies(tblinfo[i].dobj.catId))
+ tblinfo[i].dobj.dump &= ~DUMP_COMPONENT_POLICY;
+
/*
* Now, consider the table "interesting" if we need to dump its
* definition or its data. Later on, we'll skip a lot of data
diff --git a/src/bin/pg_dump/pg_dump.h b/src/bin/pg_dump/pg_dump.h
index bc8f2ec36d..5dea0b63d6 100644
--- a/src/bin/pg_dump/pg_dump.h
+++ b/src/bin/pg_dump/pg_dump.h
@@ -695,6 +695,9 @@ extern PublicationInfo *findPublicationByOid(Oid oid);
extern void recordExtensionMembership(CatalogId catId, ExtensionInfo *ext);
extern ExtensionInfo *findOwningExtension(CatalogId catalogId);
+extern void recordPoliciesExist(CatalogId catId);
+extern bool hasPolicies(CatalogId catId);
+
extern void parseOidArray(const char *str, Oid *array, int arraysize);
extern void sortDumpableObjects(DumpableObject **objs, int numObjs,
@@ -743,6 +746,7 @@ extern void getExtensionMembership(Archive *fout, ExtensionInfo extinfo[],
extern void processExtensionTables(Archive *fout, ExtensionInfo extinfo[],
int numExtensions);
extern EventTriggerInfo *getEventTriggers(Archive *fout, int *numEventTriggers);
+extern void getTablesWithPolicies(Archive *fout);
extern void getPolicies(Archive *fout, TableInfo tblinfo[], int numTables);
extern PublicationInfo *getPublications(Archive *fout,
int *numPublications);
--
2.25.1