Hi, On 2020-08-07 15:16:19 +0900, Michael Paquier wrote: > From cd117fa88938c89ac953a5e3c036828337150b07 Mon Sep 17 00:00:00 2001 > From: Michael Paquier <mich...@paquier.xyz> > Date: Fri, 7 Aug 2020 10:57:40 +0900 > Subject: [PATCH 1/2] Use multi-inserts for pg_depend > > This is a follow-up of the work done in e3931d01. This case is a bit > different than pg_attribute and pg_shdepend: the maximum number of items > to insert is known in advance, but there is no need to handle pinned > dependencies. Hence, the base allocation for slots is done based on the > number of items and the maximum allowed with a cap at 64kB, and items > are initialized once used to minimize the overhead of the operation. > > Author: Daniel Gustafsson, Michael Paquier > Discussion: https://postgr.es/m/XXX > --- > src/backend/catalog/pg_depend.c | 95 ++++++++++++++++++++++++--------- > 1 file changed, 69 insertions(+), 26 deletions(-) > > diff --git a/src/backend/catalog/pg_depend.c b/src/backend/catalog/pg_depend.c > index 70baf03178..596f0c5e29 100644 > --- a/src/backend/catalog/pg_depend.c > +++ b/src/backend/catalog/pg_depend.c > @@ -47,6 +47,12 @@ recordDependencyOn(const ObjectAddress *depender, > recordMultipleDependencies(depender, referenced, 1, behavior); > } > > +/* > + * Cap the maximum amount of bytes allocated for recordMultipleDependencies() > + * slots. > + */ > +#define MAX_PGDEPEND_INSERT_BYTES 65535 > +
Do we really want to end up with several separate defines for different type of catalog batch inserts? That doesn't seem like a good thing. Think there should be a single define for all catalog bulk inserts. > /* > * Record multiple dependencies (of the same kind) for a single dependent > * object. This has a little less overhead than recording each separately. > @@ -59,10 +65,10 @@ recordMultipleDependencies(const ObjectAddress *depender, > { > Relation dependDesc; > CatalogIndexState indstate; > - HeapTuple tup; > - int i; > - bool nulls[Natts_pg_depend]; > - Datum values[Natts_pg_depend]; > + int slotCount, i; > + TupleTableSlot **slot; > + int nslots, max_slots; > + bool slot_init = true; > > if (nreferenced <= 0) > return; /* nothing to do */ > @@ -76,11 +82,18 @@ recordMultipleDependencies(const ObjectAddress *depender, > > dependDesc = table_open(DependRelationId, RowExclusiveLock); > > + /* > + * Allocate the slots to use, but delay initialization until we know > that > + * they will be used. > + */ > + max_slots = Min(nreferenced, > + MAX_PGDEPEND_INSERT_BYTES / > sizeof(FormData_pg_depend)); > + slot = palloc(sizeof(TupleTableSlot *) * max_slots); > + > /* Don't open indexes unless we need to make an update */ > indstate = NULL; > > - memset(nulls, false, sizeof(nulls)); > - > + slotCount = 0; > for (i = 0; i < nreferenced; i++, referenced++) > { > /* > @@ -88,38 +101,68 @@ recordMultipleDependencies(const ObjectAddress *depender, > * need to record dependencies on it. This saves lots of space > in > * pg_depend, so it's worth the time taken to check. > */ > - if (!isObjectPinned(referenced, dependDesc)) > + if (isObjectPinned(referenced, dependDesc)) > + continue; > + Hm, would it be better to first iterate over the dependencies, compute the number of dependencies to be inserted, and then go ahead and create the right number of slots? > From fcc0a11e9fc94d2fedc71dd10ba2a23713225963 Mon Sep 17 00:00:00 2001 > From: Michael Paquier <mich...@paquier.xyz> > Date: Fri, 7 Aug 2020 15:14:51 +0900 > Subject: [PATCH 2/2] Switch to multi-insert dependencies for many code paths > > This makes use of the new APIs to insert dependencies in groups, instead > of doing the operation one-by-one. Seems several places have been modified to new APIs despite only covering a single dependency. Perhaps worth mentioning? Greetings, Andres Freund