On 2019-11-12 16:25:06 +0100, Daniel Gustafsson wrote:
> > On 11 Nov 2019, at 09:32, Michael Paquier <mich...@paquier.xyz> wrote:
> 
> > This part has resulted in 75c1921, and we could just change
> > DecodeMultiInsert() so as if there is no tuple data then we'd just
> > leave.  However, I don't feel completely comfortable with that either
> > as it would be nice to still check that for normal relations we
> > *always* have a FPW available.
> 
> XLH_INSERT_CONTAINS_NEW_TUPLE will only be set in case of catalog relations
> IIUC (that is, non logically decoded relations), so it seems to me that we can
> have a fastpath out of DecodeMultiInsert() which inspects that flag without
> problems. Is this proposal along the lines of what you were thinking?

Maybe I'm missing something, but it's the opposite, no?

        bool            need_tuple_data = RelationIsLogicallyLogged(relation);

...
                        if (need_tuple_data)
                                xlrec->flags |= XLH_INSERT_CONTAINS_NEW_TUPLE;

or am I misunderstanding what you mean?


> @@ -1600,10 +1598,16 @@ recordDependencyOnExpr(const ObjectAddress *depender,
>       /* Remove any duplicates */
>       eliminate_duplicate_dependencies(context.addrs);
>  
> -     /* And record 'em */
> -     recordMultipleDependencies(depender,
> -                                                        context.addrs->refs, 
> context.addrs->numrefs,
> -                                                        behavior);
> +     /*
> +      * And record 'em. If we know we only have a single dependency then
> +      * avoid the extra cost of setting up a multi insert.
> +      */
> +     if (context.addrs->numrefs == 1)
> +             recordDependencyOn(depender, &context.addrs->refs[0], behavior);
> +     else
> +             recordMultipleDependencies(depender,
> +                                                                
> context.addrs->refs, context.addrs->numrefs,
> +                                                                behavior);

I'm not sure this is actually a worthwhile complexity to keep. Hard to
believe that setting up a multi-insert is goign to have a significant
enough overhead to matter here?

And if it does, is there a chance we can hide this repeated block
somewhere within recordMultipleDependencies() or such? I don't like the
repetitiveness here. Seems recordMultipleDependencies() could easily
optimize the case of there being exactly one dependency to insert?


> +/*
> + * InsertPgAttributeTuples
> + *           Construct and insert multiple tuples in pg_attribute.
> + *
> + * This is a variant of InsertPgAttributeTuple() which dynamically allocates
> + * space for multiple tuples. Having two so similar functions is a kludge, 
> but
> + * for now it's a TODO to make it less terrible.
> + */
> +void
> +InsertPgAttributeTuples(Relation pg_attribute_rel,
> +                                             FormData_pg_attribute 
> *new_attributes,
> +                                             int natts,
> +                                             CatalogIndexState indstate)
> +{
> +     Datum           values[Natts_pg_attribute];
> +     bool            nulls[Natts_pg_attribute];
> +     HeapTuple       tup;
> +     int                     i;
> +     TupleTableSlot **slot;
> +
> +     /*
> +      * The slots are dropped in CatalogMultiInsertWithInfo(). TODO: natts
> +      * number of slots is not a reasonable assumption as a wide relation
> +      * would be detrimental, figuring a good number is left as a TODO.
> +      */
> +     slot = palloc(sizeof(TupleTableSlot *) * natts);

Hm. Looking at

SELECT avg(pg_column_size(pa)) FROM pg_attribute pa;

yielding ~144 bytes, we can probably cap this at 128 or such, without
loosing efficiency. Or just use
#define MAX_BUFFERED_BYTES              65535
from copy.c or such (MAX_BUFFERED_BYTES / sizeof(FormData_pg_attribute)).


> +     /* This is a tad tedious, but way cleaner than what we used to do... */

I don't like comments that refer to "what we used to" because there's no
way for anybody to make sense of that, because it's basically a dangling
reference :)


> +     memset(values, 0, sizeof(values));
> +     memset(nulls, false, sizeof(nulls));

> +     /* start out with empty permissions and empty options */
> +     nulls[Anum_pg_attribute_attacl - 1] = true;
> +     nulls[Anum_pg_attribute_attoptions - 1] = true;
> +     nulls[Anum_pg_attribute_attfdwoptions - 1] = true;
> +     nulls[Anum_pg_attribute_attmissingval - 1] = true;
> +
> +     /* attcacheoff is always -1 in storage */
> +     values[Anum_pg_attribute_attcacheoff - 1] = Int32GetDatum(-1);
> +
> +     for (i = 0; i < natts; i++)
> +     {
> +             values[Anum_pg_attribute_attrelid - 1] = 
> ObjectIdGetDatum(new_attributes[i].attrelid);
> +             values[Anum_pg_attribute_attname - 1] = 
> NameGetDatum(&new_attributes[i].attname);
> +             values[Anum_pg_attribute_atttypid - 1] = 
> ObjectIdGetDatum(new_attributes[i].atttypid);
> +             values[Anum_pg_attribute_attstattarget - 1] = 
> Int32GetDatum(new_attributes[i].attstattarget);
> +             values[Anum_pg_attribute_attlen - 1] = 
> Int16GetDatum(new_attributes[i].attlen);
> +             values[Anum_pg_attribute_attnum - 1] = 
> Int16GetDatum(new_attributes[i].attnum);
> +             values[Anum_pg_attribute_attndims - 1] = 
> Int32GetDatum(new_attributes[i].attndims);
> +             values[Anum_pg_attribute_atttypmod - 1] = 
> Int32GetDatum(new_attributes[i].atttypmod);
> +             values[Anum_pg_attribute_attbyval - 1] = 
> BoolGetDatum(new_attributes[i].attbyval);
> +             values[Anum_pg_attribute_attstorage - 1] = 
> CharGetDatum(new_attributes[i].attstorage);
> +             values[Anum_pg_attribute_attalign - 1] = 
> CharGetDatum(new_attributes[i].attalign);
> +             values[Anum_pg_attribute_attnotnull - 1] = 
> BoolGetDatum(new_attributes[i].attnotnull);
> +             values[Anum_pg_attribute_atthasdef - 1] = 
> BoolGetDatum(new_attributes[i].atthasdef);
> +             values[Anum_pg_attribute_atthasmissing - 1] = 
> BoolGetDatum(new_attributes[i].atthasmissing);
> +             values[Anum_pg_attribute_attidentity - 1] = 
> CharGetDatum(new_attributes[i].attidentity);
> +             values[Anum_pg_attribute_attgenerated - 1] = 
> CharGetDatum(new_attributes[i].attgenerated);
> +             values[Anum_pg_attribute_attisdropped - 1] = 
> BoolGetDatum(new_attributes[i].attisdropped);
> +             values[Anum_pg_attribute_attislocal - 1] = 
> BoolGetDatum(new_attributes[i].attislocal);
> +             values[Anum_pg_attribute_attinhcount - 1] = 
> Int32GetDatum(new_attributes[i].attinhcount);
> +             values[Anum_pg_attribute_attcollation - 1] = 
> ObjectIdGetDatum(new_attributes[i].attcollation);
> +
> +             slot[i] = 
> MakeSingleTupleTableSlot(RelationGetDescr(pg_attribute_rel),
> +                                                                             
>    &TTSOpsHeapTuple);
> +             tup = heap_form_tuple(RelationGetDescr(pg_attribute_rel), 
> values, nulls);
> +             ExecStoreHeapTuple(heap_copytuple(tup), slot[i], false);

This seems likely to waste some effort - during insertion the slot will
be materialized, which'll copy the tuple. I think it'd be better to
construct the tuple directly inside the slot's tts_values/isnull, and
then store it with ExecStoreVirtualTuple().

Also, why aren't you actually just specifying shouldFree = true? We'd
want the result of the heap_form_tuple() to be freed eagerly, no? Nor do
I get why you're *also* heap_copytuple'ing the tuple, despite having
just locally formed it, and not referencing the result of
heap_form_tuple() further?  Obviously that's all unimportant if you
change the code to use ExecStoreVirtualTuple()?


> +     }
> +
> +     /* finally insert the new tuples, update the indexes, and clean up */
> +     CatalogMultiInsertWithInfo(pg_attribute_rel, slot, natts, indstate);

Previous comment:

I think it might be worthwhile to clear and delete the slots
after inserting. That's potentially a good bit of memory, no?

Current comment:

I think I quite dislike the API of CatalogMultiInsertWithInfo freeing
the slots. It'd e.g. make it impossible to reuse them to insert more
data. It's also really hard to understand


> +}
> +
>  /*
>   * InsertPgAttributeTuple
>   *           Construct and insert a new tuple in pg_attribute.
> @@ -754,7 +827,7 @@ AddNewAttributeTuples(Oid new_rel_oid,
>                                         TupleDesc tupdesc,
>                                         char relkind)
>  {
> -     Form_pg_attribute attr;
> +     Form_pg_attribute *attrs;
>       int                     i;
>       Relation        rel;
>       CatalogIndexState indstate;
> @@ -769,35 +842,42 @@ AddNewAttributeTuples(Oid new_rel_oid,
>  
>       indstate = CatalogOpenIndexes(rel);
>  
> +     attrs = palloc(sizeof(Form_pg_attribute) * natts);

Hm. Why we we need this separate allocation? Isn't this exactly the same
as what's in the tupledesc?


> +/*
> + * CatalogMultiInsertWithInfo

Hm. The current function is called CatalogTupleInsert(), wouldn't
CatalogTuplesInsertWithInfo() or such be more accurate? Or
CatalogTuplesMultiInsertWithInfo()?



> @@ -81,7 +128,14 @@ recordMultipleDependencies(const ObjectAddress *depender,
>  
>       memset(nulls, false, sizeof(nulls));
>  
> -     for (i = 0; i < nreferenced; i++, referenced++)
> +     values[Anum_pg_depend_classid - 1] = 
> ObjectIdGetDatum(depender->classId);
> +     values[Anum_pg_depend_objid - 1] = ObjectIdGetDatum(depender->objectId);
> +     values[Anum_pg_depend_objsubid - 1] = 
> Int32GetDatum(depender->objectSubId);
> +
> +     /* TODO is nreferenced a reasonable allocation of slots? */
> +     slot = palloc(sizeof(TupleTableSlot *) * nreferenced);
> +
> +     for (i = 0, ntuples = 0; i < nreferenced; i++, referenced++)
>       {
>               /*
>                * If the referenced object is pinned by the system, there's no 
> real
> @@ -94,30 +148,34 @@ recordMultipleDependencies(const ObjectAddress *depender,
>                        * Record the Dependency.  Note we don't bother to 
> check for
>                        * duplicate dependencies; there's no harm in them.
>                        */
> -                     values[Anum_pg_depend_classid - 1] = 
> ObjectIdGetDatum(depender->classId);
> -                     values[Anum_pg_depend_objid - 1] = 
> ObjectIdGetDatum(depender->objectId);
> -                     values[Anum_pg_depend_objsubid - 1] = 
> Int32GetDatum(depender->objectSubId);
> -
>                       values[Anum_pg_depend_refclassid - 1] = 
> ObjectIdGetDatum(referenced->classId);
>                       values[Anum_pg_depend_refobjid - 1] = 
> ObjectIdGetDatum(referenced->objectId);
>                       values[Anum_pg_depend_refobjsubid - 1] = 
> Int32GetDatum(referenced->objectSubId);
>  
>                       values[Anum_pg_depend_deptype - 1] = 
> CharGetDatum((char) behavior);
>  
> -                     tup = heap_form_tuple(dependDesc->rd_att, values, 
> nulls);
> +                     slot[ntuples] = 
> MakeSingleTupleTableSlot(RelationGetDescr(dependDesc),
> +                                                                             
>                          &TTSOpsHeapTuple);
> +
> +                     tuple = heap_form_tuple(dependDesc->rd_att, values, 
> nulls);
> +                     ExecStoreHeapTuple(heap_copytuple(tuple), 
> slot[ntuples], false);
> +                     ntuples++;

Same concern as in the equivalent pg_attribute code.


> +     ntuples = 0;
>       while (HeapTupleIsValid(tup = systable_getnext(scan)))
>       {
> -             HeapTuple       newtup;
> -
> -             newtup = heap_modify_tuple(tup, sdepDesc, values, nulls, 
> replace);
> -             CatalogTupleInsertWithInfo(sdepRel, newtup, indstate);
> +             slot[ntuples] = 
> MakeSingleTupleTableSlot(RelationGetDescr(sdepRel),
> +                                                                     
> &TTSOpsHeapTuple);
> +             newtuple = heap_modify_tuple(tup, sdepDesc, values, nulls, 
> replace);
> +             ExecStoreHeapTuple(heap_copytuple(newtuple), slot[ntuples], 
> false);
> +             ntuples++;


> -             heap_freetuple(newtup);
> +             if (ntuples == DEPEND_TUPLE_BUF)
> +             {
> +                     CatalogMultiInsertWithInfo(sdepRel, slot, ntuples, 
> indstate);
> +                     ntuples = 0;
> +             }
>       }

Too much copying again.


Greetings,

Andres Freund


Reply via email to