Hi 2017-12-27 21:38 GMT+01:00 Tom Lane <t...@sss.pgh.pa.us>:
> Attached are patches for two performance-improvement ideas that came > to me while working on > https://www.postgresql.org/message-id/8962.1514399...@sss.pgh.pa.us > The three patches are logically independent and could be committed in > any order. But they touch some overlapping code, so as presented, > you need to apply that other patch first and then these two in > sequence. > please, can you rebase all three patches necessary for patching? > The motivation for the first patch is that I noticed that for simple > plpgsql functions, especially triggers, the per-datum palloc()s performed > by copy_plpgsql_datum() during function entry amounted to a significant > fraction of the total runtime. To fix that, the patch simply does one > palloc for the space needed by all datums, relying on a space calculation > performed at the end of function compilation by plpgsql_finish_datums(). > This does nothing much for trivial functions with only a few datums, but > for ones with more, it's a worthwhile savings. > > BTW, I experimented with a more drastic solution involving separating > the "read only" and "writable" parts of PLpgSQL_datum structs and then > instantiating only the "writable" parts, thus considerably reducing the > amount of data to be copied during per-call initialization. But that > was a lot more invasive to the code, and it seemed to be slower than > what I present here, because performance-critical accesses to variables > had to compute the addresses of both structs associated with the variable. > I've not totally given up hope for that idea, but it'll require more > tuning than I had time for. > > In addition to the core idea of the patch, I noticed that there is no > good reason for PLpgSQL_expr to be treated as a kind of PLpgSQL_datum; > those structs are never members of the estate->datums[] array, nor is > there any code expecting them to be structural supersets of PLpgSQL_datum. > So the patch also removes PLPGSQL_DTYPE_EXPR and the useless fields of > PLpgSQL_expr. > > Also, I changed PLpgSQL_var.isconst and PLpgSQL_var.notnull from "int" > to "bool", which is what they should have been all along, and relocated > them in the PLpgSQL_var struct. There are two motivations for this. > It saves a whole 8 bytes per PLpgSQL_var, at least on 64-bit machines, > because the fields now fit into what had been wasted padding space; > reducing the size of what we have to copy during copy_plpgsql_datums > has to be worth something. Second, those fields are now adjacent to > the common PLpgSQL_variable fields, which will simplify migrating them > into PLpgSQL_variable, as I anticipate we'll want to do at some point > when we allow composite variables to be marked CONSTANT and maybe NOT > NULL. > > > The idea of the second patch came from noticing that in simple trigger > functions, quite a large fraction of cycles went into setting up the > "built in" variables such as tg_name, even though many trigger functions > probably never read most of those variables. We could improve that by > not computing the values until/unless they're read. There are various > names for this technique, but the one that seemed most evocative to me > was to say that these variables have "promises" attached to them. So > that's what the patch calls them. We mark the variables with an enum > indicating which promise needs to be fulfilled for each one, and then > when about to read a datum, we fulfill the promise if needed. > > The method I settled on for that was to invent a separate DTYPE_PROMISE, > which otherwise is treated exactly like DTYPE_VAR, and to code places > like exec_eval_datum() like this: > > switch (datum->dtype) > { > + case PLPGSQL_DTYPE_PROMISE: > + /* fulfill promise if needed, then handle like regular var */ > + plpgsql_fulfill_promise(estate, (PLpgSQL_var *) datum); > + > + /* FALL THRU */ > + > case PLPGSQL_DTYPE_VAR: > { > PLpgSQL_var *var = (PLpgSQL_var *) datum; > > The extra DTYPE is a little bit grotty, but it's not awful. One > alternative I experimented with was to just treat these variables > as plain DTYPE_VAR, requiring coding like > > case PLPGSQL_DTYPE_VAR: > { > PLpgSQL_var *var = (PLpgSQL_var *) datum; > > + if (unlikely(var->promise != PLPGSQL_PROMISE_NONE)) > + plpgsql_fulfill_promise(estate, var); > + > *typeid = var->datatype->typoid; > *typetypmod = var->datatype->atttypmod; > *value = var->value; > > However, this way is injecting an additional test-and-branch into > hot code paths, and it was demonstrably slower. > > With these patches, I see performance improvements of 10% to 20% > on simple but not totally unrealistic triggers, for example > > create or replace function mytrig() returns trigger language plpgsql as > $$ > begin > if (new.f1 != new.f2) or (new.f3 != new.f4) then > new.f3 = 42; > end if; > return new; > end$$ stable; > > (BTW, those are percentages of total INSERT runtime, not just of > the trigger proper; though I cheated to the extent of using a > temp not regular table.) > > It seems possible that the "promise" technique could be useful for > other plpgsql special variables in future. I thought briefly about > applying it to triggers' NEW and OLD arguments, but desisted because > (a) it's only a win if triggers will commonly not touch the variable, > which seems unlikely to be true for NEW/OLD; and (b) it would have > required infrastructure for attaching a promise to a DTYPE_REC > variable, which was more pain than I wanted. But I wonder if it'd > be useful for, say, the special variables that exception blocks create. > > I'll add this to the January commitfest. > All mentioned ideas has sense. Regards Pavel > regards, tom lane > >