I looked into this complaint: https://www.postgresql.org/message-id/1477487162344-5927751.p...@n3.nabble.com which boils down to array_append being a lot slower in 9.5 & up than it was before. Now, using array_append as an aggregate transfn has never been a very good idea, because it's inherently O(N^2) in the number of input rows. But the constant factor is something like 15x worse as of 9.5. For example you can try this simple case:
create aggregate adims(anyelement) ( sfunc = array_append, stype = anyarray, finalfunc = array_dims ); In 9.4 I get times like this: regression=# select adims(x) from generate_series(1,1000) x; Time: 3.176 ms regression=# select adims(x) from generate_series(1,10000) x; Time: 62.792 ms regression=# select adims(x) from generate_series(1,100000) x; Time: 7318.999 ms but 9.5 does this: regression=# select adims(x) from generate_series(1,1000) x; Time: 17.532 ms regression=# select adims(x) from generate_series(1,10000) x; Time: 1146.686 ms regression=# select adims(x) from generate_series(1,100000) x; Time: 113892.993 ms For comparison, the nominally equivalent array_agg() is much faster, giving roughly linear performance in either 9.4 or 9.5: regression=# select array_dims(array_agg(x)) from generate_series(1,1000) x; Time: 2.197 ms regression=# select array_dims(array_agg(x)) from generate_series(1,10000) x; Time: 6.353 ms regression=# select array_dims(array_agg(x)) from generate_series(1,100000) x; Time: 53.198 ms The reason for the slowdown is that in 9.5, array_append prefers to work on "expanded arrays", and we're converting back and forth between expanded and flat array datums at each row of the aggregation. Ick. We knew that the expanded-array patch would be giving up performance in some cases to win it in others, but it'd be nice if it weren't giving up this much in a case that users actually care about. If we could tell people not to use array_append as a transition function, then maybe we would not have to solve this, but array_agg_transfn isn't a very plausible solution for user-defined aggregates because of its non-type-safe use of an "internal"-type transvalue. That means you need superuser privileges to create the aggregate, and you can't code your final-function in a PL language. I thought about various ways that we might fix this, but they all have disadvantages of one sort or another: 1. We could add a code path in array_append that does it the old way if it's being called as an aggregate function. This would get us back to 9.4-ish performance for this case ... not that that's very good. 2. We could teach nodeAgg.c to deal with expanded-object datums explicitly, in more or less the same way that plpgsql deals with expanded values in plpgsql variables. I made a draft patch for this (attached), and find that it puts this user-defined aggregate on par with array_agg speed-wise. But I'm not too happy with it because it adds some cycles to advance_transition_function() whether or not we're actually dealing with an expanded object. We already know that adding even one instruction there can make for measurable slowdown. (The draft patch leaves some things on the table there, in particular we could arrange not to call MakeExpandedObjectReadOnly if the transvalue isn't of varlena type; but there's still going to be some added overhead.) Another problem with this approach is that in hashed aggregation, an expanded object per group might eat more storage than you'd like. We had to teach array_agg_transfn not to use a separate memory context per aggregation value, and that issue would come up here too. Lastly, the draft patch hard-wires the optimization to be available only to array_append and array_prepend. That's the same as what we did in plpgsql, but it wasn't very nice there and it's not nice here either. 3. We could try to fix it mostly from array_append's side, by teaching it to return the expanded array in the aggregation context when it's being called as an aggregate function, and making some hopefully-not-performance-killing tweaks to nodeAgg to play along with that. This would amount to additional complication in the API contract for C-coded aggregate functions, but I think it wouldn't affect functions that weren't attempting to invoke the optimization, so it should be OK. A larger objection is that it still doesn't do anything for the memory consumption angle. I have some other ideas about things we could do going forward, but they don't seem likely to lead to back-patchable fixes. The memory consumption problem could be dealt with if we allowed expanded objects to sometimes not have their own context, but that would involve API changes for expanded objects. And I'm wondering whether, if we did fix all this, we could get rid of ArrayBuildState entirely in favor of making the functions that use that build an expanded array directly. And maybe we could get rid of array_agg_transfn in favor of using array_append, eliminating the need for a non-type-safe solution there. Comments, better ideas? regards, tom lane PS: note this draft patch doesn't touch nodeWindowAgg.c, but that would presumably need changes parallel to whatever we do in nodeAgg.c.
diff --git a/src/backend/executor/nodeAgg.c b/src/backend/executor/nodeAgg.c index b06e1c1..8c74781 100644 *** a/src/backend/executor/nodeAgg.c --- b/src/backend/executor/nodeAgg.c *************** *** 164,169 **** --- 164,170 ---- #include "parser/parse_coerce.h" #include "utils/acl.h" #include "utils/builtins.h" + #include "utils/fmgroids.h" #include "utils/lsyscache.h" #include "utils/memutils.h" #include "utils/syscache.h" *************** typedef struct AggStatePerTransData *** 288,293 **** --- 289,300 ---- transtypeByVal; /* + * This flag says whether we trust the transfn with a read-write pointer + * to a transition value that is an expanded object. + */ + bool transfnAllowRW; + + /* * Stuff for evaluation of inputs. We used to just use ExecEvalExpr, but * with the addition of ORDER BY we now need at least a slot for passing * data to the sort object, which requires a tupledesc, so we might as *************** advance_transition_function(AggState *ag *** 752,760 **** */ oldContext = MemoryContextSwitchTo( aggstate->aggcontexts[aggstate->current_set]->ecxt_per_tuple_memory); ! pergroupstate->transValue = datumCopy(fcinfo->arg[1], ! pertrans->transtypeByVal, ! pertrans->transtypeLen); pergroupstate->transValueIsNull = false; pergroupstate->noTransValue = false; MemoryContextSwitchTo(oldContext); --- 759,767 ---- */ oldContext = MemoryContextSwitchTo( aggstate->aggcontexts[aggstate->current_set]->ecxt_per_tuple_memory); ! pergroupstate->transValue = datumTransfer(fcinfo->arg[1], ! pertrans->transtypeByVal, ! pertrans->transtypeLen); pergroupstate->transValueIsNull = false; pergroupstate->noTransValue = false; MemoryContextSwitchTo(oldContext); *************** advance_transition_function(AggState *ag *** 781,787 **** /* * OK to call the transition function */ ! fcinfo->arg[0] = pergroupstate->transValue; fcinfo->argnull[0] = pergroupstate->transValueIsNull; fcinfo->isnull = false; /* just in case transfn doesn't set it */ --- 788,799 ---- /* * OK to call the transition function */ ! if (pertrans->transfnAllowRW) ! fcinfo->arg[0] = pergroupstate->transValue; ! else ! fcinfo->arg[0] = MakeExpandedObjectReadOnly(pergroupstate->transValue, ! pergroupstate->transValueIsNull, ! pertrans->transtypeLen); fcinfo->argnull[0] = pergroupstate->transValueIsNull; fcinfo->isnull = false; /* just in case transfn doesn't set it */ *************** advance_transition_function(AggState *ag *** 800,811 **** if (!fcinfo->isnull) { MemoryContextSwitchTo(aggstate->aggcontexts[aggstate->current_set]->ecxt_per_tuple_memory); ! newVal = datumCopy(newVal, ! pertrans->transtypeByVal, ! pertrans->transtypeLen); } if (!pergroupstate->transValueIsNull) ! pfree(DatumGetPointer(pergroupstate->transValue)); } pergroupstate->transValue = newVal; --- 812,830 ---- if (!fcinfo->isnull) { MemoryContextSwitchTo(aggstate->aggcontexts[aggstate->current_set]->ecxt_per_tuple_memory); ! newVal = datumTransfer(newVal, ! pertrans->transtypeByVal, ! pertrans->transtypeLen); } if (!pergroupstate->transValueIsNull) ! { ! if (DatumIsReadWriteExpandedObject(pergroupstate->transValue, ! false, ! pertrans->transtypeLen)) ! DeleteExpandedObject(pergroupstate->transValue); ! else ! pfree(DatumGetPointer(pergroupstate->transValue)); ! } } pergroupstate->transValue = newVal; *************** advance_combine_function(AggState *aggst *** 1020,1028 **** { oldContext = MemoryContextSwitchTo( aggstate->aggcontexts[aggstate->current_set]->ecxt_per_tuple_memory); ! pergroupstate->transValue = datumCopy(fcinfo->arg[1], pertrans->transtypeByVal, ! pertrans->transtypeLen); MemoryContextSwitchTo(oldContext); } else --- 1039,1047 ---- { oldContext = MemoryContextSwitchTo( aggstate->aggcontexts[aggstate->current_set]->ecxt_per_tuple_memory); ! pergroupstate->transValue = datumTransfer(fcinfo->arg[1], pertrans->transtypeByVal, ! pertrans->transtypeLen); MemoryContextSwitchTo(oldContext); } else *************** advance_combine_function(AggState *aggst *** 1043,1049 **** /* * OK to call the combine function */ ! fcinfo->arg[0] = pergroupstate->transValue; fcinfo->argnull[0] = pergroupstate->transValueIsNull; fcinfo->isnull = false; /* just in case combine func doesn't set it */ --- 1062,1073 ---- /* * OK to call the combine function */ ! if (pertrans->transfnAllowRW) ! fcinfo->arg[0] = pergroupstate->transValue; ! else ! fcinfo->arg[0] = MakeExpandedObjectReadOnly(pergroupstate->transValue, ! pergroupstate->transValueIsNull, ! pertrans->transtypeLen); fcinfo->argnull[0] = pergroupstate->transValueIsNull; fcinfo->isnull = false; /* just in case combine func doesn't set it */ *************** advance_combine_function(AggState *aggst *** 1062,1073 **** if (!fcinfo->isnull) { MemoryContextSwitchTo(aggstate->aggcontexts[aggstate->current_set]->ecxt_per_tuple_memory); ! newVal = datumCopy(newVal, ! pertrans->transtypeByVal, ! pertrans->transtypeLen); } if (!pergroupstate->transValueIsNull) ! pfree(DatumGetPointer(pergroupstate->transValue)); } pergroupstate->transValue = newVal; --- 1086,1104 ---- if (!fcinfo->isnull) { MemoryContextSwitchTo(aggstate->aggcontexts[aggstate->current_set]->ecxt_per_tuple_memory); ! newVal = datumTransfer(newVal, ! pertrans->transtypeByVal, ! pertrans->transtypeLen); } if (!pergroupstate->transValueIsNull) ! { ! if (DatumIsReadWriteExpandedObject(pergroupstate->transValue, ! false, ! pertrans->transtypeLen)) ! DeleteExpandedObject(pergroupstate->transValue); ! else ! pfree(DatumGetPointer(pergroupstate->transValue)); ! } } pergroupstate->transValue = newVal; *************** finalize_aggregate(AggState *aggstate, *** 1333,1339 **** (void *) aggstate, NULL); /* Fill in the transition state value */ ! fcinfo.arg[0] = pergroupstate->transValue; fcinfo.argnull[0] = pergroupstate->transValueIsNull; anynull |= pergroupstate->transValueIsNull; --- 1364,1372 ---- (void *) aggstate, NULL); /* Fill in the transition state value */ ! fcinfo.arg[0] = MakeExpandedObjectReadOnly(pergroupstate->transValue, ! pergroupstate->transValueIsNull, ! pertrans->transtypeLen); fcinfo.argnull[0] = pergroupstate->transValueIsNull; anynull |= pergroupstate->transValueIsNull; *************** finalize_aggregate(AggState *aggstate, *** 1360,1366 **** } else { ! *resultVal = pergroupstate->transValue; *resultIsNull = pergroupstate->transValueIsNull; } --- 1393,1401 ---- } else { ! *resultVal = MakeExpandedObjectReadOnly(pergroupstate->transValue, ! pergroupstate->transValueIsNull, ! pertrans->transtypeLen); *resultIsNull = pergroupstate->transValueIsNull; } *************** finalize_partialaggregate(AggState *aggs *** 1410,1416 **** { FunctionCallInfo fcinfo = &pertrans->serialfn_fcinfo; ! fcinfo->arg[0] = pergroupstate->transValue; fcinfo->argnull[0] = pergroupstate->transValueIsNull; *resultVal = FunctionCallInvoke(fcinfo); --- 1445,1453 ---- { FunctionCallInfo fcinfo = &pertrans->serialfn_fcinfo; ! fcinfo->arg[0] = MakeExpandedObjectReadOnly(pergroupstate->transValue, ! pergroupstate->transValueIsNull, ! pertrans->transtypeLen); fcinfo->argnull[0] = pergroupstate->transValueIsNull; *resultVal = FunctionCallInvoke(fcinfo); *************** finalize_partialaggregate(AggState *aggs *** 1419,1425 **** } else { ! *resultVal = pergroupstate->transValue; *resultIsNull = pergroupstate->transValueIsNull; } --- 1456,1464 ---- } else { ! *resultVal = MakeExpandedObjectReadOnly(pergroupstate->transValue, ! pergroupstate->transValueIsNull, ! pertrans->transtypeLen); *resultIsNull = pergroupstate->transValueIsNull; } *************** build_pertrans_for_aggref(AggStatePerTra *** 3029,3034 **** --- 3068,3077 ---- &pertrans->transtypeLen, &pertrans->transtypeByVal); + /* detect whether we'd like to pass read/write expanded pointers */ + pertrans->transfnAllowRW = (aggtransfn == F_ARRAY_APPEND || + aggtransfn == F_ARRAY_PREPEND); + if (OidIsValid(aggserialfn)) { build_aggregate_serialfn_expr(aggserialfn,
-- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers