I wrote: > Meanwhile, I'll go see if I can work out exactly what's broken about the > polymorphic type-slinging. That I would like to see working in beta2.
OK, I've got at least the core of the problem. fix_combine_agg_expr changes the upper aggregate's Aggref node so that its args list is a single Var of the transtype. This has nothing whatsoever to do with the original user-supplied aggregate arguments, not as to datatype nor even number of arguments. Nonetheless, nodeAgg.c uses that args list as input to get_aggregate_argtypes(), and thence to resolve_aggregate_transtype() and build_aggregate_finalfn_expr(), as well as some miscellaneous checks in build_pertrans_for_aggref (though by luck it appears that those checks aren't reachable for a combining Aggref). It's basically just luck that this ever works; we'd have noticed it long ago if it weren't that so few of the aggregates that currently have combinefns are either polymorphic or multiple-argument. Likely the only polymorphic one that ever got tested is anyarray_agg, and that accidentally fails to fail because (1) enforce_generic_type_consistency has special-case exceptions for ANYARRAY, and (2) the array-mashing code doesn't actually need to know which array type it's dealing with since it finds that out by looking into the array values. We could fix the resolve_aggregate_transtype aspect of this by storing the resolved transtype into Aggref nodes sometime before whacking them into partial format, rather than re-deducing it during nodeAgg startup. I'm inclined to do that just on efficiency grounds; but it's not enough to fix the problem here, because we'd still be passing the wrong argument types to build_aggregate_finalfn_expr(), and any finalfunc that actually used that information would fail. It seems like the best way to fix this is to add another field to Aggref that is an OID list of the original argument types, which we don't change when replacing the argument list for a combining agg, and to look to that rather than the physical args list when extracting the argument types in nodeAgg.c. This is kind of bulky and grotty, but it should work reliably. I also thought about leaving the original args list in place, and having a secondary args list used only for combining Aggrefs that contains just the Var referencing the subsidiary transtype result. An attraction of that is that we'd not need that Rube-Goldberg-worthy method for printing combining Aggrefs in ruleutils.c. However, the cruft would still leak out somewhere, because if we just did that much then setrefs.c would want to find all the aggregate input values in the output of the subsidiary plan node, where of course they aren't and shouldn't be. Maybe we could skip resolving the original args list into subplan references in this case, but that might well break ruleutils, so it just seems like a mess. Yet another idea is to have nodeAgg do something similar to what ruleutils does and drill down through the subsidiary-transtype-result Var to find the partial Aggref and grab its argument list. But that's just doubling down on a mechanism that we should be looking to get rid of not emulate. So at this point my proposal is: 1. Add an OID-list field to Aggref holding the data types of the user-supplied arguments. This can be filled in parse analysis since it won't change thereafter. Replace calls to get_aggregate_argtypes() with use of the OID-list field, or maybe keep the function but have it look at the OID list not the physical args list. 2. Add an OID field to Aggref holding the resolved (non polymorphic) transition data type's OID. For the moment we could just set this during parse analysis, since we do not support changing the transtype of an existing aggregate. If we ever decide we want to allow that, the lookup could be postponed into the planner. Replace calls to resolve_aggregate_transtype with use of this field. (resolve_aggregate_transtype() wouldn't disappear, but it would be invoked only once during parse analysis, not multiple times per query.) #2 isn't necessary to fix the bug, but as long as we are doing #1 we might as well do #2 too, to buy back some of the planner overhead added by parallel query. Barring objections I'll make this happen by tomorrow. I still don't like anything about the way that the matching logic in fix_combine_agg_expr works, but at the moment I can't point to any observable bug from that, so I'll not try to change it before beta2. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers