On Fri, Jul 5, 2019 at 4:31 AM Pavel Stehule <pavel.steh...@gmail.com> wrote:

> The first issue is unstable regress tests - there is a problem with opr_sanity
I would prefer to avoid needing to add anything to opr_sanity, really. A
multirange would let me achieve that, I think. But otherwise I'll add the
ordering. Thanks!

> + rangeTypeId = get_fn_expr_argtype(fcinfo->flinfo, 1);
> + if (!type_is_range(rangeTypeId))
> + {
> + ereport(ERROR, (errmsg("range_agg must be called with a range")));
> + }

I think this was left over from my attempts to support user-defined ranges.
I believe I can remove it now.

> +# Making 2- and 3-param range_agg polymorphic is difficult
> +# because it would take an anyrange and return an anyrange[],
> +# which doesn't exist.
> +# As a workaround we define separate functions for each built-in range type.
> +# This is what causes the mess in src/test/regress/expected/opr_sanity.out.
> +{ oid => '8003', descr => 'aggregate transition function',
>
> This is strange. Does it mean that range_agg will not work with custom range
> types?

You would have to define your own range_agg with your own custom type in the
signature. I gave instructions about that in my range_agg extension
(https://github.com/pjungwir/range_agg), but it's not as easy with range_agg
in core, because I don't think you can define a custom function that is
backed by a built-in C function. At least I couldn't get that to work.

The biggest argument for a multirange to me is that we could have
anymultirange, with similar rules as anyarray. That way I could have
`range_agg(r anyrange) RETURNS anymultirange`. [1] is a conversation where I
asked about doing this before multiranges were suggested. Also, I'm aware of
your own recent work on polymorphic types at [2], but I haven't read it
closely enough to see if it would help me here. Do you think it applies?

> I am not sure about implementation. I think so accumulating all ranges,
> sorting and processing on the end can be memory and CPU expensive.

I did some research and couldn't find any algorithm that was better than
O(n log n), but if you're aware of one I'd like to know about it.
Assuming we can't improve on the complexity bounds, I think a sort +
iteration is desirable because it keeps things simple and benefits from past
& future work on the sorting code. I care more about optimizing time here
than RAM, especially since we'll use this in FK checks, where the inputs
will rarely number more than a few and probably never be in the millions.

We especially want to avoid an O(n^2) algorithm, which would be easy to
stumble into if we naively accumulated the result as we go. For example, if
we had multiranges you could imagine an implementation that just did
`result + r` for each input r. But that would have the same O(n^2) pattern
as looping over strcat. We could use a tree to keep things sorted and
accumulate as we go, but the implementation would be more complicated and I
think slower.

Thanks for the review!

Paul

[1] https://www.postgresql.org/message-id/CA%2BrenyVOjb4xQZGjdCnA54-1nzVSY%2B47-h4nkM-EP5J%3D1z%3Db9w%40mail.gmail.com
[2] https://www.postgresql.org/message-id/cafj8prdna7vqni8gr+tt2ktmz0cq5g93guc3sbn_nvpldxa...@mail.gmail.com