Re: range_agg

Paul Jungwirth Wed, 04 Mar 2020 14:27:27 -0800

Thanks for looking at this again!

On 3/4/20 1:33 PM, Alvaro Herrera wrote:

I came across an interesting thing, namely multirange_canonicalize()'s
use of qsort_arg with a callback of range_compare().  range_compare()
calls range_deserialize() (non-trivial parsing) for each input range;
multirange_canonicalize() later does a few extra deserialize calls of
its own.  Call me a premature optimization guy if you will, but I think
it makes sense to have a different struct (let's call it
"InMemoryRange") which stores the parsed representation of each range;
then we can deserialize all ranges up front, and use that as many times
as needed, without having to deserialize each range every time.

I don't know, this sounds like a drastic change. I agree that multirange_deserialize and range_deserialize do a lot of copying (not really any parsing though, and they both assume their inputs are already de-TOASTED). But they are used very extensively, so if you wanted to remove them you'd have to rewrite a lot.

I interpreted the intention of range_deserialize to be a way to keep the range struct fairly "private" and give a standard interface to extracting its attributes. Its motive seems akin to deconstruct_array. So I wrote multirange_deserialize to follow that principle. Both functions also handle memory alignment issues for you. With multirange_deserialize, there isn't actually much structure (just the list of ranges), so perhaps you could more easily omit it and give callers direct access into the multirange contents. That still seems risky though, and less well encapsulated.

My preference would be to see if these functions are really a performance problem first, and only redo the in-memory structures if they are. Also that seems like something you could do as a separate project. (I wouldn't mind working on it myself, although I'd prefer to do actual temporal database features first.) There are no backwards-compatibility concerns to changing the in-memory structure, right? (Even if there are, it's too late to avoid them for ranges.)
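
To make the trade-off concrete, here is a toy sketch in plain C (not PostgreSQL code) that counts how often a deserialize step runs when the sort comparator unpacks its inputs on every call, versus unpacking each element once up front. Everything in it is a hypothetical stand-in: PackedRange, toy_deserialize, and the two sort helpers are invented for illustration; only the name InMemoryRange is borrowed from the suggestion quoted above, and the real range_deserialize and qsort_arg have different signatures.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-ins; these are NOT the PostgreSQL definitions. */
typedef struct { unsigned char bytes[2 * sizeof(int)]; } PackedRange; /* "serialized" form */
typedef struct { int lower; int upper; } InMemoryRange;              /* parsed form */

static long deserialize_calls = 0;  /* counts every unpacking */

/* Pack two integer bounds into the opaque byte form. */
static PackedRange pack_range(int lower, int upper)
{
    PackedRange p;
    assert(lower <= upper);
    memcpy(p.bytes, &lower, sizeof(int));
    memcpy(p.bytes + sizeof(int), &upper, sizeof(int));
    return p;
}

/* Copy the bounds back out; memcpy sidesteps alignment concerns. */
static InMemoryRange toy_deserialize(const PackedRange *p)
{
    InMemoryRange r;
    deserialize_calls++;
    memcpy(&r.lower, p->bytes, sizeof(int));
    memcpy(&r.upper, p->bytes + sizeof(int), sizeof(int));
    return r;
}

/* Compare already-parsed ranges: by lower bound, then upper bound. */
static int cmp_parsed(const void *a, const void *b)
{
    const InMemoryRange *x = a;
    const InMemoryRange *y = b;
    if (x->lower != y->lower) return x->lower < y->lower ? -1 : 1;
    if (x->upper != y->upper) return x->upper < y->upper ? -1 : 1;
    return 0;
}

/* Compare packed ranges, unpacking both operands on every call. */
static int cmp_packed(const void *a, const void *b)
{
    InMemoryRange x = toy_deserialize(a);
    InMemoryRange y = toy_deserialize(b);
    return cmp_parsed(&x, &y);
}

/* Sort the packed form in place; returns the number of unpackings. */
static long sort_packed(PackedRange *v, size_t n)
{
    deserialize_calls = 0;
    qsort(v, n, sizeof(PackedRange), cmp_packed);
    return deserialize_calls;
}

/* Unpack each element once into out[], then sort the parsed copies. */
static long sort_parsed_up_front(const PackedRange *v, InMemoryRange *out, size_t n)
{
    deserialize_calls = 0;
    for (size_t i = 0; i < n; i++)
        out[i] = toy_deserialize(&v[i]);
    qsort(out, n, sizeof(InMemoryRange), cmp_parsed);
    return deserialize_calls;
}
```

The up-front version always unpacks exactly n times, while the per-comparison version unpacks twice per comparison, so its count depends on how many comparisons the library's qsort happens to make. Whether that difference matters in practice for real multiranges is exactly the thing I'd want to measure first.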

While I'm at this, why not name the new file simply multiranges.c
instead of multirangetypes.c?

As someone who doesn't do a lot of Postgres hacking, I tried to follow the approach in rangetypes.c as closely as I could, especially for naming things. So I named the file multirangetypes.c because there was already rangetypes.c. But also I can see how the "types" emphasizes that ranges and multiranges are not concrete types themselves, but more like abstract data types or generics (like arrays).


Yours,

--
Paul              ~{:-)
p...@illuminatedcomputing.com
