Re: About Custom Aggregates, C Extensions and Memory

Marthin Laubscher Sat, 16 Aug 2025 09:05:22 -0700

Tom Lane < [email protected]> writes:

> Yeah. What you want is to declare the aggregate as having transtype
> "internal" (which basically means that ExecAgg will store a pointer
> for you) and make that pointer point to a data structure kept in the
> "aggcontext", which will have a suitable lifespan. json_agg() might
> be a suitable example to look at. Keep in mind that the finalfn
> mustn't modify the stored state, as there are optimizations where
> it'll be applied more than once.


Thank you.

Stunned, once more, by the incredible depth of prior art in the Postgres 
ecosystem, allow me to ask before underestimating it again.
 
I was fixing on (ab)using user-defined aggregates for a concept that would, on 
second thought, be better described as a key-only table persisted and 
manipulated as a single opaque value that is only ever expanded to a 
table/set/collection in memory. Initial focus is on storing 8-byte integer IDs 
but bigger UUIDs might be needed down the line. 

I've leaned away from involving any composite or table semantics for no better 
reason than inadequate understanding of in-memory table opportunities and 
limitations. I ended up planning a user-defined type with the last base-type on 
the list - pseudo type - which includes "internal" that I would define the 
aggregate to produce values for.

Then I read your mail, some more documentation, and discovered
a) Moving-Aggregate Mode,
b) which has a separate implementation to account for forward and inverse 
transition (which is relevant to my use case),
d) accessible mainly via window functions (which are not that relevant to my 
use case),
d) doesn't cover my "other" other operations including set-operations like 
intersection, difference, union, 
e) which all would benefit from being able to reuse the same decoded into 
memory representation as the aggregate.

Could PostgreSQL be as brilliantly put together that the internal pseudo base 
type for a user defined type and the internal transtype in a user defined 
aggregate are one and the same thing, essentially extending the lifespan and 
utility of the in-memory version further towards what I really need?

What would be your thoughts and advice here? Are there more prior-art gems that 
tamed this wilderness before? Am I justifiably or too easily scared by 
in-memory table semantics and implications when what I'm dealing with really 
amount to sets rather than (opaque) scalar values? I definitely want to store 
values representing an entire set at time in a column value, there's no place 
for names - the opaque value IS the identity of the set (mathematic bijection), 
which runs orthogonal to the original Sequel, now SQL, but are there other 
facilities around I'm duplicating here or have enough in common with to 
leverage?

Thanks again for your time and your utterly unique and dedicated approach 
keeping such firm yet friendly control of PostgreSQL. It is truly appreciated 
and noticed, especially in the chaotic context of open source. 8 years your 
junior I spent a life as design authority, but in a corporate setting, meaning 
I find dealing with open-source people worse than herding cats, so I think 
you're doing incredibly well. Thank you.

Regards,
Marthin Laubscher

Re: About Custom Aggregates, C Extensions and Memory

Reply via email to