Re: Partial aggregates pushdown

Tom Lane Mon, 27 Nov 2023 13:00:02 -0800

Robert Haas writes:
> Also, I want to make one other point here about security and
> reliability. Right now, there is no way for a user to feed arbitrary
> data to a deserialization function. Since serialization and
> deserialization functions are only used in the context of parallel
> query, we always know that the data fed to the deserialization
> function must have come from the serialization function on the same
> machine. Nor can users call the deserialization function directly with
> arbitrary data of their own choosing, because users cannot call
> functions that take or return internal. But with this system, it
> becomes possible to feed arbitrary data to a deserialization function.


Ouch.  That is absolutely horrid --- we have a lot of stuff that
depends on users not being able to get at "internal" values, and
it sounds like the current proposal breaks all of that.

Quite aside from security concerns, there is no justification for
assuming that the "internal" values used on one platform/PG version
are identical to those used on another.  So if the idea is to
ship back "internal" values from the remote server to the local one,
I think it's basically impossible to make that work.
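
As a rough illustration of the platform point, here is a minimal sketch
in which the same bytes are read back under a different byte order.  The
state layout (an 8-byte count followed by an 8-byte sum) and the function
names are hypothetical, chosen only for the example; they are not the
actual format of any PostgreSQL aggregate.

```python
import struct

def serialize_avg_state(count, total, byte_order):
    # Hypothetical layout: 8-byte signed count, then 8-byte signed sum.
    # byte_order is "<" (little-endian) or ">" (big-endian).
    return struct.pack(byte_order + "qq", count, total)

def deserialize_avg_state(buf, byte_order):
    # Reads the bytes back assuming the given byte order.
    return struct.unpack(byte_order + "qq", buf)

# The "remote" side writes its state in little-endian order.
wire = serialize_avg_state(3, 60, "<")

# A reader that shares the writer's byte order recovers the state.
same = deserialize_avg_state(wire, "<")

# A reader that assumes big-endian gets plausible-looking garbage,
# with no error raised.
other = deserialize_avg_state(wire, ">")

print(same)
print(other)
```

Nothing in the byte string itself says which interpretation is right,
which is the difficulty with shipping such values between machines.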

Even if the partial-aggregate serialization value isn't "internal"
but some more-narrowly-defined type, it is still an internal
implementation detail of the aggregate.  You have no right to assume
that the remote server implements the aggregate the same way the
local one does.  If we start making such an assumption then we'll
be unable to revise the implementation of an aggregate ever again.
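
A minimal sketch of that failure mode, assuming two hypothetical
implementations of an average whose partial states differ: one carries
(count, sum), the other (count, mean).  Neither is claimed to match what
any PostgreSQL version does; the names are invented for the example.

```python
def partial_v1(values):
    # Hypothetical "old" implementation: partial state is (count, sum).
    return (len(values), sum(values))

def partial_v2(values):
    # Hypothetical "revised" implementation: partial state is (count, mean).
    n = len(values)
    return (n, sum(values) / n)

def final_v1(states):
    # Local combine-and-finalize step that assumes the v1 layout.
    count = sum(s[0] for s in states)
    total = sum(s[1] for s in states)
    return total / count

shard_a = [10, 20, 30]
shard_b = [40, 50]

# Both sides agree on the state layout: the result is the true average.
correct = final_v1([partial_v1(shard_a), partial_v1(shard_b)])

# The remote side has been revised but the local side has not: the
# states have the same shape, so the result is silently wrong.
mixed = final_v1([partial_v1(shard_a), partial_v2(shard_b)])

print(correct, mixed)
```

Because both states are a pair of numbers, the mismatch is not detected
anywhere; the only way to keep this working would be to freeze the state
representation.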

TBH, I think this entire proposal is dead in the water.  Which is
sad from a performance standpoint, but I can't see any way that
we would not regret shipping a feature that makes such assumptions.

                        regards, tom lane
