On 10/15/21 21:31, Stephen Frost wrote:
Greetings,
* Tomas Vondra (tomas.von...@enterprisedb.com) wrote:
On 10/15/21 17:05, Alexander Pyhalov wrote:
Tomas Vondra писал 2021-10-15 17:56:
And then we should extend this for aggregates with more complex
internal states (e.g. avg), by supporting a function that "exports"
the aggregate state - similar to serial/deserial functions, but needs
to be portable.
I think the trickiest thing here is rewriting the remote query to call
this export function, but maybe we could simply instruct the remote
node to use a different final function for the top-level node?
If we have some special export function, how should we find out that
remote server supports this? Should it be server property or should it
somehow find out it while connecting to the server?
Good question. I guess there could be some initial negotiation based on
remote node version etc. And we could also disable this pushdown for older
server versions, etc.
Yeah, I'd think we would just only support it on versions where we know
it's available. That doesn't seem terribly difficult.
Yeah.
But maybe Alexander was concerned about cases where the nodes disagree
on the aggregate definition, so one node might have the export function
and the other would not. E.g. the remote node may have older version of
an extension implementing the aggregate, without the export function
(although the server version supports it). I don't think we can do much
about that, it's just one of many issues that may be caused by
mismatching schemas.
I wonder if this might get more complex, though. Imagine for example a
partitioned table on node A with a FDW partition, pointing to a node B.
But on B, the object is partitioned again, with one partition placed on
C. So it's like
A -> partition on B -> partition on C
When planning on A, we can consider server version on B. But what if C
is an older version, not supporting the export function?
Bot sure if this makes any difference, though ... in the worst case it
will error out, and we should have a way to disable the feature on A.
But after that, I think we can treat this just like other definitions
between local/remote node - we'd assume they match (i.e. the remote server
has the export function), and then we'd get an error if it does not. If you
need to use remote nodes without an export function, you'd have to disable
the pushdown.
AFAICS this works both for case with explicit query rewrite (i.e. we send
SQL with calls to the export function) and implicit query rewrite (where the
remote node uses a different finalize function based on mode, specified by
GUC).
Not quite sure where to drop this, but I've always figured we'd find a
way to use the existing PartialAgg / FinalizeAggregate bits which are
used for parallel query when it comes to pushing down to foreign servers
to perform aggregates. That also gives us how to serialize the results,
though we'd have to make sure that works across different
architectures.. I've not looked to see if that's the case today.
It sure is similar to what serial/deserial functions do for partial
aggs, but IIRC the functions were not designed to be portable. I think
we don't even require compatibility across minor releases, because we
only use this to copy data between workers running at the same time. Not
saying it can't be made to work, of course.
Then again, being able to transform an aggregate into a partial
aggregate that runs as an actual SQL query would mean we do partial
aggregate push-down against non-PG FDWs and that'd be pretty darn neat,
so maybe that's a better way to go, if we can figure out how.
(I mean, for avg it's pretty easy to just turn that into a SELECT that
grabs the sum and the count and use that.. other aggregates are more
complicated though and that doesn't work, maybe we need both?)
Maybe, but that seems like a very different concept - transforming the
SQL so that it calculates different set of aggregates that we know can
be pushed down easily. But I don't recall any other practical example
beyond the AVG() -> SUM()/COUNT(). Well, VAR() can be translated into
SUM(X), SUM(X^2).
Another thing is how many users would actually benefit from this. I
mean, for this to matter you need partitioned table with partitions
placed on a non-PG FDW, right? Seems like a pretty niche use case.
regards
--
Tomas Vondra
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company