On Tue, 25 Aug 2020 at 08:26, Andres Freund <and...@anarazel.de> wrote:
> While I'm against introducing a separate node for the caching, I'm *not*
> against displaying a different node type when caching is
> present. E.g. it'd be perfectly reasonable from my POV to have a 'Cached
> Nested Loop' join and a plain 'Nested Loop' node in the above node. I'd
> probably still want to display the 'Cache Key' similar to your example,
> but I don't see how it'd be better to display it with one more
> intermediary node.
...Well, this is difficult...

For the record, in case anyone missed it, I'm pretty set against doing any node overloading for this. I think it's a horrid modularity violation regardless of what text appears in EXPLAIN. If we merge these nodes then we may as well go further and merge in other simple nodes like LIMIT. After a few iterations of that, we'd end up with a single node in EXPLAIN that nobody can figure out what it does. "Here Be Dragons", as Tom said. That might seem like a bit of an exaggeration, but it's important to understand that this would start us down that path, and the more steps we take down it, the harder it becomes to turn back.

Look at nodeProjectSet.c, for example. I recall you spent quite a bit of time painfully extracting the scattered logic to get it into a single reusable node (69f4b9c85). I understand your motivation was JIT compilation rather than modularising the code, but I think having all that code in one executor node was a good byproduct of that change, and I'm certainly a fan of what it allowed you to achieve with JIT. I really wouldn't like to put anyone else in the position of having to extract complex logic that we've added to existing nodes in some future version of PostgreSQL. It might seem quite innocent today, but add a few more years of development and I'm sure things will get buried a little deeper.

I'm sure you know better than most that the day will come when we start rewriting our executor node code to implement something like batch execution. I'd imagine you'd agree that that job would be easier if nodes were single-purpose, rather than overloaded with a bunch of needless complexity that only Heath Robinson himself could be proud of.
I find it bizarre that, on one hand, for non-parameterized nested loops, we can have the inner scan materialized with a Materialize node (I don't recall complaints about that), while on the other hand, for parameterized nested loops, we'd build the caching into the Nested Loop node itself.

On the other arguments: I'm also struggling to understand the claim that overloading makes EXPLAIN easier to read due to reduced nesting depth. If that's the case, why don't we get rid of the Hash below a Hash Join? It seems nobody has felt strongly enough about that to go to the trouble of writing the patch. We could certainly do work to reduce nesting depth in EXPLAIN, provided you're not big on what the plan actually does. One line should be ok if you're not worried about what's happening to your tuples. Unfortunately, that does not seem very useful, as the people who do look at EXPLAIN tend to want to know what the planner has decided to do and are interested in what's happening to their tuples. Hiding away details that can significantly impact the performance of the plan does not seem like a great direction to be moving in.

Also, just in case anyone is misunderstanding Andres' argument here: it's based entirely on the performance impact of having an additional node. However, given the correct planner choice, there will never be a gross slowdown due to having the extra node. The costing, the way it's currently designed, will only choose to use a Result Cache if it thinks it'll be cheaper to do so, and "cheaper" means having enough cache hits for the caching overhead to be worthwhile. If we get a good cache hit ratio then the additional node adds no overhead during execution, since on a cache hit we never look any further than the cache. Only a cache miss requires pulling the tuples through an additional node.
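To illustrate the asymmetry I mean, here's roughly how the two shapes compare in EXPLAIN terms. The relation names, index name, and the exact Result Cache output format are hypothetical sketches for this email, not actual planner output:

```
-- Today: a non-parameterized inner side already gets its own caching
-- node, Materialize, below the join:
--
--  Nested Loop
--    ->  Seq Scan on a
--    ->  Materialize
--          ->  Seq Scan on b
--
-- The separate-node proposal keeps parameterized nested loops
-- symmetrical with that:
--
--  Nested Loop
--    ->  Seq Scan on a
--    ->  Result Cache
--          Cache Key: a.x
--          ->  Index Scan using b_x_idx on b
--                Index Cond: (x = a.x)
```

The overloaded alternative would instead fold the second shape's cache into a "Cached Nested Loop" line, with no child node showing where tuples are actually being served from.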
Given perfect statistics (which are of course impossible) and costs, we'll never slow down the execution of a plan by having a separate Result Cache node. In reality, poor statistics, e.g., massive n_distinct underestimations, could cause slowdowns, but folding the cache into one node is not going to save us from that. All that your design will save us is the 12-nanosecond per-tuple hop (measured on a 5-year-old laptop) to an additional node during cache misses. That seems like a strange thing to optimise for, given that the planner only chooses to use a Result Cache when there's a good number of expected cache hits.

I understand that you've voiced your feelings about this, but what I want to know is: how strongly do you feel about overloading the node? Will you stand in my way if I want to push ahead with the separate node? Will anyone else?

David
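PS: to make the per-tuple cost argument concrete, here's a rough back-of-the-envelope sketch. All constants are hypothetical except the 12 ns hop figure quoted above; this is not the actual costing code, just the shape of the trade-off:

```python
# Back-of-the-envelope model of the extra-node overhead argument.
# HOP_NS is the measured ~12 ns per-tuple hop to an additional node;
# the other two constants are invented for illustration.

HOP_NS = 12           # extra node hop, paid on cache misses only
INNER_SCAN_NS = 5000  # hypothetical cost of rescanning the inner side
CACHE_HIT_NS = 100    # hypothetical cost of serving a tuple from the cache

def avg_tuple_cost_ns(hit_ratio: float) -> float:
    """Expected per-tuple cost with a separate Result Cache node.

    On a hit we never look past the cache, so the extra node adds
    nothing; only a miss pays the 12 ns hop on top of the inner scan.
    """
    miss_ratio = 1.0 - hit_ratio
    return hit_ratio * CACHE_HIT_NS + miss_ratio * (INNER_SCAN_NS + HOP_NS)
```

With a 90% hit ratio, the hop contributes 0.1 * 12 = 1.2 ns per tuple, which is dwarfed by the thousands of nanoseconds saved by not rescanning the inner side on every hit — hence why optimising away the hop seems a strange priority.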