Thanks for doing further analysis on this.

On Mon, 26 Apr 2021 at 20:31, Yuya Watari <watari.y...@gmail.com> wrote:
> Thank you for running experiments on your machine and I really
> appreciate your deep analysis.
>
> Your results are very interesting. In 5 queries, the result cache is
> cheaper but slower. Especially, in query 88, although the cost with
> result cache is cheaper, it has 34.23% degradation in query execution
> time. This is big regression.
That's certainly one side of it. On the other side, it's also important to note that in 4 of the 23 queries the result cache plan executed faster but the planner costed it as more expensive. I'm not saying the costing is perfect, but what I am saying is: as you noted above, in 5 of 23 queries the result cache was cheaper and slower, and, as I just noted, in 4 of 23 queries the result cache was more expensive and faster. We know that costing is never going to be a perfect representation of what the execution time will be. However, in these examples, we've just happened to strike quite a good balance. If we add a penalty to result cache then it'll just subtract from one problem group and add to the other. Overall, in my tests, execution was 1.15% faster with result cache enabled than it was without.

I could maybe get on board with adding a small fixed cost penalty. I'm not sure exactly what it should be; maybe a cpu_tuple_cost instead of a cpu_operator_cost, counted in for forming/deforming cached tuples. I think the patch you wrote to add the resultcache_cost_factor is only suitable for running experiments with.

The bigger concerns I have with the costing are around what happens when an n_distinct estimate is far too low on one of the join columns. I think it is more likely to be concerns like that one which would cause us to default enable_resultcache to off.

David