Thanks for chipping in here. On Tue, 29 Sep 2020 at 12:08, Tom Lane <t...@sss.pgh.pa.us> wrote: > > David Rowley <dgrowle...@gmail.com> writes: > > Any opinions on this? > > This patch scares the heck out of me. It's a pretty much unprincipled > change in a fundamental selectivity estimator, which is going to affect > all sorts of queries not only the particular case you have in mind. > There's no reason to think that the outcome will be positive for other > cases, either.
hmm. Yeah, I understand your thoughts. The reason why I had thoughts that this might be an okay route to fix the problem was regarding the comment above the current good which says, "Is that a good idea?". I think I've mentioned somewhere on list about a risk-based costing model, where tag on some sort of risk factor onto a Path and have add_path() consider the risk and the cost perhaps with influence of some GUCs to help weight the decision in a certain direction. A simple version of the primary case for this is how we multiply selectivities of quals assuming no correlation. With each multiplication, we increase the risk of being wrong which would increase the risk score. How this problem tends to come out and bite people is how we end up with a selectivity that's so low we think there's 1 row and we end up doing some subsequent join as a non-parameterised nested loop join, but it turns out 1 million rows match and someone has to wait a long time for their query to finish. The risk+cost based planner would see that that's risky and maybe consider hashing that 1 row and doing a hash join. Hashing 1 row is pretty cheap, not much more expensive than nested looping, but if it blows up, the explosion is contained. Anyway, perhaps it's better to fix the more general case that I mention one day when we have some sort of risk factor in the costing model and just assign a higher risk to the seq scan path. > The idea I'd had was to adjust make_subplan and cost_subplan to estimate > EXIST cases on the basis of either 50% retrieval (same as ANY/ALL) or > maybe full retrieval if you want to be pessimistic. I've not had time > to try it out though. Yeah, I can look at that again, if you think it's more reasonable. It was the first place a landed when looking for a fix until I discovered the problem was more generic than just subplans. David