Thomas Munro <thomas.mu...@enterprisedb.com> writes:
> On Wed, Mar 14, 2018 at 2:56 PM, Jeff Janes <jeff.ja...@gmail.com> wrote:
>> Is there any good way to make the regression tests fail if the plan
>> reverts to the bad one?  The only thing I can think of would be to make
>> the table bigger so the regression test becomes "noticeably slower", but
>> that is pretty vague, and it's not user friendly to have the test
>> formally pass and just hope it is slow enough for someone to
>> investigate.
> I can't think of a good way.  I guess it can still pick a nested loop
> if it thinks there'll only be a couple of loops.  This patch tells it
> to pay attention to the total cost, not the startup cost, so as soon
> as it thinks there is more than a handful of rows the quadratic cost
> will exceed the sort/merge's logarithmic cost.

Right.  After further thought, the key point here is that in non-error
cases the query will produce no rows, meaning that it must be executed
to completion before we can be sure of that.  But applying a LIMIT
encourages the planner to pick a fast-start (slow-completion) plan,
which is not going to be what we want.  If in fact the query were going
to produce a lot of rows, and the planner could accurately predict
that, then maybe a LIMIT would be useful --- but there's no hope of the
estimates on wholerowvar *= wholerowvar being accurate any time soon,
let alone of correctly handling the correlation with ctid <> ctid.  So
the LIMIT is just an invitation to trouble, and we may as well remove
it.  Committed that way.

I also changed EXISTS(SELECT * ...) to EXISTS(SELECT 1 ...), in hopes
of saving a few microseconds of parsing effort.

> Since I've had hash joins on the mind recently I couldn't help
> noticing that you can't get a hash join out of this query's "record
> image" based join qual (or even a regular row-based =).

Yeah, because there's no hash support for record-image comparison (*=).
So there's even less reason to be worried about how smart the planner
is for this query: basically, it might do a nestloop for a very small
number of rows, but otherwise it's gonna have to go for a merge join.

My earlier thought that we might be able to skip the ANALYZE step seems
wrong, though.  It's true that ANALYZE does little for this query, but
the follow-on query to build a diff table can probably make good use of
the stats.

			regards, tom lane
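PS: for anyone who wants to see the startup-cost-vs-total-cost effect
in isolation, a toy case along these lines should do.  (Table name and
size are invented for illustration; this is not the actual matview
duplicate-check query.)

    -- Self-join where, as in the matview duplicate check, the
    -- interesting case is "no rows", so the query has to run to
    -- completion anyway.
    CREATE TEMP TABLE t AS SELECT g AS x FROM generate_series(1, 100000) g;
    ANALYZE t;

    -- No LIMIT: the planner minimizes total cost, so expect a
    -- merge or hash join here.
    EXPLAIN SELECT 1 FROM t a JOIN t b
      ON a.x = b.x AND a.ctid <> b.ctid;

    -- LIMIT 1: the planner now weights startup cost heavily, and
    -- may flip to a fast-start nestloop, which is quadratic when
    -- no matching row exists.
    EXPLAIN SELECT 1 FROM t a JOIN t b
      ON a.x = b.x AND a.ctid <> b.ctid
      LIMIT 1;

    -- And with the record-image qual there is no hash opclass, so
    -- only nestloop and merge join are even available:
    EXPLAIN SELECT 1 FROM t a JOIN t b
      ON a OPERATOR(pg_catalog.*=) b AND a.ctid <> b.ctid;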