There's a thread over in pgsql-performance http://archives.postgresql.org/pgsql-performance/2011-07/msg00046.php in which the main conclusion was that we need to take a fresh look at the heuristics the planner uses when dealing with small or empty relations. The code in question is in estimate_rel_size() in plancat.c:
curpages = RelationGetNumberOfBlocks(rel); /* * HACK: if the relation has never yet been vacuumed, use a * minimum estimate of 10 pages. This emulates a desirable aspect * of pre-8.0 behavior, which is that we wouldn't assume a newly * created relation is really small, which saves us from making * really bad plans during initial data loading. (The plans are * not wrong when they are made, but if they are cached and used * again after the table has grown a lot, they are bad.) It would * be better to force replanning if the table size has changed a * lot since the plan was made ... but we don't currently have any * infrastructure for redoing cached plans at all, so we have to * kluge things here instead. * * We approximate "never vacuumed" by "has relpages = 0", which * means this will also fire on genuinely empty relations. Not * great, but fortunately that's a seldom-seen case in the real * world, and it shouldn't degrade the quality of the plan too * much anyway to err in this direction. */ if (curpages < 10 && rel->rd_rel->relpages == 0) curpages = 10; That comment is of 8.0 vintage, and it needs to be updated, because it's now the case that there *is* an automatic path for refreshing plans when table sizes change. Once the number of updates exceeds the auto-analyze threshold, autovac will run an ANALYZE, which will update the relation's pg_class row, which will force a relcache inval, which will cause the plancache.c code to mark any cached plans using the relation as needing to be rebuilt. So that raises the question of whether we shouldn't just drop the if-statement entirely. I experimented with that a bit, and soon found that it resulted in some probably-undesirable changes in the regression test results. In particular the planner seemed to be switching from indexscan to seqscan plans for accesses to very small tables, which may not be a good tradeoff. I'm a bit loath to twiddle the behavior here without extensive testing, since for the most part we've not had many complaints about the planner's behavior for small tables. Another reason not to rely completely on the auto-analyze update path is that it doesn't work for temp tables, since autovac can't access another session's temp tables. It's also worth noting that auto-analyze will never kick in at all for tables of less than about 60 rows, unless there is update/delete traffic on them. The issue that came up in pgsql-performance was that this if-statement was firing for an empty inheritance parent table, causing a scan on the parent to look significantly more expensive than it really was, and in fact a good bit more expensive than the actually-useful index probe on its child table. This bogus estimate thus discouraged the planner from using a nestloop with the partitioned table on the inside, which in reality was the most appropriate plan. So we could tackle that issue with a pretty narrowly-focused test: disable the if-statement for inheritance parent tables, a la if (curpages < 10 && rel->rd_rel->relpages == 0 && !rel->rd_rel->relhassubclass) curpages = 10; This is justifiable on the grounds that an inheritance parent table is much more likely to be meant to stay empty than an ordinary table. Another thing that struck me while looking at the code is that the curpages clamp is applied to indexes too, which seems like a thinko. A table occupying a few pages wouldn't likely have an index as big as the table itself is. So what I'm currently thinking about is a change like this: if (curpages < 10 && rel->rd_rel->relpages == 0 && !rel->rd_rel->relhassubclass && rel->rd_rel->relkind != RELKIND_INDEX) curpages = 10; plus a suitable rewrite of the comment. This seems like a safe enough change to apply to 9.1. Going forward we might want to change it more, but I think it'd require some real-world testing. Thoughts? regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers