I was surprised to see that my back-patches of the recent SubLink unpleasantness were failing on many of the buildfarm members, but only in the 9.1 and 9.0 branches. The difficulty appears to be that the EXPLAIN output for the new test query changes depending on whether or not "tenk1" has been analyzed yet. In 9.2 and up, it reliably has been, because create_index runs first and that script does this:
create_index.sql:901:vacuum analyze tenk1; -- ensure we get consistent plans here But the older branches lack that. Running the tests serially usually fails in 9.1 and 9.0, and likely would fail in 8.4 except that that branch isn't printing the selected plan for lack of EXPLAIN (COSTS OFF). Parallel tests sometimes succeed (and did for me), because the subselect test runs concurrently with "aggregates" and "join", which have aggregates.sql:211:analyze tenk1; -- ensure we get consistent plans here join.sql:333:analyze tenk1; -- ensure we get consistent plans here so depending on timing, one of those might have gotten the job done, or maybe autovacuum would show up in time to save the day. We need a more consistent strategy for this :-( The minimum-change strategy for getting the buildfarm green again would be to insert another ad-hoc "analyze tenk1" into subselect.sql in the back branches. I don't particularly want to fix it that way, though, because it'd just be a problem waiting to happen anytime someone back-patches a bug fix that includes EXPLAIN output. What I think would be the best strategy, on the whole, is to put a whole-database "ANALYZE;" at the end of the "copy" regression test, which is the one that loads up tenk1 and the other large test tables. It also comes after the tests that load up small static tables such as int4_tbl. This would ensure that all the tables that we typically use for one-off EXPLAIN tests are analyzed early in the proceedings. Then we could get rid of the various ad-hoc analyzes that have snuck into various tests. While I'm on the subject ... I noticed that the recently-added matview test has this: matview.sql:133:VACUUM ANALYZE; This doesn't make me happy. Aside from the sheer waste of cycles involved in re-analyzing the entire regression database, this test runs in parallel with half a dozen others, and it could cause plan instability in those. Of course, if it does, then most likely those tests have a hazard from autovacuum anyway. But this still looks to me like a poor bit of test design. Anyway, bottom line is that I think we need to institute, and back-patch, some consistent scheme for when to analyze the standard tables during the regression tests, so that we don't have hazards like this for tests that want to check what plan gets selected. Comments? regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers