I don't have a full test case yet, but I did finally manage to get an explain analyze to finish in a sane amount of time on 8.2.5. Attached are two cleaned up explain analyze results, using the exact same data directory but different executables: one is 8.2.3 and returns as expected, the other is 8.2.5, which generates a slow plan despite any fiddling with geqo/join_collapse_limit, etc. The cost is the same, but it makes a wrong turn partway through the plan. This 8.2.5 has the earlier patches from Tom already posted applied to it (also tested on 8.2.4 and 8.2.5 with the same result). An earlier version of the query with tables truncated to 100000 rows ran in 70 seconds on 8.2.3, and did not finish after an hour on 8.2.5. This version has the tables truncated to a mere 10000 rows each. The query itself is a view calling some large views, which call other views and functions, etc. I can post a version of it if needed, but my energy is mostly focused now on making a reproducible test case. Now that this is locally reproducible in a finite amount of time, patches and tweaking suggestions are welcome. (Postgres built with no special flags from source, all tables have been analyzed, '***' in the explain analyze plans indicates places manually made things more readable).
Run on a 8.2.3 server: Aggregate (C=6832..6832 R=1) (AT=1665..1665 R=1 L=1) ->Hash Join (C=5007..6832 R=50) (AT=1543..1657 R=9340 L=1) Hash Cond: ("substring"((a.order_number)::text, '^[^.]+'::text) = (a.base_order)::text) ->GroupAggregate (C=2519..4194 R=10000) (AT=101..158 R=10000 L=1) ->Sort (C=2519..2544 R=10000) (AT=101..109 R=10001 L=1) Sort Key: o.order_number, (** many other "o" columns) ->Merge Left Join (C=0..1855 R=10000) (AT=0..61 R=10001 L=1) Merge Cond: ((o.order_number)::text = (ol.order_number)::text) ->Index Scan using orders_smaller_i1 on orders_smaller o (C=0..1147 R=10000) (AT=0..12 R=10000 L=1) ->Index Scan using orderlines_smaller_i4 on orderlines_smaller ol (C=0..577 R=8440) (AT=0..10 R=8435 L=1) ->Hash (C=2487..2487 R=1) (AT=1442..1442 R=9247 L=1) ->Merge Join (cost=2487.46..2487.49 rows=1 width=36) (actual time=1376.130..1431.848 rows=9247 loops=1) Merge Cond: ("outer"."?column2?" = "inner"."?column2?") ->Sort (C=1405..1405 R=1) (AT=1186..1193 R=9247 L=1) Sort Key: (a.base_order)::text ->Subquery Scan a (cost=1405.91..1405.94 rows=1 width=18) (actual time=1112.757..1135.560 rows=9247 loops=1) ->HashAggregate (C=1405..1405 R=1) (AT=1112..1121 R=9247 L=1) ->Nested Loop Left Join (C=18..1405 R=5) (AT=0..1095 R=9248 L=1) ->Nested Loop Left Join (C=18..1404 R=5) (AT=0..1017 R=9247 L=1) Join Filter: ((o.order_number)::text ~ (('^'::text || (m.order_number)::text) || '(\\.[0-9.]+)?$'::text)) ->Seq Scan on orders_smaller m (C=0..742 R=1) (AT=0..11 R=9247 L=1) Filter: ((order_number)::text !~~ '%.%'::text) ->Bitmap Heap Scan on orders_smaller o (C=18..634 R=1111) (AT=0..0 R=1 L=9247) Recheck Cond: (((o.order_number)::text >= (m.order_number)::text) AND ((o.order_number)::text <= ((m.order_number)::text || '/'::text))) Filter: (((status)::text <> 'split'::text) AND ((status)::text <> 'canceled'::text) AND ((status)::text <> 'incomplete'::text) AND ((status)::text <> 'fraud'::text)) ->Bitmap Index Scan on orders_smaller_i1 (C=0..17 R=1111) (AT=0..0 R=1 L=9247) Index Cond: (((o.order_number)::text >= (m.order_number)::text) AND ((o.order_number)::text <= ((m.order_number)::text || '/'::text))) ->Index Scan using orderlines_smaller_i4 on orderlines_smaller ol (C=0..0 R=1) (AT=0..0 R=0 L=9247) Index Cond: ((o.order_number)::text = (ol.order_number)::text) ->Sort (C=1081..1081 R=1) (AT=189..196 R=9247 L=1) Sort Key: (b.base_order)::text ->Subquery Scan b (cost=1081.46..1081.50 rows=1 width=18) (actual time=116.554..139.130 rows=9247 loops=1) ->HashAggregate (C=1081..1081 R=1) (AT=116..125 R=9247 L=1) ->Nested Loop Left Join (C=17..1081 R=3) (AT=0..101 R=9247 L=1) Join Filter: ((p.order_number)::text ~ (('^'::text || (m.order_number)::text) || '(\\.[0-9.]+)?$'::text)) ->Seq Scan on orders_smaller m (C=0..742 R=1) (AT=0..10 R=9247 L=1) Filter: ((order_number)::text !~~ '%.%'::text) ->Bitmap Heap Scan on payments_smaller p (C=17..311 R=1111) (AT=0..0 R=0 L=9247) Recheck Cond: (((p.order_number)::text >= (m.order_number)::text) AND ((p.order_number)::text <= ((m.order_number)::text || '/'::text))) Filter: *** ((transaction_type IN ('A','B','C','D') AND (amount <> 0::numeric)) ->Bitmap Index Scan on payments_smaller_i2 (C=0..17 R=1111) (AT=0..0 R=0 L=9247) Index Cond: (((p.order_number)::text >= (m.order_number)::text) AND ((p.order_number)::text <= ((m.order_number)::text || '/'::text))) Total runtime: 1666.259 ms
Run on an 8.2.5 server, with recent patches. Same results when run on unpatched 8.2.5, and on 8.2.4: Aggregate (C=6832..6832 R=1) (AT=144302..144302 R=1 L=1) ->Hash Join (C=5007..6832 R=50) (AT=144184..144295 R=9340 L=1) Hash Cond: ("substring"((a.order_number)::text, '^[^.]+'::text) = (a.base_order)::text) ->GroupAggregate (C=2519..4194 R=10000) (AT=103..161 R=10000 L=1) ->Sort (C=2519..2544 R=10000) (AT=103..111 R=10001 L=1) Sort Key: o.order_number, (** many other "o" columns) ->Merge Left Join (C=0..1855 R=10000) (AT=0..64 R=10001 L=1) Merge Cond: ((o.order_number)::text = (ol.order_number)::text) ->Index Scan using orders_smaller_i1 on orders_smaller o (C=0..1147 R=10000) (AT=0..13 R=10000 L=1) ->Index Scan using orderlines_smaller_i4 on orderlines_smaller ol (C=0..577 R=8440) (AT=0..10 R=8435 L=1) ->Hash (C=2487..2487 R=1) (AT=144080..144080 R=9247 L=1) ->Nested Loop (C=2487..2487 R=1) (AT=1255..144069 R=9247 L=1) Join Filter: ((a.base_order)::text = (b.base_order)::text) ->HashAggregate (C=1405..1405 R=1) (AT=1130..1143 R=9247 L=1) ->Nested Loop Left Join (C=18..1405 R=5) (AT=0..1111 R=9248 L=1) ->Nested Loop Left Join (C=18..1404 R=5) (AT=0..1033 R=9247 L=1) Join Filter: ((o.order_number)::text ~ (('^'::text || (m.order_number)::text) || '(\\.[0-9.]+)?$'::text)) ->Seq Scan on orders_smaller m (C=0..742 R=1) (AT=0..12 R=9247 L=1) Filter: ((order_number)::text !~~ '%.%'::text) ->Bitmap Heap Scan on orders_smaller o (C=18..634 R=1111) (AT=0..0 R=1 L=9247) Recheck Cond: (((o.order_number)::text >= (m.order_number)::text) AND ((o.order_number)::text <= ((m.order_number)::text || '/'::text))) Filter: (((status)::text <> 'split'::text) AND ((status)::text <> 'canceled'::text) AND ((status)::text <> 'incomplete'::text) AND ((status)::text <> 'fraud'::text)) ->Bitmap Index Scan on orders_smaller_i1 (C=0..17 R=1111) (AT=0..0 R=1 L=9247) Index Cond: (((o.order_number)::text >= (m.order_number)::text) AND ((o.order_number)::text <= ((m.order_number)::text || '/'::text))) ->Index Scan using orderlines_smaller_i4 on orderlines_smaller ol (C=0..0 R=1) (AT=0..0 R=0 L=9247) Index Cond: ((o.order_number)::text = (ol.order_number)::text) ->HashAggregate (C=1081..1081 R=1) (AT=0..8 R=9247 L=9247) ->Nested Loop Left Join (C=17..1081 R=3) (AT=0..109 R=9247 L=1) Join Filter: ((p.order_number)::text ~ (('^'::text || (m.order_number)::text) || '(\\.[0-9.]+)?$'::text)) ->Seq Scan on orders_smaller m (C=0..742 R=1) (AT=0..11 R=9247 L=1) Filter: ((order_number)::text !~~ '%.%'::text) ->Bitmap Heap Scan on payments_smaller p (C=17..311 R=1111) (AT=0..0 R=0 L=9247) Recheck Cond: (((p.order_number)::text >= (m.order_number)::text) AND ((p.order_number)::text <= ((m.order_number)::text || '/'::text))) Filter: *** ((transaction_type IN ('A','B','C','D') AND (amount <> 0::numeric)) ->Bitmap Index Scan on payments_smaller_i2 (C=0..17 R=1111) (AT=0..0 R=0 L=9247) Index Cond: (((p.order_number)::text >= (m.order_number)::text) AND ((p.order_number)::text <= ((m.order_number)::text || '/'::text))) Total runtime: 144303.615 ms
---------------------------(end of broadcast)--------------------------- TIP 5: don't forget to increase your free space map settings