Hi hackers, There is a strange behavior of the query planner in some cases if stable/immutable was used a relation. In some cases, it affects costs of operations and leads to a bad plan of the execution. Oleg Bartunov noticed such behavior in queries with a to_tsvector as a relation:
=# explain select '|'||subject||'|', ts_rank_cd(body_tsvector,q) from messages, to_tsquery('tuple&header&overhead') q where body_tsvector @@ q; QUERY PLAN ------------------------------------------------------------------------------------------ Nested Loop (cost=383.37..58547.70 rows=4937 width=36) -> Function Scan on to_tsquery q (cost=0.25..0.26 rows=1 width=32) -> Bitmap Heap Scan on messages (cost=383.12..58461.04 rows=4937 width=275) Recheck Cond: (body_tsvector @@ q.q) -> Bitmap Index Scan on message_body_idx (cost=0.00..381.89 rows=4937 width=0) Index Cond: (body_tsvector @@ q.q) (6 rows) =# explain select '|'||subject||'|', ts_rank_cd(body_tsvector,q) from messages, to_tsquery('tuple&header&overhead') q where body_tsvector @@ q limit 10; QUERY PLAN -------------------------------------------------------------------------------- Limit (cost=0.25..425.62 rows=10 width=36) -> Nested Loop (cost=0.25..210005.80 rows=4937 width=36) Join Filter: (messages.body_tsvector @@ q.q) -> Function Scan on to_tsquery q (cost=0.25..0.26 rows=1 width=32) -> Seq Scan on messages (cost=0.00..197625.45 rows=987445 width=275) The idea of the fix for this situation is to check is a result of the function constant or not during the planning of the query. Attached patch does this by processing Var entries at planner stage and replace them with constant value if it is possible. Plans after applying a patch (SeqScan query for comparison): =# explain select '|'||subject||'|', ts_rank_cd(body_tsvector,q) from messages, to_tsquery('tuple&header&overhead') q where body_tsvector @@ q limit 10; QUERY PLAN ---------------------------------------------------------------------------------------------- Limit (cost=224.66..268.11 rows=3 width=36) -> Nested Loop (cost=224.66..268.11 rows=3 width=36) -> Function Scan on to_tsquery q (cost=0.25..0.26 rows=1 width=0) -> Bitmap Heap Scan on messages (cost=224.41..267.04 rows=3 width=275) Recheck Cond: (body_tsvector @@ to_tsquery('tuple&header&overhead'::text)) -> Bitmap Index Scan on message_body_idx (cost=0.00..224.41 rows=3 width=0) Index Cond: (body_tsvector @@ to_tsquery('tuple&header&overhead'::text)) (7 rows) =# set enable_bitmapscan=off; SET =# explain select '|'||subject||'|', ts_rank_cd(body_tsvector,q) from messages, to_tsquery('tuple&header&overhead') q where body_tsvector @@ q limit 10; QUERY PLAN ------------------------------------------------------------------------------------------ Limit (cost=1000.25..296754.14 rows=3 width=36) -> Gather (cost=1000.25..296754.14 rows=3 width=36) Workers Planned: 2 -> Nested Loop (cost=0.25..295753.32 rows=1 width=36) -> Parallel Seq Scan on messages (cost=0.00..295752.80 rows=1 width=275) Filter: (body_tsvector @@ to_tsquery('tuple&header&overhead'::text)) -> Function Scan on to_tsquery q (cost=0.25..0.26 rows=1 width=0) (7 rows) -- Aleksandr Parfenov Postgres Professional: http://www.postgrespro.com Russian Postgres Company
diff --git a/src/backend/optimizer/util/clauses.c b/src/backend/optimizer/util/clauses.c index 505ae0af85..410a14ed95 100644 --- a/src/backend/optimizer/util/clauses.c +++ b/src/backend/optimizer/util/clauses.c @@ -3655,6 +3655,40 @@ eval_const_expressions_mutator(Node *node, context); } break; + case T_Var: + if (context->root && context->root->parse->rtable) + { + Var *var; + Query *query; + RangeTblEntry *pointedNode; + + var = (Var *)node; + query = context->root->parse; + + if (var->varlevelsup != 0) + break; + + pointedNode = list_nth(query->rtable, var->varno - 1); + Assert(IsA(pointedNode, RangeTblEntry)); + + if (pointedNode->rtekind == RTE_FUNCTION && pointedNode->functions->length == 1) + { + Node *result; + RangeTblFunction *tblFunction = pointedNode->functions->head->data.ptr_value; + + Assert(IsA(tblFunction, RangeTblFunction)); + result = eval_const_expressions(context->root, tblFunction->funcexpr); + + if (result->type == T_FuncExpr) + { + pfree(result); + return node; + } + + return result; + } + } + break; default: break; } diff --git a/src/test/regress/expected/join.out b/src/test/regress/expected/join.out index cbc882d47b..eb92f556d8 100644 --- a/src/test/regress/expected/join.out +++ b/src/test/regress/expected/join.out @@ -3021,23 +3021,23 @@ select * from int4(sin(1)) q1, int4(sin(0)) q2 where q1 = thousand or q2 = thousand; - QUERY PLAN ------------------------------------------------------------------------- - Hash Join - Hash Cond: (tenk1.twothousand = int4_tbl.f1) + QUERY PLAN +------------------------------------------------------------------------- + Nested Loop -> Nested Loop - -> Nested Loop - -> Function Scan on q1 - -> Function Scan on q2 - -> Bitmap Heap Scan on tenk1 - Recheck Cond: ((q1.q1 = thousand) OR (q2.q2 = thousand)) - -> BitmapOr - -> Bitmap Index Scan on tenk1_thous_tenthous - Index Cond: (q1.q1 = thousand) - -> Bitmap Index Scan on tenk1_thous_tenthous - Index Cond: (q2.q2 = thousand) - -> Hash - -> Seq Scan on int4_tbl + -> Hash Join + Hash Cond: (tenk1.twothousand = int4_tbl.f1) + -> Bitmap Heap Scan on tenk1 + Recheck Cond: ((1 = thousand) OR (0 = thousand)) + -> BitmapOr + -> Bitmap Index Scan on tenk1_thous_tenthous + Index Cond: (1 = thousand) + -> Bitmap Index Scan on tenk1_thous_tenthous + Index Cond: (0 = thousand) + -> Hash + -> Seq Scan on int4_tbl + -> Function Scan on q1 + -> Function Scan on q2 (15 rows) explain (costs off) @@ -3046,20 +3046,20 @@ select * from int4(sin(1)) q1, int4(sin(0)) q2 where thousand = (q1 + q2); - QUERY PLAN --------------------------------------------------------------- - Hash Join - Hash Cond: (tenk1.twothousand = int4_tbl.f1) + QUERY PLAN +------------------------------------------------------------------- + Nested Loop -> Nested Loop - -> Nested Loop - -> Function Scan on q1 - -> Function Scan on q2 - -> Bitmap Heap Scan on tenk1 - Recheck Cond: (thousand = (q1.q1 + q2.q2)) - -> Bitmap Index Scan on tenk1_thous_tenthous - Index Cond: (thousand = (q1.q1 + q2.q2)) - -> Hash - -> Seq Scan on int4_tbl + -> Hash Join + Hash Cond: (tenk1.twothousand = int4_tbl.f1) + -> Bitmap Heap Scan on tenk1 + Recheck Cond: (thousand = 1) + -> Bitmap Index Scan on tenk1_thous_tenthous + Index Cond: (thousand = 1) + -> Hash + -> Seq Scan on int4_tbl + -> Function Scan on q1 + -> Function Scan on q2 (12 rows) -- diff --git a/src/test/regress/expected/rangefuncs.out b/src/test/regress/expected/rangefuncs.out index 34ca0ef890..973242c33c 100644 --- a/src/test/regress/expected/rangefuncs.out +++ b/src/test/regress/expected/rangefuncs.out @@ -1978,7 +1978,7 @@ select x from int8_tbl, extractq2(int8_tbl) f(x); QUERY PLAN ------------------------------------------ Nested Loop - Output: f.x + Output: int8_tbl.q2 -> Seq Scan on public.int8_tbl Output: int8_tbl.q1, int8_tbl.q2 -> Function Scan on f