Hello, I reworked a patch to make more stable in different cases. I decided to use simplify_function instead of eval_const_expression to prevent inlining of the function. The only possible outputs of the simplify_function are Const node and NULL.
Also, I block pre-evaluation of functions with types other than TYPTYPE_BASE, cause there is no special logic for compound (and others) values yet. There is still a problem with memory leak in case of simplified arguments. The only way I see is a creation of temporary memory context, but it cost some performance. Maybe we can store simplified arguments in the pointed function itself for later use. But eval_const_expression and friends doesn't change the content of the nodes inside the tree, it generates new nodes and returns it as a result. The last point to mention is a fixed plan for the query in the initial letter of the thread. As I mentioned before, new versions of the patch replace var not with a function call, but with a function execution result. After the patch, the following plan is used instead of Nested Loop with Sequence Scan: explain select '|'||subject||'|', ts_rank_cd(body_tsvector,q) from messages, to_tsquery('english', 'tuple&header&overhead') q where body_tsvector @@ q limit 10; QUERY PLAN ---------------------------------------------------------------------------------------------------- Limit (cost=224.16..266.11 rows=3 width=36) -> Nested Loop (cost=224.16..266.11 rows=3 width=36) -> Function Scan on q (cost=0.00..0.01 rows=1 width=0) -> Bitmap Heap Scan on messages (cost=224.16..266.04 rows=3 width=275) Recheck Cond: (body_tsvector @@ '''tupl'' & ''header'' & ''overhead'''::tsquery) -> Bitmap Index Scan on message_body_idx (cost=0.00..224.16 rows=3 width=0) Index Cond: (body_tsvector @@ '''tupl'' & ''header'' & ''overhead'''::tsquery) -- Aleksandr Parfenov Postgres Professional: http://www.postgrespro.com Russian Postgres Company
diff --git a/src/backend/optimizer/util/clauses.c b/src/backend/optimizer/util/clauses.c index 505ae0af85..2c9983004a 100644 --- a/src/backend/optimizer/util/clauses.c +++ b/src/backend/optimizer/util/clauses.c @@ -3655,6 +3655,59 @@ eval_const_expressions_mutator(Node *node, context); } break; + case T_Var: + if (context->root && context->root->parse->rtable) + { + Var *var; + Query *query; + RangeTblEntry *pointedNode; + + var = (Var *)node; + query = context->root->parse; + + if (var->varlevelsup != 0 || var->varattno != 1) + break; + + pointedNode = list_nth(query->rtable, var->varno - 1); + Assert(IsA(pointedNode, RangeTblEntry)); + + if (pointedNode->rtekind == RTE_FUNCTION && list_length(pointedNode->functions) == 1) + { + Form_pg_type type_form; + Node *result; + RangeTblFunction *tblFunction = linitial_node(RangeTblFunction, pointedNode->functions); + FuncExpr *expr = (FuncExpr *) tblFunction->funcexpr; + List *args = expr->args; + HeapTuple type_tuple = SearchSysCache1(TYPEOID, ObjectIdGetDatum(expr->funcresulttype)); + + if (!HeapTupleIsValid(type_tuple)) + elog(ERROR, "cache lookup failed for type %u", expr->funcresulttype); + + type_form = (Form_pg_type) GETSTRUCT(type_tuple); + + if (type_form->typtype != TYPTYPE_BASE) + { + ReleaseSysCache(type_tuple); + break; + } + + result = simplify_function(expr->funcid, + expr->funcresulttype, + exprTypmod(expr), + expr->funccollid, + expr->inputcollid, + &args, + expr->funcvariadic, + true, + false, + context); + + ReleaseSysCache(type_tuple); + + if (result) /* successfully simplified it */ + return (Node *) result; + } + break; default: break; } diff --git a/src/test/regress/expected/join.out b/src/test/regress/expected/join.out index cbc882d47b..eb92f556d8 100644 --- a/src/test/regress/expected/join.out +++ b/src/test/regress/expected/join.out @@ -3021,23 +3021,23 @@ select * from int4(sin(1)) q1, int4(sin(0)) q2 where q1 = thousand or q2 = thousand; - QUERY PLAN ------------------------------------------------------------------------- - Hash Join - Hash Cond: (tenk1.twothousand = int4_tbl.f1) + QUERY PLAN +------------------------------------------------------------------------- + Nested Loop -> Nested Loop - -> Nested Loop - -> Function Scan on q1 - -> Function Scan on q2 - -> Bitmap Heap Scan on tenk1 - Recheck Cond: ((q1.q1 = thousand) OR (q2.q2 = thousand)) - -> BitmapOr - -> Bitmap Index Scan on tenk1_thous_tenthous - Index Cond: (q1.q1 = thousand) - -> Bitmap Index Scan on tenk1_thous_tenthous - Index Cond: (q2.q2 = thousand) - -> Hash - -> Seq Scan on int4_tbl + -> Hash Join + Hash Cond: (tenk1.twothousand = int4_tbl.f1) + -> Bitmap Heap Scan on tenk1 + Recheck Cond: ((1 = thousand) OR (0 = thousand)) + -> BitmapOr + -> Bitmap Index Scan on tenk1_thous_tenthous + Index Cond: (1 = thousand) + -> Bitmap Index Scan on tenk1_thous_tenthous + Index Cond: (0 = thousand) + -> Hash + -> Seq Scan on int4_tbl + -> Function Scan on q1 + -> Function Scan on q2 (15 rows) explain (costs off) @@ -3046,20 +3046,20 @@ select * from int4(sin(1)) q1, int4(sin(0)) q2 where thousand = (q1 + q2); - QUERY PLAN --------------------------------------------------------------- - Hash Join - Hash Cond: (tenk1.twothousand = int4_tbl.f1) + QUERY PLAN +------------------------------------------------------------------- + Nested Loop -> Nested Loop - -> Nested Loop - -> Function Scan on q1 - -> Function Scan on q2 - -> Bitmap Heap Scan on tenk1 - Recheck Cond: (thousand = (q1.q1 + q2.q2)) - -> Bitmap Index Scan on tenk1_thous_tenthous - Index Cond: (thousand = (q1.q1 + q2.q2)) - -> Hash - -> Seq Scan on int4_tbl + -> Hash Join + Hash Cond: (tenk1.twothousand = int4_tbl.f1) + -> Bitmap Heap Scan on tenk1 + Recheck Cond: (thousand = 1) + -> Bitmap Index Scan on tenk1_thous_tenthous + Index Cond: (thousand = 1) + -> Hash + -> Seq Scan on int4_tbl + -> Function Scan on q1 + -> Function Scan on q2 (12 rows) --