Jehan-Guillaume de Rorthais <j...@dalibo.com> writes: > I faced a correlation problem on a query today and tried the usual trick > consisting of using an functional index and rewriting the query to use it.
The core reason this isn't doing anything useful is that clause_selectivity() is hard-wired to estimate the selectivity of a top-level WHERE clause that is a function call as 0.3333333, no matter what: else if (is_funcclause(clause)) { /* * This is not an operator, so we guess at the selectivity. THIS IS A * HACK TO GET V4 OUT THE DOOR. FUNCS SHOULD BE ABLE TO HAVE * SELECTIVITIES THEMSELVES. -- JMH 7/9/92 */ s1 = (Selectivity) 0.3333333; } Adding per-function selectivity estimators, as Joe was presumably envisioning, would be a sufficiently large amount of work that it's not too surprising nobody's gotten around to it in twenty-three years. (The infrastructure maybe wouldn't be so bad, but where would the estimators themselves come from, especially for user-defined functions?) However, in the case at hand, the complaint basically is why aren't we treating the boolean function expression like a boolean variable, and looking to see if there are stats available for it, like this other bit in clause_selectivity: /* * A Var at the top of a clause must be a bool Var. This is * equivalent to the clause reln.attribute = 't', so we compute * the selectivity as if that is what we have. */ s1 = restriction_selectivity(root, BooleanEqualOperator, list_make2(var, makeBoolConst(true, false)), InvalidOid, varRelid); Indeed you could argue that this ought to be the fallback behavior for *any* unhandled case, not just function expressions. Not sure if we'd need to restrict it to single-relation expressions or not. The implication of doing it like this would be that the default estimate in the absence of any matching stats would be 0.5 (since eqsel defaults to 1/ndistinct, and get_variable_numdistinct will report 2.0 for any boolean-type expression it has no stats for). That's not a huge change from the existing 0.3333333 estimate, which seems pretty unprincipled anyway ... but it would probably be enough to annoy people if we did it in stable branches. So I'd be inclined to propose changing this in HEAD and maybe 9.5, but not further back. (For non-function expressions, 0.5 is the default already, so those would not change behavior.) Comments? regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers