On 8/20/21 8:56 PM, Robert Haas wrote:
> On Fri, Aug 20, 2021 at 2:21 PM Tomas Vondra
> <tomas.von...@enterprisedb.com> wrote:
>> After looking at this for a while, it's clear the main issue is
>> handling of clauses referencing the same Var twice, such as (a = a)
>> or (a < a). But it's not clear to me if this is something worth
>> fixing, or if extended statistics is the right place to do it.
>>
>> If those clauses are worth the effort, why not handle them better
>> even without extended statistics? We can easily evaluate these
>> clauses on the per-column MCV, because they only reference a single
>> Var.
> +1.
>
> It seems to me that what we ought to do is make "a < a", "a > a", and
> "a != 0" all have an estimate of zero, and make "a <= a", "a >= a",
> and "a = a" estimate 1-nullfrac. The extended statistics mechanism
> can just ignore the first three types of clauses; the zero estimate
> has to be 100% correct. It can't necessarily ignore the second three
> cases, though. If the query says "WHERE a = a AND b = 1", "b = 1" may
> be more or less likely given that a is known to be not null, and
> extended statistics can tell us that.

Yeah, I agree this seems like the right approach (except I guess you
meant "a != a" and not "a != 0"). That's assuming we want to do
anything about these clauses at all; I'm still wondering whether such
clauses are common in practice or merely synthetic.
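
To make the proposed rules concrete, here's a minimal standalone
sketch (illustrative only: the function name and matching operators by
string are stand-ins, and an actual patch would match operators by OID
somewhere in the planner's clause selectivity code):

    #include <stdio.h>
    #include <string.h>

    /*
     * Proposed constant estimates for clauses comparing a Var to
     * itself.  "a < a", "a > a" and "a != a" can never be true, so
     * their selectivity is 0.  "a <= a", "a >= a" and "a = a" hold
     * for every non-NULL value, so their selectivity is
     * (1 - nullfrac).
     */
    static double
    self_comparison_selectivity(const char *op, double nullfrac)
    {
        if (strcmp(op, "<") == 0 || strcmp(op, ">") == 0 ||
            strcmp(op, "!=") == 0)
            return 0.0;             /* provably always false */

        if (strcmp(op, "<=") == 0 || strcmp(op, ">=") == 0 ||
            strcmp(op, "=") == 0)
            return 1.0 - nullfrac;  /* true whenever the column is not NULL */

        return -1.0;                /* not a recognized self-comparison */
    }

    int
    main(void)
    {
        /* with nullfrac = 0.1, "a = a" should match 90% of the rows */
        printf("a = a  -> %.2f\n", self_comparison_selectivity("=", 0.1));
        printf("a < a  -> %.2f\n", self_comparison_selectivity("<", 0.1));
        return 0;
    }

The zero estimates are guaranteed correct, which is why extended
statistics could simply drop those clauses, while the (1 - nullfrac)
cases would still have to be fed into the multivariate machinery, per
the "WHERE a = a AND b = 1" example above.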
regards
--
Tomas Vondra
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company