On Sun, Mar 15, 2020 at 12:37:37PM +0000, Dean Rasheed wrote:
On Sun, 15 Mar 2020 at 00:08, Tomas Vondra <tomas.von...@2ndquadrant.com> wrote:

On Sat, Mar 14, 2020 at 05:56:10PM +0100, Tomas Vondra wrote:
>
>Attached is a patch series rebased on top of the current master, after
>committing the ScalarArrayOpExpr enhancements. I've updated the OR patch
>to get rid of the code duplication, and barring objections I'll get it
>committed shortly together with the two parts improving test coverage.

I've pushed the two patches improving test coverage for functional
dependencies and MCV lists, which seems mostly non-controversial. I'll
wait a bit more with the two patches actually changing behavior (rebased
version attached, to keep cputube happy).


Patch 0001 looks to be mostly ready. Just a couple of final comments:

+       if (is_or)
+           simple_sel = clauselist_selectivity_simple_or(root,
stat_clauses, varRelid,
+                                                         jointype,
sjinfo, NULL, 1.0);
+       else

Surely that should be passing 0.0 as the final argument, otherwise it
will always just return simple_sel = 1.0.


+        *
+        * XXX We can't multiply with current value, because for OR clauses
+        * we start with 0.0, so we simply assign to s1 directly.
+        */
+       s = statext_clauselist_selectivity(root, clauses, varRelid,
+                                          jointype, sjinfo, rel,
+                                          &estimatedclauses, true);

That final part of the comment is no longer relevant (variable s1 no
longer exists). Probably it could now just be deleted, since I think
there are sufficient comments elsewhere to explain what's going on.

Otherwise it looks good, and I think this will lead to some very
worthwhile improvements.


Attached is a rebased patch series, addressing both those issues.

I've been wondering why none of the regression tests failed because of
the 0.0 vs. 1.0 issue, but I think the explanation is pretty simple - to
make the tests stable, all the MCV lists we use are "perfect" i.e. it
represents 100% of the data. But this selectivity is used to compute
selectivity only for the part not represented by the MCV list, i.e. it's
not really used. I suppose we could add a test that would use larger
MCV item, but I'm afraid that'd be inherently unstable :-(

Another thing I was thinking about is the changes to the API. We need to
pass information whether the clauses are connected by AND or OR to a
number of places, and 0001 does that in two ways. For some functions it
adds a new parameter (called is_or), and for other functiosn it creates
a new copy of a function. So for example

  - statext_mcv_clauselist_selectivity
  - statext_clauselist_selectivity

got the new flag, while e.g. clauselist_selectivity gets a new "copy"
sibling called clauselist_selectivity_or.

There were two reasons for not using flag. First, clauselist_selectivity
and similar functions have to do very different stuff for these two
cases, so it'd be just one huge if/else block. Second, minimizing
breakage of third-party code - pretty much all the extensions I've seen
only work with AND clauses, and call clauselist_selectivity. Adding a
flag would break that code. (Also, there's a bit of laziness, because
this was the simplest thing to do during development.)

But I wonder if that's sufficient reason - maybe we should just add the
flag in all cases. It might break some code, but the fix is trivial (add
a false there).

Opinions?

regards

--
Tomas Vondra                  http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
>From 4d1b7a071ff7ea27bb991f7999133eb8d34f4720 Mon Sep 17 00:00:00 2001
From: Tomas Vondra <t...@fuzzy.cz>
Date: Sun, 8 Mar 2020 23:26:50 +0100
Subject: [PATCH 1/2] Improve estimation of OR clauses using extended
 statistics

Until now, OR clauses were estimated using extended statistics only when
the whole clause (all the arguments) are compatible. If even just one
argument was found to be incompatible, the whole clause was estimated
ignoring extended statistics. Estimation errors for OR clauses tend to
be fairly mild, so this was considered acceptable, but it may become an
issue for OR clauses with more complex arguments, etc.

This commit relaxes the restriction, using mostly the same logic as AND
clauses. We first apply extended statistics to as many arguments as
possible, and then use the (s1 + s2 - s1 * s2) formula to factor in the
remaining clauses.

The OR clause is still considered incompatible, though. If any argument
is unsupported or references variable not covered by the statistics, the
whole OR clause is incompatible. The consequence is that e.g. clauses

    (a = 1) AND (b = 1 OR c = 1 OR d = 1)

can't be estimated by statistics on (a,b,c) because the OR clause also
references "d". So we'll estimate each of the AND arguments separately,
and the extended statistics will be used only to estimate the OR clause.
This may be solved by creating statistics including the "d" column, but
the issue applies to cases where the clause type is unsupported, e.g.

    (a = 1) AND (b = 1 OR c = 1 OR mod(d,10) = 0)

which can't be solved by adding "d" to the statistics, at least for now.

Author: Tomas Vondra
Reviewed-by: Dean Rasheed, Thomas Munro
Discussion: Discussion: 
https://www.postgresql.org/message-id/flat/20200113230008.g67iyk4cs3xbnjju@development
---
 src/backend/optimizer/path/clausesel.c        | 141 +++++++++++++++---
 src/backend/statistics/extended_stats.c       |  36 +++--
 src/backend/statistics/mcv.c                  |   5 +-
 src/include/optimizer/optimizer.h             |   7 +
 .../statistics/extended_stats_internal.h      |   3 +-
 src/include/statistics/statistics.h           |   3 +-
 src/test/regress/expected/stats_ext.out       |  61 +++++++-
 src/test/regress/sql/stats_ext.sql            |  13 +-
 8 files changed, 227 insertions(+), 42 deletions(-)

diff --git a/src/backend/optimizer/path/clausesel.c 
b/src/backend/optimizer/path/clausesel.c
index a3ebe10592..ce14d47409 100644
--- a/src/backend/optimizer/path/clausesel.c
+++ b/src/backend/optimizer/path/clausesel.c
@@ -92,7 +92,7 @@ clauselist_selectivity(PlannerInfo *root,
                 */
                s1 *= statext_clauselist_selectivity(root, clauses, varRelid,
                                                                                
         jointype, sjinfo, rel,
-                                                                               
         &estimatedclauses);
+                                                                               
         &estimatedclauses, false);
        }
 
        /*
@@ -104,6 +104,62 @@ clauselist_selectivity(PlannerInfo *root,
                                                                                
          estimatedclauses);
 }
 
+/*
+ * clauselist_selectivity_or -
+ *       Compute the selectivity of an implicitly-ORed list of boolean
+ *       expression clauses.  The list can be empty, in which case 0.0
+ *       must be returned.  List elements may be either RestrictInfos
+ *       or bare expression clauses --- the former is preferred since
+ *       it allows caching of results.
+ *
+ * See clause_selectivity() for the meaning of the additional parameters.
+ *
+ * The basic approach is to apply extended statistics first, on as many
+ * clauses as possible, in order to capture cross-column dependencies etc.
+ * The remaining clauses are then estimated using regular statistics tracked
+ * for individual columns.  This is done by simply passing the clauses to
+ * clauselist_selectivity and then combining the selectivities using the
+ * regular formula (s1+s2 - s1*s2).
+ */
+static Selectivity
+clauselist_selectivity_or(PlannerInfo *root,
+                                                 List *clauses,
+                                                 int varRelid,
+                                                 JoinType jointype,
+                                                 SpecialJoinInfo *sjinfo)
+{
+       Selectivity     s = 0.0;
+       RelOptInfo *rel;
+       Bitmapset  *estimatedclauses = NULL;
+
+       /*
+        * Determine if these clauses reference a single relation.  If so, and 
if
+        * it has extended statistics, try to apply those.
+        */
+       rel = find_single_rel_for_clauses(root, clauses);
+       if (rel && rel->rtekind == RTE_RELATION && rel->statlist != NIL)
+       {
+               /*
+                * Estimate as many clauses as possible using extended 
statistics.
+                *
+                * 'estimatedclauses' tracks the 0-based list position index of
+                * clauses that we've estimated using extended statistics, and 
that
+                * should be ignored.
+                *
+                * XXX We can't multiply with current value, because for OR 
clauses
+                * we start with 0.0, so we simply assign to 's' directly.
+                */
+               s = statext_clauselist_selectivity(root, clauses, varRelid,
+                                                                               
   jointype, sjinfo, rel,
+                                                                               
   &estimatedclauses, true);
+       }
+
+       /* Estimate the rest of the clauses as if they were independent. */
+       return clauselist_selectivity_simple_or(root, clauses, varRelid,
+                                                                               
        jointype, sjinfo,
+                                                                               
        estimatedclauses, s);
+}
+
 /*
  * clauselist_selectivity_simple -
  *       Compute the selectivity of an implicitly-ANDed list of boolean
@@ -351,6 +407,65 @@ clauselist_selectivity_simple(PlannerInfo *root,
        return s1;
 }
 
+/*
+ * clauselist_selectivity_simple_or -
+ *       Compute the selectivity of an implicitly-ORed list of boolean
+ *       expression clauses.  The list can be empty, in which case the
+ *       value passed in the last parameter must be returned.
+ *       The estimatedclauses bitmap tracks clauses that have already
+ *       been estimated by other means.
+ *
+ * See clause_selectivity() for the meaning of the additional parameters.
+ *
+ * Our basic approach is to combine selectivities of the subclauses
+ * using the formula (s1 + s2 - s1 * s2).  It's possible some of the
+ * clauses are already estimated by other means, in which case we treat
+ * them as a single logical clause, with the selecitivity passed in the
+ * last parameter.
+ */
+Selectivity
+clauselist_selectivity_simple_or(PlannerInfo *root,
+                                                                List *clauses,
+                                                                int varRelid,
+                                                                JoinType 
jointype,
+                                                                
SpecialJoinInfo *sjinfo,
+                                                                Bitmapset 
*estimatedclauses,
+                                                                Selectivity s)
+{
+       ListCell   *lc;
+       int                     listidx;
+
+       /*
+        * Selectivities of the remaining clauses for an OR clause are computed
+        * as s1+s2 - s1*s2 to account for the probable overlap of selected 
tuple
+        * sets. The clauses estimated using extended statistics are effectively
+        * treated as a single clause.
+        *
+        * XXX is this too conservative?
+        */
+       listidx = -1;
+       foreach(lc, clauses)
+       {
+               Selectivity s2;
+
+               listidx++;
+
+               /* skip already estimated clauses */
+               if (bms_is_member(listidx, estimatedclauses))
+                       continue;
+
+               s2 = clause_selectivity(root,
+                                                               (Node *) 
lfirst(lc),
+                                                               varRelid,
+                                                               jointype,
+                                                               sjinfo);
+
+               s = s + s2 - s * s2;
+       }
+
+       return s;
+}
+
 /*
  * addRangeClause --- add a new range clause for clauselist_selectivity
  *
@@ -735,24 +850,14 @@ clause_selectivity(PlannerInfo *root,
        else if (is_orclause(clause))
        {
                /*
-                * Selectivities for an OR clause are computed as s1+s2 - s1*s2 
to
-                * account for the probable overlap of selected tuple sets.
-                *
-                * XXX is this too conservative?
+                * Almost the same thing as clauselist_selectivity, but with the
+                * clauses connected by OR.
                 */
-               ListCell   *arg;
-
-               s1 = 0.0;
-               foreach(arg, ((BoolExpr *) clause)->args)
-               {
-                       Selectivity s2 = clause_selectivity(root,
-                                                                               
                (Node *) lfirst(arg),
-                                                                               
                varRelid,
-                                                                               
                jointype,
-                                                                               
                sjinfo);
-
-                       s1 = s1 + s2 - s1 * s2;
-               }
+               s1 = clauselist_selectivity_or(root,
+                                                                          
((BoolExpr *) clause)->args,
+                                                                          
varRelid,
+                                                                          
jointype,
+                                                                          
sjinfo);
        }
        else if (is_opclause(clause) || IsA(clause, DistinctExpr))
        {
diff --git a/src/backend/statistics/extended_stats.c 
b/src/backend/statistics/extended_stats.c
index 318fdb8f6d..93d67791c5 100644
--- a/src/backend/statistics/extended_stats.c
+++ b/src/backend/statistics/extended_stats.c
@@ -1282,16 +1282,17 @@ statext_is_compatible_clause(PlannerInfo *root, Node 
*clause, Index relid,
 static Selectivity
 statext_mcv_clauselist_selectivity(PlannerInfo *root, List *clauses, int 
varRelid,
                                                                   JoinType 
jointype, SpecialJoinInfo *sjinfo,
-                                                                  RelOptInfo 
*rel, Bitmapset **estimatedclauses)
+                                                                  RelOptInfo 
*rel, Bitmapset **estimatedclauses,
+                                                                  bool is_or)
 {
        ListCell   *l;
        Bitmapset **list_attnums;
        int                     listidx;
-       Selectivity     sel = 1.0;
+       Selectivity     sel = (is_or) ? 0.0 : 1.0;
 
        /* check if there's any stats that might be useful for us. */
        if (!has_stats_of_kind(rel->statlist, STATS_EXT_MCV))
-               return 1.0;
+               return sel;
 
        list_attnums = (Bitmapset **) palloc(sizeof(Bitmapset *) *
                                                                                
 list_length(clauses));
@@ -1374,8 +1375,12 @@ statext_mcv_clauselist_selectivity(PlannerInfo *root, 
List *clauses, int varReli
                 * columns/clauses. We'll then use the various selectivities 
computed from
                 * MCV list to improve it.
                 */
-               simple_sel = clauselist_selectivity_simple(root, stat_clauses, 
varRelid,
-                                                                               
                jointype, sjinfo, NULL);
+               if (is_or)
+                       simple_sel = clauselist_selectivity_simple_or(root, 
stat_clauses, varRelid,
+                                                                               
                                  jointype, sjinfo, NULL, 0.0);
+               else
+                       simple_sel = clauselist_selectivity_simple(root, 
stat_clauses, varRelid,
+                                                                               
                           jointype, sjinfo, NULL);
 
                /*
                 * Now compute the multi-column estimate from the MCV list, 
along with the
@@ -1383,7 +1388,7 @@ statext_mcv_clauselist_selectivity(PlannerInfo *root, 
List *clauses, int varReli
                 */
                mcv_sel = mcv_clauselist_selectivity(root, stat, stat_clauses, 
varRelid,
                                                                                
         jointype, sjinfo, rel,
-                                                                               
         &mcv_basesel, &mcv_totalsel);
+                                                                               
         &mcv_basesel, &mcv_totalsel, is_or);
 
                /* Estimated selectivity of values not covered by MCV matches */
                other_sel = simple_sel - mcv_basesel;
@@ -1397,8 +1402,11 @@ statext_mcv_clauselist_selectivity(PlannerInfo *root, 
List *clauses, int varReli
                stat_sel = mcv_sel + other_sel;
                CLAMP_PROBABILITY(stat_sel);
 
-               /* Factor the estimate from this MCV to the oveall estimate. */
-               sel *= stat_sel;
+               /* Factor the estimate from this MCV to the overall estimate. */
+               if (is_or)
+                       sel = sel + stat_sel - sel * stat_sel;
+               else
+                       sel *= stat_sel;
        }
 
        return sel;
@@ -1411,13 +1419,21 @@ statext_mcv_clauselist_selectivity(PlannerInfo *root, 
List *clauses, int varReli
 Selectivity
 statext_clauselist_selectivity(PlannerInfo *root, List *clauses, int varRelid,
                                                           JoinType jointype, 
SpecialJoinInfo *sjinfo,
-                                                          RelOptInfo *rel, 
Bitmapset **estimatedclauses)
+                                                          RelOptInfo *rel, 
Bitmapset **estimatedclauses,
+                                                          bool is_or)
 {
        Selectivity sel;
 
        /* First, try estimating clauses using a multivariate MCV list. */
        sel = statext_mcv_clauselist_selectivity(root, clauses, varRelid, 
jointype,
-                                                                               
         sjinfo, rel, estimatedclauses);
+                                                                               
         sjinfo, rel, estimatedclauses, is_or);
+
+       /*
+        * Functional dependencies only work for clauses connected by AND, so 
for
+        * OR clauses we're done.
+        */
+       if (is_or)
+               return sel;
 
        /*
         * Then, apply functional dependencies on the remaining clauses by 
calling
diff --git a/src/backend/statistics/mcv.c b/src/backend/statistics/mcv.c
index 3147d8fedc..343011178b 100644
--- a/src/backend/statistics/mcv.c
+++ b/src/backend/statistics/mcv.c
@@ -1902,7 +1902,8 @@ mcv_clauselist_selectivity(PlannerInfo *root, 
StatisticExtInfo *stat,
                                                   List *clauses, int varRelid,
                                                   JoinType jointype, 
SpecialJoinInfo *sjinfo,
                                                   RelOptInfo *rel,
-                                                  Selectivity *basesel, 
Selectivity *totalsel)
+                                                  Selectivity *basesel, 
Selectivity *totalsel,
+                                                  bool is_or)
 {
        int                     i;
        MCVList    *mcv;
@@ -1915,7 +1916,7 @@ mcv_clauselist_selectivity(PlannerInfo *root, 
StatisticExtInfo *stat,
        mcv = statext_mcv_load(stat->statOid);
 
        /* build a match bitmap for the clauses */
-       matches = mcv_get_match_bitmap(root, clauses, stat->keys, mcv, false);
+       matches = mcv_get_match_bitmap(root, clauses, stat->keys, mcv, is_or);
 
        /* sum frequencies for all the matching MCV items */
        *basesel = 0.0;
diff --git a/src/include/optimizer/optimizer.h 
b/src/include/optimizer/optimizer.h
index 5283995df8..67adcd4485 100644
--- a/src/include/optimizer/optimizer.h
+++ b/src/include/optimizer/optimizer.h
@@ -64,6 +64,13 @@ extern Selectivity clauselist_selectivity_simple(PlannerInfo 
*root,
                                                                                
                 JoinType jointype,
                                                                                
                 SpecialJoinInfo *sjinfo,
                                                                                
                 Bitmapset *estimatedclauses);
+extern Selectivity clauselist_selectivity_simple_or(PlannerInfo *root,
+                                                                               
                        List *clauses,
+                                                                               
                        int varRelid,
+                                                                               
                        JoinType jointype,
+                                                                               
                        SpecialJoinInfo *sjinfo,
+                                                                               
                        Bitmapset *estimatedclauses,
+                                                                               
                        Selectivity s);
 extern Selectivity clauselist_selectivity(PlannerInfo *root,
                                                                                
  List *clauses,
                                                                                
  int varRelid,
diff --git a/src/include/statistics/extended_stats_internal.h 
b/src/include/statistics/extended_stats_internal.h
index 2b14ab238c..6c039a81c2 100644
--- a/src/include/statistics/extended_stats_internal.h
+++ b/src/include/statistics/extended_stats_internal.h
@@ -107,6 +107,7 @@ extern Selectivity mcv_clauselist_selectivity(PlannerInfo 
*root,
                                                                                
          SpecialJoinInfo *sjinfo,
                                                                                
          RelOptInfo *rel,
                                                                                
          Selectivity *basesel,
-                                                                               
          Selectivity *totalsel);
+                                                                               
          Selectivity *totalsel,
+                                                                               
          bool is_or);
 
 #endif                                                 /* 
EXTENDED_STATS_INTERNAL_H */
diff --git a/src/include/statistics/statistics.h 
b/src/include/statistics/statistics.h
index f5d9b6c73a..e18c9a6539 100644
--- a/src/include/statistics/statistics.h
+++ b/src/include/statistics/statistics.h
@@ -116,7 +116,8 @@ extern Selectivity 
statext_clauselist_selectivity(PlannerInfo *root,
                                                                                
                  JoinType jointype,
                                                                                
                  SpecialJoinInfo *sjinfo,
                                                                                
                  RelOptInfo *rel,
-                                                                               
                  Bitmapset **estimatedclauses);
+                                                                               
                  Bitmapset **estimatedclauses,
+                                                                               
                  bool is_or);
 extern bool has_stats_of_kind(List *stats, char requiredkind);
 extern StatisticExtInfo *choose_best_statistics(List *stats, char requiredkind,
                                                                                
                Bitmapset **clause_attnums,
diff --git a/src/test/regress/expected/stats_ext.out 
b/src/test/regress/expected/stats_ext.out
index bd6d8be434..3aba4a37c6 100644
--- a/src/test/regress/expected/stats_ext.out
+++ b/src/test/regress/expected/stats_ext.out
@@ -1075,6 +1075,12 @@ SELECT * FROM check_estimated_rows('SELECT * FROM 
mcv_lists WHERE a = 1 OR b = '
        200 |    200
 (1 row)
 
+SELECT * FROM check_estimated_rows('SELECT * FROM mcv_lists WHERE a = 1 OR b = 
''1'' OR c = 1 OR d IS NOT NULL');
+ estimated | actual 
+-----------+--------
+       200 |    200
+(1 row)
+
 SELECT * FROM check_estimated_rows('SELECT * FROM mcv_lists WHERE a IN (1, 2, 
51, 52) AND b IN ( ''1'', ''2'')');
  estimated | actual 
 -----------+--------
@@ -1135,13 +1141,6 @@ SELECT * FROM check_estimated_rows('SELECT * FROM 
mcv_lists WHERE a < ALL (ARRAY
        100 |    100
 (1 row)
 
--- we can't use the statistic for OR clauses that are not fully covered 
(missing 'd' attribute)
-SELECT * FROM check_estimated_rows('SELECT * FROM mcv_lists WHERE a = 1 OR b = 
''1'' OR c = 1 OR d IS NOT NULL');
- estimated | actual 
------------+--------
-       343 |    200
-(1 row)
-
 -- check change of unrelated column type does not reset the MCV statistics
 ALTER TABLE mcv_lists ALTER COLUMN d TYPE VARCHAR(64);
 SELECT d.stxdmcv IS NOT NULL
@@ -1464,12 +1463,36 @@ SELECT * FROM check_estimated_rows('SELECT * FROM 
mcv_lists_multi WHERE c = 0 AN
        102 |    714
 (1 row)
 
+SELECT * FROM check_estimated_rows('SELECT * FROM mcv_lists_multi WHERE b = 0 
AND c = 0');
+ estimated | actual 
+-----------+--------
+       143 |    142
+(1 row)
+
+SELECT * FROM check_estimated_rows('SELECT * FROM mcv_lists_multi WHERE b = 0 
OR c = 0');
+ estimated | actual 
+-----------+--------
+      1571 |   1572
+(1 row)
+
 SELECT * FROM check_estimated_rows('SELECT * FROM mcv_lists_multi WHERE a = 0 
AND b = 0 AND c = 0 AND d = 0');
  estimated | actual 
 -----------+--------
          4 |    142
 (1 row)
 
+SELECT * FROM check_estimated_rows('SELECT * FROM mcv_lists_multi WHERE (a = 0 
AND b = 0) OR (c = 0 AND d = 0)');
+ estimated | actual 
+-----------+--------
+       298 |   1572
+(1 row)
+
+SELECT * FROM check_estimated_rows('SELECT * FROM mcv_lists_multi WHERE a = 0 
OR b = 0 OR c = 0 OR d = 0');
+ estimated | actual 
+-----------+--------
+      2649 |   1572
+(1 row)
+
 -- create separate MCV statistics
 CREATE STATISTICS mcv_lists_multi_1 (mcv) ON a, b FROM mcv_lists_multi;
 CREATE STATISTICS mcv_lists_multi_2 (mcv) ON c, d FROM mcv_lists_multi;
@@ -1486,12 +1509,36 @@ SELECT * FROM check_estimated_rows('SELECT * FROM 
mcv_lists_multi WHERE c = 0 AN
        714 |    714
 (1 row)
 
+SELECT * FROM check_estimated_rows('SELECT * FROM mcv_lists_multi WHERE b = 0 
AND c = 0');
+ estimated | actual 
+-----------+--------
+       143 |    142
+(1 row)
+
+SELECT * FROM check_estimated_rows('SELECT * FROM mcv_lists_multi WHERE b = 0 
OR c = 0');
+ estimated | actual 
+-----------+--------
+      1571 |   1572
+(1 row)
+
 SELECT * FROM check_estimated_rows('SELECT * FROM mcv_lists_multi WHERE a = 0 
AND b = 0 AND c = 0 AND d = 0');
  estimated | actual 
 -----------+--------
        143 |    142
 (1 row)
 
+SELECT * FROM check_estimated_rows('SELECT * FROM mcv_lists_multi WHERE (a = 0 
AND b = 0) OR (c = 0 AND d = 0)');
+ estimated | actual 
+-----------+--------
+      1571 |   1572
+(1 row)
+
+SELECT * FROM check_estimated_rows('SELECT * FROM mcv_lists_multi WHERE a = 0 
OR b = 0 OR c = 0 OR d = 0');
+ estimated | actual 
+-----------+--------
+      1571 |   1572
+(1 row)
+
 DROP TABLE mcv_lists_multi;
 -- Permission tests. Users should not be able to see specific data values in
 -- the extended statistics, if they lack permission to see those values in
diff --git a/src/test/regress/sql/stats_ext.sql 
b/src/test/regress/sql/stats_ext.sql
index 897ed3051c..54ceb1f4ee 100644
--- a/src/test/regress/sql/stats_ext.sql
+++ b/src/test/regress/sql/stats_ext.sql
@@ -547,6 +547,8 @@ SELECT * FROM check_estimated_rows('SELECT * FROM mcv_lists 
WHERE 4 >= a AND ''0
 
 SELECT * FROM check_estimated_rows('SELECT * FROM mcv_lists WHERE a = 1 OR b = 
''1'' OR c = 1');
 
+SELECT * FROM check_estimated_rows('SELECT * FROM mcv_lists WHERE a = 1 OR b = 
''1'' OR c = 1 OR d IS NOT NULL');
+
 SELECT * FROM check_estimated_rows('SELECT * FROM mcv_lists WHERE a IN (1, 2, 
51, 52) AND b IN ( ''1'', ''2'')');
 
 SELECT * FROM check_estimated_rows('SELECT * FROM mcv_lists WHERE a IN (1, 2, 
51, 52, NULL) AND b IN ( ''1'', ''2'', NULL)');
@@ -567,9 +569,6 @@ SELECT * FROM check_estimated_rows('SELECT * FROM mcv_lists 
WHERE a < ALL (ARRAY
 
 SELECT * FROM check_estimated_rows('SELECT * FROM mcv_lists WHERE a < ALL 
(ARRAY[4, 5]) AND b IN (''1'', ''2'', NULL, ''3'') AND c > ANY (ARRAY[1, 2, 
NULL, 3])');
 
--- we can't use the statistic for OR clauses that are not fully covered 
(missing 'd' attribute)
-SELECT * FROM check_estimated_rows('SELECT * FROM mcv_lists WHERE a = 1 OR b = 
''1'' OR c = 1 OR d IS NOT NULL');
-
 -- check change of unrelated column type does not reset the MCV statistics
 ALTER TABLE mcv_lists ALTER COLUMN d TYPE VARCHAR(64);
 
@@ -781,7 +780,11 @@ ANALYZE mcv_lists_multi;
 -- estimates without any mcv statistics
 SELECT * FROM check_estimated_rows('SELECT * FROM mcv_lists_multi WHERE a = 0 
AND b = 0');
 SELECT * FROM check_estimated_rows('SELECT * FROM mcv_lists_multi WHERE c = 0 
AND d = 0');
+SELECT * FROM check_estimated_rows('SELECT * FROM mcv_lists_multi WHERE b = 0 
AND c = 0');
+SELECT * FROM check_estimated_rows('SELECT * FROM mcv_lists_multi WHERE b = 0 
OR c = 0');
 SELECT * FROM check_estimated_rows('SELECT * FROM mcv_lists_multi WHERE a = 0 
AND b = 0 AND c = 0 AND d = 0');
+SELECT * FROM check_estimated_rows('SELECT * FROM mcv_lists_multi WHERE (a = 0 
AND b = 0) OR (c = 0 AND d = 0)');
+SELECT * FROM check_estimated_rows('SELECT * FROM mcv_lists_multi WHERE a = 0 
OR b = 0 OR c = 0 OR d = 0');
 
 -- create separate MCV statistics
 CREATE STATISTICS mcv_lists_multi_1 (mcv) ON a, b FROM mcv_lists_multi;
@@ -791,7 +794,11 @@ ANALYZE mcv_lists_multi;
 
 SELECT * FROM check_estimated_rows('SELECT * FROM mcv_lists_multi WHERE a = 0 
AND b = 0');
 SELECT * FROM check_estimated_rows('SELECT * FROM mcv_lists_multi WHERE c = 0 
AND d = 0');
+SELECT * FROM check_estimated_rows('SELECT * FROM mcv_lists_multi WHERE b = 0 
AND c = 0');
+SELECT * FROM check_estimated_rows('SELECT * FROM mcv_lists_multi WHERE b = 0 
OR c = 0');
 SELECT * FROM check_estimated_rows('SELECT * FROM mcv_lists_multi WHERE a = 0 
AND b = 0 AND c = 0 AND d = 0');
+SELECT * FROM check_estimated_rows('SELECT * FROM mcv_lists_multi WHERE (a = 0 
AND b = 0) OR (c = 0 AND d = 0)');
+SELECT * FROM check_estimated_rows('SELECT * FROM mcv_lists_multi WHERE a = 0 
OR b = 0 OR c = 0 OR d = 0');
 
 DROP TABLE mcv_lists_multi;
 
-- 
2.21.1

>From deb998bcbaac39ce90b6ad47b4828e2857ae1a57 Mon Sep 17 00:00:00 2001
From: Tomas Vondra <t...@fuzzy.cz>
Date: Tue, 17 Mar 2020 02:57:12 +0100
Subject: [PATCH 2/2] Support clauses of the form Var op Var

---
 src/backend/statistics/extended_stats.c       | 67 ++++++++++++-----
 src/backend/statistics/mcv.c                  | 71 +++++++++++++++++-
 .../statistics/extended_stats_internal.h      |  2 +-
 src/test/regress/expected/stats_ext.out       | 72 +++++++++++++++++++
 src/test/regress/sql/stats_ext.sql            | 22 ++++++
 5 files changed, 215 insertions(+), 19 deletions(-)

diff --git a/src/backend/statistics/extended_stats.c 
b/src/backend/statistics/extended_stats.c
index 93d67791c5..f8947d9097 100644
--- a/src/backend/statistics/extended_stats.c
+++ b/src/backend/statistics/extended_stats.c
@@ -987,14 +987,18 @@ statext_is_compatible_clause_internal(PlannerInfo *root, 
Node *clause,
        {
                RangeTblEntry *rte = root->simple_rte_array[relid];
                OpExpr     *expr = (OpExpr *) clause;
-               Var                *var;
+               Var                *var,
+                                  *var2;
 
                /* Only expressions with two arguments are considered 
compatible. */
                if (list_length(expr->args) != 2)
                        return false;
 
-               /* Check if the expression has the right shape (one Var, one 
Const) */
-               if (!examine_clause_args(expr->args, &var, NULL, NULL))
+               /*
+                * Check if the expression has the right shape (one Var and one 
Const,
+                * or two Vars).
+                */
+               if (!examine_clause_args(expr->args, &var, &var2, NULL, NULL))
                        return false;
 
                /*
@@ -1034,7 +1038,20 @@ statext_is_compatible_clause_internal(PlannerInfo *root, 
Node *clause,
                        !get_func_leakproof(get_opcode(expr->opno)))
                        return false;
 
-               return statext_is_compatible_clause_internal(root, (Node *) var,
+               /*
+                * Check compatibility of the first Var - we get this one for 
both
+                * types of supported expressions (Var op Const) and (Var op 
Var).
+                */
+               if (!statext_is_compatible_clause_internal(root, (Node *) var,
+                                                                               
                   relid, attnums))
+                       return false;
+
+               /* For (Var op Const) we don't get the second Var, and we're 
done. */
+               if (!var2)
+                       return true;
+
+               /* For (Var op Var) check compatibility of the second Var. */
+               return statext_is_compatible_clause_internal(root, (Node *) 
var2,
                                                                                
                         relid, attnums);
        }
 
@@ -1050,7 +1067,7 @@ statext_is_compatible_clause_internal(PlannerInfo *root, 
Node *clause,
                        return false;
 
                /* Check if the expression has the right shape (one Var, one 
Const) */
-               if (!examine_clause_args(expr->args, &var, NULL, NULL))
+               if (!examine_clause_args(expr->args, &var, NULL, NULL, NULL))
                        return false;
 
                /*
@@ -1456,22 +1473,24 @@ statext_clauselist_selectivity(PlannerInfo *root, List 
*clauses, int varRelid,
 }
 
 /*
- * examine_opclause_expression
+ * examine_clause_args
  *             Split expression into Var and Const parts.
  *
- * Attempts to match the arguments to either (Var op Const) or (Const op Var),
- * possibly with a RelabelType on top. When the expression matches this form,
- * returns true, otherwise returns false.
+ * Attempts to match the arguments to either (Var op Const) or (Const op Var)
+ * or (Var op Var), possibly with a RelabelType on top. When the expression
+ * matches this form, returns true, otherwise returns false.
  *
  * Optionally returns pointers to the extracted Var/Const nodes, when passed
  * non-null pointers (varp, cstp and varonleftp). The varonleftp flag specifies
  * on which side of the operator we found the Var node.
  */
 bool
-examine_clause_args(List *args, Var **varp, Const **cstp, bool *varonleftp)
+examine_clause_args(List *args, Var **var1p, Var **var2p, Const **cstp,
+                                       bool *varonleftp)
 {
-       Var        *var;
-       Const  *cst;
+       Var        *var1 = NULL;
+       Var        *var2 = NULL;
+       Const  *cst = NULL;
        bool    varonleft;
        Node   *leftop,
                   *rightop;
@@ -1491,22 +1510,38 @@ examine_clause_args(List *args, Var **varp, Const 
**cstp, bool *varonleftp)
 
        if (IsA(leftop, Var) && IsA(rightop, Const))
        {
-               var = (Var *) leftop;
+               var1 = (Var *) leftop;
                cst = (Const *) rightop;
                varonleft = true;
        }
        else if (IsA(leftop, Const) && IsA(rightop, Var))
        {
-               var = (Var *) rightop;
+               var1 = (Var *) rightop;
                cst = (Const *) leftop;
                varonleft = false;
        }
+       else if (IsA(leftop, Var) && IsA(rightop, Var))
+       {
+               var1 = (Var *) leftop;
+               var2 = (Var *) rightop;
+               varonleft = false;
+
+               /*
+                * Both variables have to be for the same relation (otherwise 
it's
+                * a join clause, and we don't deal with those yet.
+                */
+               if (var1->varno != var2->varno)
+                       return false;
+       }
        else
                return false;
 
        /* return pointers to the extracted parts if requested */
-       if (varp)
-               *varp = var;
+       if (var1p)
+               *var1p = var1;
+
+       if (var2p)
+               *var2p = var2;
 
        if (cstp)
                *cstp = cst;
diff --git a/src/backend/statistics/mcv.c b/src/backend/statistics/mcv.c
index 343011178b..3e58d36d68 100644
--- a/src/backend/statistics/mcv.c
+++ b/src/backend/statistics/mcv.c
@@ -1581,13 +1581,17 @@ mcv_get_match_bitmap(PlannerInfo *root, List *clauses,
 
                        /* valid only after examine_clause_args returns true */
                        Var                *var;
+                       Var                *var2;
                        Const      *cst;
                        bool            varonleft;
 
                        fmgr_info(get_opcode(expr->opno), &opproc);
 
                        /* extract the var and const from the expression */
-                       if (examine_clause_args(expr->args, &var, &cst, 
&varonleft))
+                       if (!examine_clause_args(expr->args, &var, &var2, &cst, 
&varonleft))
+                               continue;
+
+                       if (cst)        /* Var op Const */
                        {
                                int                     idx;
 
@@ -1651,6 +1655,68 @@ mcv_get_match_bitmap(PlannerInfo *root, List *clauses,
                                        matches[i] = RESULT_MERGE(matches[i], 
is_or, match);
                                }
                        }
+                       else    /* Var op Var */
+                       {
+                               int                     idx;
+                               int                     idx2;
+
+                               Assert(var2);
+
+                               /* match the attribute to a dimension of the 
statistic */
+                               idx = bms_member_index(keys, var->varattno);
+                               idx2 = bms_member_index(keys, var2->varattno);
+
+                               /*
+                                * Walk through the MCV items and evaluate the 
current clause.
+                                * We can skip items that were already ruled 
out, and
+                                * terminate if there are no remaining MCV 
items that might
+                                * possibly match.
+                                */
+                               for (i = 0; i < mcvlist->nitems; i++)
+                               {
+                                       bool            match = true;
+                                       MCVItem    *item = &mcvlist->items[i];
+
+                                       /*
+                                        * When either of the MCV items is NULL 
we can treat this
+                                        * as a mismatch. We must not call the 
operator because
+                                        * of strictness.
+                                        */
+                                       if (item->isnull[idx] || 
item->isnull[idx2])
+                                       {
+                                               matches[i] = 
RESULT_MERGE(matches[i], is_or, false);
+                                               continue;
+                                       }
+
+                                       /*
+                                        * Skip MCV items that can't change 
result in the bitmap.
+                                        * Once the value gets false for 
AND-lists, or true for
+                                        * OR-lists, we don't need to look at 
more clauses.
+                                        */
+                                       if (RESULT_IS_FINAL(matches[i], is_or))
+                                               continue;
+
+                                       /*
+                                        * First check whether the constant is 
below the lower
+                                        * boundary (in that case we can skip 
the bucket, because
+                                        * there's no overlap).
+                                        *
+                                        * We don't store collations used to 
build the statistics,
+                                        * but we can use the collation for the 
attribute itself,
+                                        * as stored in varcollid. We do reset 
the statistics after
+                                        * a type change (including collation 
change), so this is
+                                        * OK. We may need to relax this after 
allowing extended
+                                        * statistics on expressions.
+                                        */
+                                       match = 
DatumGetBool(FunctionCall2Coll(&opproc,
+                                                                               
                                   var->varcollid,
+                                                                               
                                   item->values[idx],
+                                                                               
                                   item->values[idx2]));
+
+                                       /* update the match bitmap with the 
result */
+                                       matches[i] = RESULT_MERGE(matches[i], 
is_or, match);
+                               }
+                       }
                }
                else if (IsA(clause, ScalarArrayOpExpr))
                {
@@ -1665,7 +1731,7 @@ mcv_get_match_bitmap(PlannerInfo *root, List *clauses,
                        fmgr_info(get_opcode(expr->opno), &opproc);
 
                        /* extract the var and const from the expression */
-                       if (examine_clause_args(expr->args, &var, &cst, 
&varonleft))
+                       if (examine_clause_args(expr->args, &var, NULL, &cst, 
&varonleft))
                        {
                                int                     idx;
 
@@ -1679,6 +1745,7 @@ mcv_get_match_bitmap(PlannerInfo *root, List *clauses,
 
                                /* ScalarArrayOpExpr has the Var always on the 
left */
                                Assert(varonleft);
+                               Assert(cst);
 
                                if (!cst->constisnull)
                                {
diff --git a/src/include/statistics/extended_stats_internal.h 
b/src/include/statistics/extended_stats_internal.h
index 6c039a81c2..ee34a897c8 100644
--- a/src/include/statistics/extended_stats_internal.h
+++ b/src/include/statistics/extended_stats_internal.h
@@ -96,7 +96,7 @@ extern SortItem *build_sorted_items(int numrows, int *nitems, 
HeapTuple *rows,
                                                                        
TupleDesc tdesc, MultiSortSupport mss,
                                                                        int 
numattrs, AttrNumber *attnums);
 
-extern bool examine_clause_args(List *args, Var **varp,
+extern bool examine_clause_args(List *args, Var **var1p, Var **var2p,
                                                                Const **cstp, 
bool *varonleftp);
 
 extern Selectivity mcv_clauselist_selectivity(PlannerInfo *root,
diff --git a/src/test/regress/expected/stats_ext.out 
b/src/test/regress/expected/stats_ext.out
index 3aba4a37c6..481be40141 100644
--- a/src/test/regress/expected/stats_ext.out
+++ b/src/test/regress/expected/stats_ext.out
@@ -940,6 +940,18 @@ SELECT * FROM check_estimated_rows('SELECT * FROM 
mcv_lists WHERE a = 1 OR b = '
        343 |    200
 (1 row)
 
+SELECT * FROM check_estimated_rows('SELECT * FROM mcv_lists WHERE a > c');
+ estimated | actual 
+-----------+--------
+      1667 |   3750
+(1 row)
+
+SELECT * FROM check_estimated_rows('SELECT * FROM mcv_lists WHERE a < c');
+ estimated | actual 
+-----------+--------
+      1667 |      0
+(1 row)
+
 SELECT * FROM check_estimated_rows('SELECT * FROM mcv_lists WHERE a IN (1, 2, 
51, 52) AND b IN ( ''1'', ''2'')');
  estimated | actual 
 -----------+--------
@@ -1081,6 +1093,18 @@ SELECT * FROM check_estimated_rows('SELECT * FROM 
mcv_lists WHERE a = 1 OR b = '
        200 |    200
 (1 row)
 
+SELECT * FROM check_estimated_rows('SELECT * FROM mcv_lists WHERE a > c');
+ estimated | actual 
+-----------+--------
+      3750 |   3750
+(1 row)
+
+SELECT * FROM check_estimated_rows('SELECT * FROM mcv_lists WHERE a < c');
+ estimated | actual 
+-----------+--------
+         1 |      0
+(1 row)
+
 SELECT * FROM check_estimated_rows('SELECT * FROM mcv_lists WHERE a IN (1, 2, 
51, 52) AND b IN ( ''1'', ''2'')');
  estimated | actual 
 -----------+--------
@@ -1284,6 +1308,12 @@ SELECT * FROM check_estimated_rows('SELECT * FROM 
mcv_lists WHERE a IS NULL AND
       3750 |   2500
 (1 row)
 
+SELECT * FROM check_estimated_rows('SELECT * FROM mcv_lists WHERE b = d');
+ estimated | actual 
+-----------+--------
+        25 |   2500
+(1 row)
+
 -- create statistics
 CREATE STATISTICS mcv_lists_stats (mcv) ON a, b, d FROM mcv_lists;
 ANALYZE mcv_lists;
@@ -1317,6 +1347,12 @@ SELECT * FROM check_estimated_rows('SELECT * FROM 
mcv_lists WHERE a IS NULL AND
       2500 |   2500
 (1 row)
 
+SELECT * FROM check_estimated_rows('SELECT * FROM mcv_lists WHERE b = d');
+ estimated | actual 
+-----------+--------
+      2500 |   2500
+(1 row)
+
 -- mcv with pass-by-ref fixlen types, e.g. uuid
 CREATE TABLE mcv_lists_uuid (
     a UUID,
@@ -1408,6 +1444,18 @@ SELECT * FROM check_estimated_rows('SELECT * FROM 
mcv_lists_bool WHERE NOT a AND
       1094 |      0
 (1 row)
 
+SELECT * FROM check_estimated_rows('SELECT * FROM mcv_lists_bool WHERE NOT a = 
b');
+ estimated | actual 
+-----------+--------
+      9950 |   2500
+(1 row)
+
+SELECT * FROM check_estimated_rows('SELECT * FROM mcv_lists_bool WHERE NOT a = 
b AND b = c');
+ estimated | actual 
+-----------+--------
+        50 |   2500
+(1 row)
+
 CREATE STATISTICS mcv_lists_bool_stats (mcv) ON a, b, c
   FROM mcv_lists_bool;
 ANALYZE mcv_lists_bool;
@@ -1435,6 +1483,18 @@ SELECT * FROM check_estimated_rows('SELECT * FROM 
mcv_lists_bool WHERE NOT a AND
          1 |      0
 (1 row)
 
+SELECT * FROM check_estimated_rows('SELECT * FROM mcv_lists_bool WHERE NOT a = 
b');
+ estimated | actual 
+-----------+--------
+      2500 |   2500
+(1 row)
+
+SELECT * FROM check_estimated_rows('SELECT * FROM mcv_lists_bool WHERE NOT a = 
b AND b = c');
+ estimated | actual 
+-----------+--------
+      2500 |   2500
+(1 row)
+
 -- check the ability to use multiple MCV lists
 CREATE TABLE mcv_lists_multi (
        a INTEGER,
@@ -1493,6 +1553,12 @@ SELECT * FROM check_estimated_rows('SELECT * FROM 
mcv_lists_multi WHERE a = 0 OR
       2649 |   1572
 (1 row)
 
+SELECT * FROM check_estimated_rows('SELECT * FROM mcv_lists_multi WHERE a = b 
AND c = d');
+ estimated | actual 
+-----------+--------
+         1 |   5000
+(1 row)
+
 -- create separate MCV statistics
 CREATE STATISTICS mcv_lists_multi_1 (mcv) ON a, b FROM mcv_lists_multi;
 CREATE STATISTICS mcv_lists_multi_2 (mcv) ON c, d FROM mcv_lists_multi;
@@ -1539,6 +1605,12 @@ SELECT * FROM check_estimated_rows('SELECT * FROM 
mcv_lists_multi WHERE a = 0 OR
       1571 |   1572
 (1 row)
 
+SELECT * FROM check_estimated_rows('SELECT * FROM mcv_lists_multi WHERE a = b 
AND c = d');
+ estimated | actual 
+-----------+--------
+      5000 |   5000
+(1 row)
+
 DROP TABLE mcv_lists_multi;
 -- Permission tests. Users should not be able to see specific data values in
 -- the extended statistics, if they lack permission to see those values in
diff --git a/src/test/regress/sql/stats_ext.sql 
b/src/test/regress/sql/stats_ext.sql
index 54ceb1f4ee..f5e4468129 100644
--- a/src/test/regress/sql/stats_ext.sql
+++ b/src/test/regress/sql/stats_ext.sql
@@ -498,6 +498,10 @@ SELECT * FROM check_estimated_rows('SELECT * FROM 
mcv_lists WHERE a = 1 OR b = '
 
 SELECT * FROM check_estimated_rows('SELECT * FROM mcv_lists WHERE a = 1 OR b = 
''1'' OR c = 1 OR d IS NOT NULL');
 
+SELECT * FROM check_estimated_rows('SELECT * FROM mcv_lists WHERE a > c');
+
+SELECT * FROM check_estimated_rows('SELECT * FROM mcv_lists WHERE a < c');
+
 SELECT * FROM check_estimated_rows('SELECT * FROM mcv_lists WHERE a IN (1, 2, 
51, 52) AND b IN ( ''1'', ''2'')');
 
 SELECT * FROM check_estimated_rows('SELECT * FROM mcv_lists WHERE a IN (1, 2, 
51, 52, NULL) AND b IN ( ''1'', ''2'', NULL)');
@@ -549,6 +553,10 @@ SELECT * FROM check_estimated_rows('SELECT * FROM 
mcv_lists WHERE a = 1 OR b = '
 
 SELECT * FROM check_estimated_rows('SELECT * FROM mcv_lists WHERE a = 1 OR b = 
''1'' OR c = 1 OR d IS NOT NULL');
 
+SELECT * FROM check_estimated_rows('SELECT * FROM mcv_lists WHERE a > c');
+
+SELECT * FROM check_estimated_rows('SELECT * FROM mcv_lists WHERE a < c');
+
 SELECT * FROM check_estimated_rows('SELECT * FROM mcv_lists WHERE a IN (1, 2, 
51, 52) AND b IN ( ''1'', ''2'')');
 
 SELECT * FROM check_estimated_rows('SELECT * FROM mcv_lists WHERE a IN (1, 2, 
51, 52, NULL) AND b IN ( ''1'', ''2'', NULL)');
@@ -656,6 +664,8 @@ SELECT * FROM check_estimated_rows('SELECT * FROM mcv_lists 
WHERE a = 1 OR b = '
 
 SELECT * FROM check_estimated_rows('SELECT * FROM mcv_lists WHERE a IS NULL 
AND (b = ''x'' OR d = ''x'')');
 
+SELECT * FROM check_estimated_rows('SELECT * FROM mcv_lists WHERE b = d');
+
 -- create statistics
 CREATE STATISTICS mcv_lists_stats (mcv) ON a, b, d FROM mcv_lists;
 
@@ -674,6 +684,8 @@ SELECT * FROM check_estimated_rows('SELECT * FROM mcv_lists 
WHERE a = 1 OR b = '
 
 SELECT * FROM check_estimated_rows('SELECT * FROM mcv_lists WHERE a IS NULL 
AND (b = ''x'' OR d = ''x'')');
 
+SELECT * FROM check_estimated_rows('SELECT * FROM mcv_lists WHERE b = d');
+
 -- mcv with pass-by-ref fixlen types, e.g. uuid
 CREATE TABLE mcv_lists_uuid (
     a UUID,
@@ -746,6 +758,10 @@ SELECT * FROM check_estimated_rows('SELECT * FROM 
mcv_lists_bool WHERE NOT a AND
 
 SELECT * FROM check_estimated_rows('SELECT * FROM mcv_lists_bool WHERE NOT a 
AND b AND NOT c');
 
+SELECT * FROM check_estimated_rows('SELECT * FROM mcv_lists_bool WHERE NOT a = 
b');
+
+SELECT * FROM check_estimated_rows('SELECT * FROM mcv_lists_bool WHERE NOT a = 
b AND b = c');
+
 CREATE STATISTICS mcv_lists_bool_stats (mcv) ON a, b, c
   FROM mcv_lists_bool;
 
@@ -759,6 +775,10 @@ SELECT * FROM check_estimated_rows('SELECT * FROM 
mcv_lists_bool WHERE NOT a AND
 
 SELECT * FROM check_estimated_rows('SELECT * FROM mcv_lists_bool WHERE NOT a 
AND b AND NOT c');
 
+SELECT * FROM check_estimated_rows('SELECT * FROM mcv_lists_bool WHERE NOT a = 
b');
+
+SELECT * FROM check_estimated_rows('SELECT * FROM mcv_lists_bool WHERE NOT a = 
b AND b = c');
+
 -- check the ability to use multiple MCV lists
 CREATE TABLE mcv_lists_multi (
        a INTEGER,
@@ -785,6 +805,7 @@ SELECT * FROM check_estimated_rows('SELECT * FROM 
mcv_lists_multi WHERE b = 0 OR
 SELECT * FROM check_estimated_rows('SELECT * FROM mcv_lists_multi WHERE a = 0 
AND b = 0 AND c = 0 AND d = 0');
 SELECT * FROM check_estimated_rows('SELECT * FROM mcv_lists_multi WHERE (a = 0 
AND b = 0) OR (c = 0 AND d = 0)');
 SELECT * FROM check_estimated_rows('SELECT * FROM mcv_lists_multi WHERE a = 0 
OR b = 0 OR c = 0 OR d = 0');
+SELECT * FROM check_estimated_rows('SELECT * FROM mcv_lists_multi WHERE a = b 
AND c = d');
 
 -- create separate MCV statistics
 CREATE STATISTICS mcv_lists_multi_1 (mcv) ON a, b FROM mcv_lists_multi;
@@ -799,6 +820,7 @@ SELECT * FROM check_estimated_rows('SELECT * FROM 
mcv_lists_multi WHERE b = 0 OR
 SELECT * FROM check_estimated_rows('SELECT * FROM mcv_lists_multi WHERE a = 0 
AND b = 0 AND c = 0 AND d = 0');
 SELECT * FROM check_estimated_rows('SELECT * FROM mcv_lists_multi WHERE (a = 0 
AND b = 0) OR (c = 0 AND d = 0)');
 SELECT * FROM check_estimated_rows('SELECT * FROM mcv_lists_multi WHERE a = 0 
OR b = 0 OR c = 0 OR d = 0');
+SELECT * FROM check_estimated_rows('SELECT * FROM mcv_lists_multi WHERE a = b 
AND c = d');
 
 DROP TABLE mcv_lists_multi;
 
-- 
2.21.1

Reply via email to