Following up on some complaints we've had about extremely large plan times for
large partitioned tables, I've been doing some profiling. I've constructed a
situation where it takes upwards of 30 seconds to plan a simple query like:

postgres=# explain select * from a where i between 999 and 1001;
                              QUERY PLAN                              
----------------------------------------------------------------------
 Result  (cost=0.00..8.63 rows=58 width=108)
   ->  Append  (cost=0.00..8.63 rows=58 width=108)
         ->  Seq Scan on a  (cost=0.00..1.01 rows=1 width=108)
               Filter: ((i >= 999) AND (i <= 1001))
         ->  Seq Scan on a997 a  (cost=0.00..1.27 rows=1 width=108)
               Filter: ((i >= 999) AND (i <= 1001))
         ->  Seq Scan on a998 a  (cost=0.00..1.27 rows=1 width=108)
               Filter: ((i >= 999) AND (i <= 1001))
         ->  Seq Scan on a999 a  (cost=0.00..1.27 rows=18 width=108)
               Filter: ((i >= 999) AND (i <= 1001))
         ->  Seq Scan on a1000 a  (cost=0.00..1.27 rows=18 width=108)
               Filter: ((i >= 999) AND (i <= 1001))
         ->  Seq Scan on a1001 a  (cost=0.00..1.27 rows=18 width=108)
               Filter: ((i >= 999) AND (i <= 1001))
         ->  Seq Scan on a1002 a  (cost=0.00..1.27 rows=1 width=108)
               Filter: ((i >= 999) AND (i <= 1001))
(16 rows)

Time: 46324.627 ms

Table "a" is a parent table with 2,000 partitions each of which have 102
columns, two of which are covered by constraints of the form "WHERE i BETWEEN
90 AND 110".

The gprof output is pretty damning:

Each sample counts as 0.01 seconds.
  %   cumulative   self              self     total           
 time   seconds   seconds    calls   s/call   s/call  name    
 62.58    203.76   203.76  21474937     0.00     0.00  SearchCatCache
  9.01    233.08    29.32   1923789     0.00     0.00  list_nth_cell
  5.34    250.46    17.38   2386465     0.00     0.00  pgstat_initstats
  2.80    259.57     9.11 235691652     0.00     0.00  AllocSetAlloc
  1.95    265.91     6.34 219852840     0.00     0.00  nocachegetattr
  1.19    269.78     3.87 256569078     0.00     0.00  FunctionCall2
  0.74    272.18     2.40 107923848     0.00     0.00  MemoryContextAllocZeroAligned


The SearchCatCache here is the one in get_attavgwidth, which is called to
estimate the relation width. There are about 200k attributes being measured
here, but I'm not clear why that's causing 21M calls.
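
For context, each get_attavgwidth call is a single probe of the STATRELATT
syscache, logically equivalent to this lookup (the relation and column number
here are just example values):

select stawidth
  from pg_statistic
 where starelid = 'a1000'::regclass
   and staattnum = 1;

With 2,000 partitions of 102 columns that's only about 204k probes, two
orders of magnitude short of the 21M SearchCatCache calls in the profile.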

The first thing that comes to mind is that we're running the
constraint_exclusion code *after* estimating the width of the relations we're
going to exclude. If we push the constraint exclusion up a few lines, the
planning time goes down to 1.7s.

I think there's still a problem here with some kind of n^2 behaviour for
appends of very wide tables, but I haven't quite nailed it down yet. In any
case, is there any reason not to make the following small change, moving the
constraint exclusion test ahead of the size estimates and index checks and
potentially saving ourselves a lot of work?

Index: allpaths.c
===================================================================
RCS file: /home/stark/src/REPOSITORY/pgsql/src/backend/optimizer/path/allpaths.c,v
retrieving revision 1.161
diff -c -r1.161 allpaths.c
*** allpaths.c  22 Feb 2007 22:00:23 -0000      1.161
--- allpaths.c  20 Apr 2007 18:12:40 -0000
***************
*** 196,215 ****
  static void
  set_plain_rel_pathlist(PlannerInfo *root, RelOptInfo *rel, RangeTblEntry *rte)
  {
-       /* Mark rel with estimated output rows, width, etc */
-       set_baserel_size_estimates(root, rel);
- 
-       /* Test any partial indexes of rel for applicability */
-       check_partial_indexes(root, rel);
- 
-       /*
-        * Check to see if we can extract any restriction conditions from join
-        * quals that are OR-of-AND structures.  If so, add them to the rel's
-        * restriction list, and recompute the size estimates.
-        */
-       if (create_or_index_quals(root, rel))
-               set_baserel_size_estimates(root, rel);
- 
        /*
         * If we can prove we don't need to scan the rel via constraint exclusion,
         * set up a single dummy path for it.  (Rather than inventing a special
--- 196,201 ----
***************
*** 228,233 ****
--- 214,233 ----
                return;
        }
  
+       /* Mark rel with estimated output rows, width, etc */
+       set_baserel_size_estimates(root, rel);
+ 
+       /* Test any partial indexes of rel for applicability */
+       check_partial_indexes(root, rel);
+ 
+       /*
+        * Check to see if we can extract any restriction conditions from join
+        * quals that are OR-of-AND structures.  If so, add them to the rel's
+        * restriction list, and recompute the size estimates.
+        */
+       if (create_or_index_quals(root, rel))
+               set_baserel_size_estimates(root, rel);
+ 
        /*
         * Generate paths and add them to the rel's pathlist.
         *




-- 
  Gregory Stark
  EnterpriseDB          http://www.enterprisedb.com

