improve EXPLAIN for wide tables

Sami Imseih Mon, 16 Dec 2024 08:08:16 -0800

Recently, a user reported that running an explain for a query joining
many wide tables taking more than 1 minute to complete. Running the
query without explain takes only a few seconds.


Further research showed that this is similar to a report from
2018 [1]. colname_is_unique is used to assign unique
column names during deparse and this has O(N^2) behavior.

The good news is this behavior was optimized with commit [2] with the
help of a hash table. While the main stated purpose of this commit was to
improved eparse for views and rules, it also significantly improved
EXPLAIN which goes through the same routine via ExplainPrintPlan and
deparse_context_for_plan_tree.

Looking further into this improvement, I started to question if it is
necessary to make columns unique for EXPLAIN purposes?

An experimental patch ( attached ) to skip making columns unique
for EXPLAIN breaks only one test case for columns with default
column names from a function [3]. I have not had success
finding other cases that break.

src/test/regress/results/rangefuncs.out
@@ -2130,10 +2130,10 @@

 explain (verbose, costs off)
 select * from testrngfunc();
-                        QUERY PLAN
-----------------------------------------------------------
+                       QUERY PLAN
+--------------------------------------------------------
  Subquery Scan on "*SELECT*"
-   Output: "*SELECT*"."?column?", "*SELECT*"."?column?_1"
+   Output: "*SELECT*"."?column?", "*SELECT*"."?column?"
    ->  Unique
          Output: (1), (2)
          ->  Sort

While the performance impact will likely only be noticeable
on rare use cases, removing the make columns unique
work appears to be a good idea overall. There are probably
a few good ways to fix the broken case above.

running the attached measure.sql
query involving 60 tables with 150 columns

# 17.2 without commit [2]
Time: 58950.505 ms
Time: 472.912 ms

## HEAD with commit [2]
Time: 1229.704 ms
Time: 473.494 ms

## without uniquifying ( attached experimental patch )
Time: 499.185 ms
Time: 473.118 ms

[1] 
https://www.postgresql.org/message-id/flat/1537818224423-0.post%40n3.nabble.com#5d23ed9ab9cb5ed45c79352141fa3e79
[2] 
https://github.com/postgres/postgres/commit/52c707483ce4d0161127e4958d981d1b5655865e
[3] 
https://github.com/postgres/postgres/blob/master/src/test/regress/sql/rangefuncs.sql#L586-L588

Regards,

Sami Imseih
Amazon Web Services (AWS)

measure.sql
Description: Binary data

experiment-no-unique-column-deparse.patch
Description: Binary data

improve EXPLAIN for wide tables

Reply via email to