Recently, a user reported that running an explain for a query joining many wide tables taking more than 1 minute to complete. Running the query without explain takes only a few seconds.
Further research showed that this is similar to a report from 2018 [1]. colname_is_unique is used to assign unique column names during deparse and this has O(N^2) behavior. The good news is this behavior was optimized with commit [2] with the help of a hash table. While the main stated purpose of this commit was to improved eparse for views and rules, it also significantly improved EXPLAIN which goes through the same routine via ExplainPrintPlan and deparse_context_for_plan_tree. Looking further into this improvement, I started to question if it is necessary to make columns unique for EXPLAIN purposes? An experimental patch ( attached ) to skip making columns unique for EXPLAIN breaks only one test case for columns with default column names from a function [3]. I have not had success finding other cases that break. src/test/regress/results/rangefuncs.out @@ -2130,10 +2130,10 @@ explain (verbose, costs off) select * from testrngfunc(); - QUERY PLAN ----------------------------------------------------------- + QUERY PLAN +-------------------------------------------------------- Subquery Scan on "*SELECT*" - Output: "*SELECT*"."?column?", "*SELECT*"."?column?_1" + Output: "*SELECT*"."?column?", "*SELECT*"."?column?" -> Unique Output: (1), (2) -> Sort While the performance impact will likely only be noticeable on rare use cases, removing the make columns unique work appears to be a good idea overall. There are probably a few good ways to fix the broken case above. running the attached measure.sql query involving 60 tables with 150 columns # 17.2 without commit [2] Time: 58950.505 ms Time: 472.912 ms ## HEAD with commit [2] Time: 1229.704 ms Time: 473.494 ms ## without uniquifying ( attached experimental patch ) Time: 499.185 ms Time: 473.118 ms [1] https://www.postgresql.org/message-id/flat/1537818224423-0.post%40n3.nabble.com#5d23ed9ab9cb5ed45c79352141fa3e79 [2] https://github.com/postgres/postgres/commit/52c707483ce4d0161127e4958d981d1b5655865e [3] https://github.com/postgres/postgres/blob/master/src/test/regress/sql/rangefuncs.sql#L586-L588 Regards, Sami Imseih Amazon Web Services (AWS)
measure.sql
Description: Binary data
experiment-no-unique-column-deparse.patch
Description: Binary data