Backport "WITH ... AS MATERIALIZED" syntax to <12?

2019-10-18 Thread Colin Watson
I've been struggling with how we're going to upgrade launchpad.net to
PostgreSQL 12 or newer (we're currently on 10).  We're one of those
applications that deliberately uses CTEs as optimization fences in a few
difficult places.  The provision of the MATERIALIZED keyword in 12 is
great, but the fact that it doesn't exist in earlier versions is
awkward.  We certainly don't want to upgrade our application code at the
same time as upgrading the database, and dealing with performance
degradation between the database upgrade and the application upgrade
doesn't seem great either (not to mention that it would be hard to
coordinate).  That leaves us with hacking our query compiler to add the
MATERIALIZED keyword only if it's running on 12 or newer, which would be
possible but is pretty cumbersome.

However, an alternative would be to backport the new syntax to some
earlier versions.  "WITH ... AS MATERIALIZED" can easily just be
synonymous with "WITH ... AS" in versions prior to 12; there's no need
to support "NOT MATERIALIZED" since that's explicitly requesting the new
query-folding feature that only exists in 12.  Would something like the
attached patch against REL_11_STABLE be acceptable?  I'd like to
backpatch it at least as far as PostgreSQL 10.

This compiles and passes regression tests.

Thanks,

-- 
Colin Watson[cjwat...@canonical.com]
>From 063186eb678ad9831961d6319f7a4279f1029358 Mon Sep 17 00:00:00 2001
From: Colin Watson 
Date: Fri, 18 Oct 2019 14:08:11 +0100
Subject: [PATCH] Backport "WITH ... AS MATERIALIZED" syntax

Applications that deliberately use CTEs as optimization fences need to
adjust their code to prepare for PostgreSQL 12.  Unfortunately, the
MATERIALIZED keyword that they need to add isn't valid syntax in earlier
versions of PostgreSQL, so they're stuck with either upgrading the
application and the database simultaneously, accepting performance
degradation between the two parts of the upgrade, or doing complex query
compiler work to add MATERIALIZED conditionally.

It makes things much easier in these cases if the MATERIALIZED keyword
is accepted and ignored in earlier releases.  Users can then upgrade to
a suitable point release, change their application code to add
MATERIALIZED, and then upgrade to PostgreSQL 12.
---
 doc/src/sgml/queries.sgml   | 12 ++
 doc/src/sgml/ref/select.sgml| 18 +-
 src/backend/parser/gram.y   | 12 +++---
 src/test/regress/expected/subselect.out | 31 +
 src/test/regress/sql/subselect.sql  | 14 +++
 5 files changed, 83 insertions(+), 4 deletions(-)

diff --git a/doc/src/sgml/queries.sgml b/doc/src/sgml/queries.sgml
index 88bc189646..cc33d92133 100644
--- a/doc/src/sgml/queries.sgml
+++ b/doc/src/sgml/queries.sgml
@@ -2215,6 +2215,18 @@ SELECT n FROM t LIMIT 100;
rows.)
   
 
+  
+   In some cases, PostgreSQL 12 folds
+   WITH queries into the parent query, allowing joint
+   optimization of the two query levels.  You can override that decision by
+   specifying MATERIALIZED to force separate calculation
+   of the WITH query.  While versions of
+   PostgreSQL before 12 do not support folding of
+   WITH queries, specifying
+   MATERIALIZED is permitted to ease application
+   upgrades.
+  
+
   
The examples above only show WITH being used with
SELECT, but it can be attached in the same way to
diff --git a/doc/src/sgml/ref/select.sgml b/doc/src/sgml/ref/select.sgml
index 4db8142afa..1bd711a3cb 100644
--- a/doc/src/sgml/ref/select.sgml
+++ b/doc/src/sgml/ref/select.sgml
@@ -72,7 +72,7 @@ SELECT [ ALL | DISTINCT [ ON ( expressionand with_query is:
 
-with_query_name [ ( column_name [, ...] ) ] AS ( select | values | insert | update | delete )
+with_query_name [ ( column_name [, ...] ) ] AS [ MATERIALIZED ] ( select | values | insert | update | delete )
 
 TABLE [ ONLY ] table_name [ * ]
 
@@ -290,6 +290,17 @@ TABLE [ ONLY ] table_name [ * ]
 row, the results are unspecified.

 
+   
+PostgreSQL 12 folds side-effect-free
+WITH queries into the primary query in some cases.
+To override this and retain the behaviour up to
+PostgreSQL 11, mark the
+WITH query as MATERIALIZED.  That
+might be useful, for example, if the WITH query is
+being used as an optimization fence to prevent the planner from choosing
+a bad plan.
+   
+

 See  for additional information.

@@ -2087,6 +2098,11 @@ SELECT distributors.* WHERE distributors.name = 'Westward';

 ROWS FROM( ... ) is an extension of the SQL standard.

+
+   
+The MATERIALIZED option of WITH is
+an extension of the SQL standard.
+   
   
 
  
diff --git a/src/backend/parser/gram.y b/src/backend/parser/gram.y
index bc65319c2c..70df09f409 100644
--- a/src/backend/parser/gram.y
+++ b/src/ba

Re: Backport "WITH ... AS MATERIALIZED" syntax to <12?

2019-10-19 Thread Colin Watson
On Sat, Oct 19, 2019 at 05:01:04AM +0100, Andrew Gierth wrote:
> >>>>> "Michael" == Michael Paquier  writes:
>  > On Fri, Oct 18, 2019 at 02:21:30PM +0100, Colin Watson wrote:
>  >> However, an alternative would be to backport the new syntax to some
>  >> earlier versions. "WITH ... AS MATERIALIZED" can easily just be
>  >> synonymous with "WITH ... AS" in versions prior to 12; there's no
>  >> need to support "NOT MATERIALIZED" since that's explicitly
>  >> requesting the new query-folding feature that only exists in 12.
>  >> Would something like the attached patch against REL_11_STABLE be
>  >> acceptable? I'd like to backpatch it at least as far as PostgreSQL
>  >> 10.
> 
>  Michael> I am afraid that new features don't gain a backpatch. This is
>  Michael> a project policy. Back-branches should just include bug fixes.
> 
> I do think an argument can be made for making an exception in this
> particular case. This wouldn't be backpatching a feature, just accepting
> and ignoring some of the new syntax to make upgrading easier.

Right, this is my position too.  I'm explicitly not asking for
backpatching of the CTE-inlining feature, just trying to cope with the
fact that we now have to spell some particular queries differently to
retain the performance characteristics we need for them.

I suppose an alternative would be to add a configuration option to 12
that allows disabling inlining of CTEs cluster-wide: we could then
upgrade to 12 with inlining disabled, add MATERIALIZED to the relevant
queries, and then re-enable inlining.  But I like that less because it
would end up leaving cruft around in PostgreSQL's configuration code
somewhat indefinitely for the sake of an edge case in upgrading to a
particular version.

-- 
Colin Watson[cjwat...@canonical.com]