I found yet another trick, which actually seems to be slightly faster than the 
plpgsql version.

The trick is to use VIEW ... WITH (security_barrier) to tell the optimizer it 
shouldn’t flatten the subqueries.

CREATE TABLE eastern (year integer);

INSERT INTO eastern (year) SELECT generate_series(1,100000);

CREATE VIEW v0 WITH (security_barrier) AS (
  SELECT
    year,
    year % 19 AS g,
    year / 100 AS c
  FROM eastern
);
CREATE VIEW v1 WITH (security_barrier) AS (
  SELECT
    year,
    g,
    c,
    (c - c/4 - (8*c + 13)/25 + 19*g + 15) % 30 AS h
  FROM v0
);
CREATE VIEW v2 WITH (security_barrier) AS (
  SELECT
    year,
    c,
    h - (h/28)*(1 - (h/28)*(29/(h + 1))*((21 - g)/11)) AS i
  FROM v1
);
CREATE VIEW v3 WITH (security_barrier) AS (
  SELECT
    year,
    i,
    (year + year/4 + i + 2 - c + c/4) % 7 AS j
  FROM v2
);
CREATE VIEW v4 WITH (security_barrier) AS (
  SELECT
    year,
    i - j AS p
  FROM v3
);
CREATE VIEW v5 WITH (security_barrier) AS (
  SELECT
    year,
    3 + (p + 26)/30 AS easter_month,
    1 + (p + 27 + (p + 6)/40) % 31 AS easter_day
  FROM v4
);

EXPLAIN ANALYZE VERBOSE
SELECT make_date(year, easter_month, easter_day) FROM v5;

                                                                        QUERY 
PLAN
-----------------------------------------------------------------------------------------------------------------------------------------------------------
Subquery Scan on v5  (cost=0.00..17943.00 rows=100000 width=4) (actual 
time=0.025..213.996 rows=100000 loops=1)
   Output: make_date(v5.year, v5.easter_month, v5.easter_day)
   ->  Subquery Scan on v4  (cost=0.00..16693.00 rows=100000 width=12) (actual 
time=0.024..191.448 rows=100000 loops=1)
         Output: v4.year, (3 + ((v4.p + 26) / 30)), (1 + (((v4.p + 27) + ((v4.p 
+ 6) / 40)) % 31))
         ->  Subquery Scan on v3  (cost=0.00..13443.00 rows=100000 width=8) 
(actual time=0.022..155.690 rows=100000 loops=1)
               Output: v3.year, (v3.i - v3.j)
               ->  Subquery Scan on v2  (cost=0.00..12193.00 rows=100000 
width=12) (actual time=0.021..135.485 rows=100000 loops=1)
                     Output: v2.year, v2.i, ((((((v2.year + (v2.year / 4)) + 
v2.i) + 2) - v2.c) + (v2.c / 4)) % 7)
                     ->  Subquery Scan on v1  (cost=0.00..9193.00 rows=100000 
width=12) (actual time=0.019..97.935 rows=100000 loops=1)
                           Output: v1.year, v1.c, (v1.h - ((v1.h / 28) * (1 - 
(((v1.h / 28) * (29 / (v1.h + 1))) * ((21 - v1.g) / 11)))))
                           ->  Subquery Scan on v0  (cost=0.00..5443.00 
rows=100000 width=16) (actual time=0.017..57.988 rows=100000 loops=1)
                                 Output: v0.year, v0.g, v0.c, (((((v0.c - (v0.c 
/ 4)) - (((8 * v0.c) + 13) / 25)) + (19 * v0.g)) + 15) % 30)
                                 ->  Seq Scan on public.eastern  
(cost=0.00..1943.00 rows=100000 width=12) (actual time=0.015..23.908 
rows=100000 loops=1)
                                       Output: eastern.year, (eastern.year % 
19), (eastern.year / 100)
Planning Time: 0.274 ms
Execution Time: 220.698 ms
(16 rows)

EXPLAIN ANALYZE VERBOSE
SELECT easter_plpgsql(year) FROM eastern;
                                                       QUERY PLAN
------------------------------------------------------------------------------------------------------------------------
Seq Scan on public.eastern  (cost=0.00..26443.00 rows=100000 width=4) (actual 
time=0.077..301.519 rows=100000 loops=1)
   Output: easter_plpgsql(year)
Planning Time: 0.049 ms
Execution Time: 309.119 ms
(4 rows)


On Tue, Dec 22, 2020, at 17:32, Tom Lane wrote:
> "Joel Jacobson" <j...@compiler.org> writes:
> > I think I was a bit unclear about my problem, and might have used the wrong 
> > terminology.
> > In my LATERAL query, there are calculations in a certain order.
> > For each step, "columns" are computed named e.g. "g", "c", "h", "i", etc.
> > However, when looking at the query plan, these steps are gone, and instead 
> > there is just one huge fully expanded expression, which doesn't look very 
> > efficient.
> 
> Yeah, this isn't really about function inlining, it's about subquery
> flattening (which is similar in some ways, but not the same thing).
> 
> Unfortunately, subquery flattening happens early enough in the planner
> that there's no chance of making any useful cost comparisons to decide
> whether to do it or not.  So we just do it unconditionally.  I'm
> not really sure that failing to do it would provide a better outcome
> in this situation anyway --- sure, you'd save a few scalar calculations,
> but the overhead of running additional plan nodes could outweigh that.
> 
> The long and the short of it is that SQL isn't terribly well suited to
> execute a fundamentally stepwise, imperative algorithm like this one.
> Rather than hacking up cute tricks with LATERAL, you should just use
> a language that *is* well suited.  That's why we provide PLs.
> 
> FWIW, another trick for inserting optimization fences is WITH.
> So you could do something like
> 
> WITH Q1(g,c) AS MATERIALIZED
>   (SELECT year % 19, year / 100),
> Q2(h) AS MATERIALIZED
>   (SELECT (c - c/4 - (8*c + 13)/25 + 19*g + 15) % 30 FROM Q1),
> ...
> SELECT make_date(year, easter_month, easter_day) FROM Q6;
> 
> But I'd bet lunch that that won't be faster for this example,
> because there's a lot of overhead in CTEs.
> 
> regards, tom lane
> 

Kind regards,

Joel

Reply via email to