Thanks David! Setting jit to 0:
GroupAggregate  (cost=401037.82..503755.82 rows=1467400 width=124) (actual time=56.603..68.908 rows=4670 loops=1)
  Group Key: t.device_id, t.date, t.start_time, t.end_time, (timerange(((t_1.value ->> 0))::time without time zone, ((t_1.value ->> 1))::time without time zone))
  Buffers: shared hit=3067
  ->  Sort  (cost=401037.82..404706.32 rows=1467400 width=96) (actual time=56.583..57.847 rows=22908 loops=1)
        Sort Key: t.device_id, t.date, t.start_time, t.end_time, (timerange(((t_1.value ->> 0))::time without time zone, ((t_1.value ->> 1))::time without time zone)), t.rank
        Sort Method: quicksort  Memory: 3229kB
        Buffers: shared hit=3067
        ->  Nested Loop  (cost=2.99..100268.62 rows=1467400 width=96) (actual time=0.076..31.219 rows=22908 loops=1)
              Buffers: shared hit=3067
              ->  Nested Loop  (cost=2.98..55962.20 rows=14674 width=64) (actual time=0.055..6.589 rows=11454 loops=1)
                    Buffers: shared hit=3067
                    ->  Nested Loop  (cost=2.54..39.31 rows=500 width=40) (actual time=0.041..0.047 rows=1 loops=1)
                          Buffers: shared hit=3
                          ->  Index Scan using campaigns_pkey on campaigns c  (cost=0.28..8.30 rows=1 width=355) (actual time=0.011..0.012 rows=1 loops=1)
                                Index Cond: (id = 11870)
                                Buffers: shared hit=3
                          ->  Hash Join  (cost=2.26..26.01 rows=500 width=40) (actual time=0.028..0.031 rows=1 loops=1)
                                Hash Cond: (EXTRACT(dow FROM d.date) = ((j.dow)::integer)::numeric)
                                ->  Function Scan on generate_series d  (cost=0.01..10.01 rows=1000 width=8) (actual time=0.007..0.008 rows=1 loops=1)
                                ->  Hash  (cost=1.00..1.00 rows=100 width=64) (actual time=0.013..0.014 rows=7 loops=1)
                                      Buckets: 1024  Batches: 1  Memory Usage: 9kB
                                      ->  Function Scan on jsonb_each j  (cost=0.00..1.00 rows=100 width=64) (actual time=0.007..0.008 rows=7 loops=1)
                    ->  Index Scan using device_timeslots_date_index on device_timeslots t  (cost=0.43..111.56 rows=29 width=32) (actual time=0.013..4.282 rows=11454 loops=1)
                          Index Cond: ((date = d.date) AND (date >= '2024-04-26'::date) AND (date <= '2024-04-26'::date))
                          Buffers: shared hit=3064
              ->  Memoize  (cost=0.01..1.01 rows=100 width=32) (actual time=0.000..0.000 rows=2 loops=11454)
                    Cache Key: j.times
                    Cache Mode: binary
                    Hits: 11453  Misses: 1  Evictions: 0  Overflows: 0  Memory Usage: 1kB
                    ->  Function Scan on jsonb_array_elements t_1  (cost=0.00..1.00 rows=100 width=32) (actual time=0.002..0.002 rows=2 loops=1)
Planning:
  Buffers: shared hit=4
Planning Time: 0.512 ms
Execution Time: 69.476 ms

So, as far as I understand (correct me where I'm wrong):

1) EXPLAIN (ANALYZE, BUFFERS) shows the JIT timing, which plain EXPLAIN did not. Thanks! I also found https://www.postgresql.org/docs/current/jit-decision.html#JIT-DECISION

2) The subquery is overestimated in the Nested Loop: estimated 500 rows, actual 1. The error is then amplified by its outer Nested Loop:

->  Nested Loop  (cost=2.99..100268.62 rows=*1467400* width=96) (actual time=0.067..27.102 rows=*22908* loops=1)
      Buffers: shared hit=3067
      ->  Nested Loop  (cost=2.98..55962.20 rows=14674 width=64) (actual time=0.055..5.750 rows=11454 loops=1)
            Buffers: shared hit=3067
            ->  Nested Loop  (cost=2.54..39.31 rows=*500* width=40) (actual time=0.039..0.044 rows=*1* loops=1)

The scan of the temp table was overestimated too, but that is mitigated by the Merge Join:

->  Merge Join  (cost=67127.99..67631.11 rows=*33165* width=64) (actual time=7.642..13.455 rows=*22908* loops=1)
      Merge Cond: (z.date = t.date)
      Buffers: shared hit=3064, local hit=1
      ->  Sort  (cost=78.60..81.43 rows=1130 width=40) (actual time=0.011..0.012 rows=2 loops=1)
            Sort Key: z.date
            Sort Method: quicksort  Memory: 25kB
            Buffers: local hit=1
            ->  Seq Scan on z11870 z  (cost=0.00..21.30 rows=*1130* width=40) (actual time=0.006..0.007 rows=*2* loops=1)
                  Buffers: local hit=1

The cost estimation is better in (B) than in (A), even though the estimated rows of z11870 are off.

So I have these options:
O1) set jit=0
O2) use a temp table
O3) rework the schema design (a variation on O2)

Thanks!
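For the archives, the options above could be tried along these lines. This is only a sketch: the cost thresholds are placeholder values to be tuned against the ~500k estimated cost seen in plan (A), and the temp-table query is illustrative, not the real one.

```sql
-- O1: disable JIT for the current session only
SET jit = 0;

-- Variant of O1, per David's suggestion: keep JIT enabled but raise the
-- thresholds above this query's estimated cost, so JIT only triggers for
-- genuinely expensive plans.  ALTER SYSTEM is an alternative to editing
-- postgresql.conf by hand; the numbers below are placeholders.
ALTER SYSTEM SET jit_above_cost = 1000000;
ALTER SYSTEM SET jit_inline_above_cost = 2000000;
ALTER SYSTEM SET jit_optimize_above_cost = 2000000;
SELECT pg_reload_conf();  -- reload the config afterwards

-- O2: materialize the overestimated subquery into a temp table and
-- ANALYZE it, so the planner sees real row counts instead of the
-- 1130-row default seen on z11870.  The SELECT is a placeholder.
CREATE TEMP TABLE z11870 AS SELECT /* the overestimated subquery */ ... ;
ANALYZE z11870;
```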
On Fri, Apr 12, 2024 at 1:40 PM David Rowley <dgrowle...@gmail.com> wrote:
> On Fri, 12 Apr 2024 at 23:27, Thierry Henrio <thierry.hen...@gmail.com> wrote:
> > JIT:
> >   Functions: 36
> >   Options: Inlining true, Optimization true, Expressions true, Deforming true
> >   Timing: Generation 1.949 ms, Inlining 28.891 ms, Optimization 207.481 ms, Emission 134.907 ms, Total 373.228 ms
> > Execution Time: 429.037 ms
>
> It looks very much like the majority of the extra time is being spent
> doing JIT compilation. This triggers for plan A but not plan B. You
> can see from:
>
> > GroupAggregate  (cost=401037.82..503755.82 rows=1467400 width=124) (actual time=416.851..426.534 rows=4670 loops=1)
>
> that the top-level row estimates are off. This makes the estimated
> cost higher than it actually is. The planner opts to have tuple
> deforming and expression evaluation JIT compiled to try to speed up
> the plan thinking it's worthwhile. It's not in this case.
>
> You can switch JIT off to try without with:
>
> SET jit=0;
>
> You might want to consider editing postgresql.conf and raising the
> jit_above_cost, jit_inline_above_cost and jit_optimize_above_cost
> values to some higher value or disable JIT completely.
>
> SELECT pg_reload_conf(); -- to reload the config file afterwards.
>
> David