On 2024-06-17 17:24, Tom Lane wrote:
Julius Tuskenis<julius.tuske...@gmail.com> writes:
EXPLAIN (ANALYZE, BUFFERS, VERBOSE)
SELECT
COALESCE(sum(mok_nepadengta), 0)
FROM
public.b_pardavimai
JOIN public.b_mokejimai ON (mok_pardavimas = pard_id)
WHERE
(pard_tipas = ANY('{1, 2, 6, 7}'))
AND (mok_saskaita = 7141968)
I believe that the SQL-language function executor always uses generic
plans for parameterized queries (which is bad, but nobody's gotten
round to improving it). So the above is a poor way of investigating
what will happen, because it corresponds to a custom plan for the
value 7141968. You should try something like
PREPARE p(integer) AS
SELECT COALESCE ...
... AND (mok_saskaita = $1);
SET plan_cache_mode TO force_generic_plan;
EXPLAIN ANALYZE EXECUTE p(7141968);
What I suspect is that the statistics for mok_saskaita are
highly skewed and so with a generic plan the planner will
not risk using a plan that depends on the parameter value
being infrequent, as the one you're showing does.
regards, tom lane
Thank you Tom Lane, for pointing the problem.
In deed, after setting plan_cache_mode to force_generic_plan I see very
different plan:
```
"Finalize Aggregate (cost=6901.01..6901.02 rows=1 width=32) (actual
time=50.258..56.004 rows=1 loops=1)"
" Output: COALESCE(sum((b_mokejimai.mok_nepadengta)::numeric),
'0'::numeric)"
" Buffers: shared hit=4"
" -> Gather (cost=6900.89..6901.00 rows=1 width=32) (actual
time=0.809..55.993 rows=2 loops=1)"
" Output: (PARTIAL sum((b_mokejimai.mok_nepadengta)::numeric))"
" Workers Planned: 1"
" Workers Launched: 1"
" Buffers: shared hit=4"
" -> Partial Aggregate (cost=5900.89..5900.90 rows=1 width=32)
(actual time=0.077..0.079 rows=1 loops=2)"
" Output: PARTIAL sum((b_mokejimai.mok_nepadengta)::numeric)"
" Buffers: shared hit=4"
" Worker 0: actual time=0.052..0.053 rows=1 loops=1"
" -> Nested Loop (cost=25.92..5897.69 rows=1280 width=3)
(actual time=0.070..0.072 rows=0 loops=2)"
" Output: b_mokejimai.mok_nepadengta"
" Inner Unique: true"
" Buffers: shared hit=4"
" Worker 0: actual time=0.043..0.043 rows=0 loops=1"
" -> Parallel Bitmap Heap Scan on
public.b_mokejimai (cost=25.48..2455.36 rows=1307 width=7) (actual
time=0.069..0.070 rows=0 loops=2)"
" Output: b_mokejimai.mok_id,
b_mokejimai.mok_moketojas, b_mokejimai.mok_pardavimas,
b_mokejimai.mok_laikas, b_mokejimai.mok_suma, b_mokejimai.mok_budas,
b_mokejimai.mok_terminas, b_mokejimai.mok_cekis,
b_mokejimai.mok_saskaita, b_mokejimai.mok_suma_bazine,
b_mokejimai.mok_nepadengta, b_mokejimai.mok_padengta,
b_mokejimai.mok_laiko_diena"
" Recheck Cond: (b_mokejimai.mok_saskaita = $1)"
" Buffers: shared hit=4"
" Worker 0: actual time=0.042..0.042 rows=0
loops=1"
" -> Bitmap Index Scan on idx_saskaita
(cost=0.00..24.93 rows=2222 width=0) (actual time=0.023..0.023 rows=0
loops=1)"
" Index Cond: (b_mokejimai.mok_saskaita =
$1)"
" Buffers: shared hit=4"
" -> Index Scan using pk_b_pardavimai_id on
public.b_pardavimai (cost=0.44..2.63 rows=1 width=4) (never executed)"
" Output: b_pardavimai.pard_id,
b_pardavimai.pard_preke, b_pardavimai.pard_kaina,
b_pardavimai.pard_nuolaida, b_pardavimai.pard_kiekis,
b_pardavimai.pard_kasos_nr, b_pardavimai.pard_laikas,
b_pardavimai.pard_prekes_id, b_pardavimai.pard_pirkejo_id,
b_pardavimai.pard_pardavejas, b_pardavimai.pard_spausdinta,
b_pardavimai.pard_reikia_grazinti, b_pardavimai.pard_kam_naudoti,
b_pardavimai.pard_susieta, b_pardavimai.pard_galima_anuliuoti,
b_pardavimai.pard_tipas, b_pardavimai.pard_pvm,
b_pardavimai.pard_apsilankymas, b_pardavimai.pard_fk,
b_pardavimai.pard_kelintas, b_pardavimai.pard_precekis,
b_pardavimai.pard_imone, b_pardavimai.pard_grazintas,
b_pardavimai.pard_debeto_sutartis, b_pardavimai.pard_kaina_be_nld,
b_pardavimai.pard_uzsakymas_pos, b_pardavimai.pard_pvm_suma,
b_pardavimai.pard_uzsakymo_nr, b_pardavimai.pard_nuolaidos_id,
b_pardavimai.pard_nuolaida_taikyti,
b_pardavimai.pard_pirkeja_keisti_galima,
b_pardavimai.pard_suma_keisti_galima"
" Index Cond: (b_pardavimai.pard_id =
b_mokejimai.mok_pardavimas)"
" Filter: (b_pardavimai.pard_tipas = ANY
('{1,2,6,7}'::integer[]))"
"Planning Time: 0.016 ms"
"Execution Time: 56.097 ms"
```
If I understand the plan correctly, the problem is the planner expects
to find 2222 records for a provide value of `mok_saskaita`. I've tried
running analyze on `b_mokejimai`, but the plan remains the same - must
be because some values of `mok_saskaita` do really return tens of
thousands of records.
I don't know how the planner comes up with value 2222, because on
average there are 15 b_mokejimai records for a single mok_saskaita (if
NULL in mok_saskata is ignored), and 628 records if not.
Anyway...
Do you think rewriting a function in plpgsql is a way to go in such
case? In pg documentation
(https://www.postgresql.org/docs/12/plpgsql-implementation.html#PLPGSQL-PLAN-CACHING)
I read that the plan for the plpgsql function is calculated the first
time the function is executed (for a connection). I'm concerned, that
the function execution is not replanned: I will be stuck with a plan
that corresponds to the `mok_saskaita` parameter value passed on the
first execution. Or am I wrong?
Is there a way to make PostgreSQL recalculate the plan on each execution
of the function? The observed planning times are acceptable for my
application.
Regards,
Julius Tuskenis