David Rowley <dgrowle...@gmail.com> wrote on Friday, July 5, 2024 at 10:15 AM:

> On Fri, 5 Jul 2024 at 12:50, James Pang <jamespang...@gmail.com> wrote:
> >    We have a daily vacuumdb and analyze job that generally finishes in
> > seconds, but sometimes it suddenly runs for tens of minutes with the same
> > bind variable values and uses a huge amount of temp space; at those times,
> > EXPLAIN shows a "Hash Anti Join, Hash Right Join" plan with sequential
> > scans of two tables.
>
> There was talk about adding costing for run-time partition pruning
> factors but nothing was ever agreed, so nothing was done.  It's just
> not that obvious to me how we'd do that.  If the Append had 10
> partitions as subnodes, with an equality join condition, you could
> assume we'll only match to 1 of those 10, but we've no idea at plan
> time which one that'll be and the partitions might drastically vary in
> size.  The best I think we could do is take the total cost of those 10
> and divide by 10 to get the average cost.  It's much harder for range
> conditions as those could match anything from 0 to all partitions. The
> best suggestion I saw for that was to multiply the costs by
> DEFAULT_INEQ_SEL.
>
> I think for now, you might want to lower the random_page_cost or
> increase effective_cache_size to encourage the nested loop -> index
> scan plan. A good range for effective_cache_size is anywhere between 50%
> and 75% of your server's RAM.  However, that might not be ideal if your
> server is under memory pressure from other running processes. It also
> depends on how large shared_buffers are as a percentage of total RAM.
>
> David
>
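To make the quoted averaging heuristic concrete (my arithmetic only; this is
not something the planner does today):

    -- Hypothetical Append over 10 hash partitions costing 50000 in total.
    SELECT 50000 / 10       AS equality_join_estimate,  -- average: 5000
           50000 * (1/3.0)  AS range_join_estimate;     -- DEFAULT_INEQ_SEL = 1/3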

  We already have random_page_cost = 1.1 and effective_cache_size set to 75%
of physical memory on this database server.
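A minimal sketch of those settings (the effective_cache_size value here is
hypothetical; ALTER SYSTEM plus a configuration reload would make them
persistent):

    SET random_page_cost = 1.1;
    SET effective_cache_size = '96GB';  -- hypothetical: 75% of this server's RAM

For this SQL, the plan is: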
       ->  Nested Loop Anti Join  (cost=40.32..132168227.57 rows=224338 width=78)
             Join Filter: (lower((p.ctinfo)::text) = lower((w.ctinfo)::text))
             ->  Nested Loop Left Join  (cost=39.63..398917.29 rows=299118 width=78)
                   ->  Append  (cost=0.56..22.36 rows=8 width=54)
                         ->  Index Scan using wmdata_p0_llid_hhid_stime_idx on wmdata_p0 m_1  (cost=0.56..2.79 rows=1 width=54)
                         ....
                   ->  Append  (cost=39.07..49312.09 rows=54978 width=78)
                         ->  Bitmap Heap Scan on wmvtee_p0 w_1  (cost=39.07..1491.06 rows=1669 width=78)
                               Recheck Cond: ((m.partitionkeyid)::text = (partitionkeyid)::text)
                               ->  Bitmap Index Scan on wmvtee_p0_partitionkeyid_intid_idx  (cost=0.00..38.65 rows=1669 width=0)
                                     Index Cond: ((partitionkeyid)::text = (m.partitionkeyid)::text)
                               ...
             ->  Append  (cost=0.69..516.96 rows=4010 width=78)
                   ->  Index Only Scan using wmpct_p0_partitionkeyid_ctinfo_idx on wmpct_p0 p_1  (cost=0.69..15.78 rows=124 width=78)
                         ...

           For the nested loop path: since the first Append is estimated at
only 8 rows, the tables are joined on partitionkeyid, and all of them are
hash-partitioned, would it be better to estimate the cost as 8 (loop
iterations) * 1600 (the average cost of a bitmap scan on a single hash
partition, since each outer row maps to exactly one partition) = 12800,
which is much less than the 398917.29 estimated for appending all
partitions? Likewise, the outer Nested Loop Anti Join could be estimated as
299118 rows * 15.78 (the average index-only scan cost of one partition) =
4,720,082, still much less than 132168227.57.
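In SQL terms, the back-of-envelope arithmetic (my numbers, taken from the
per-partition costs in the plan above) would be:

    SELECT 8 * 1600        AS inner_nestloop_estimate,  -- 12800    vs. 398917.29
           299118 * 15.78  AS anti_join_estimate;       -- ~4720082 vs. 132168227.57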
         For the Hash Right Join plan, would it be possible to estimate the
cost from sequential scans of 8 partitions instead of all 32 hash
partitions, given that the first Append is estimated at only 8 rows?
         Extended statistics may help estimate the distinct count of
partitionkeyid based on the bind-variable values of other columns, but that
does not appear to help the join case.
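The kind of extended statistics I tried looks like this (a sketch; it
assumes wmdata is the partitioned parent table and that llid/hhid are the
bind-variable columns correlated with partitionkeyid):

    CREATE STATISTICS wmdata_pkid_stats (ndistinct, dependencies)
        ON llid, hhid, partitionkeyid FROM wmdata;
    ANALYZE wmdata;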

Thanks,

James
