Re: Reducing memory consumed by RestrictInfo list translations in partitionwise join planning

Amit Langote Fri, 14 Mar 2025 05:07:13 -0700

Hi Ashutosh,

On Tue, Feb 25, 2025 at 8:04 PM Ashutosh Bapat
<ashutosh.bapat....@gmail.com> wrote:
> On Thu, Feb 20, 2025 at 5:28 PM Ashutosh Bapat <ashutosh.bapat....@gmail.com> 
> wrote:
> > On Tue, Feb 4, 2025 at 4:07 PM Ashutosh Bapat 
> > <ashutosh.bapat....@gmail.com> wrote:
> > > If we are not interested in saving memory, there is a simpler way to
> > > improve planning time by adding a hash table per equivalence class to
> > > store the derived clauses, instead of a linked list, when the number
> > > of derived clauses is higher than a threshold (say 32 same as the
> > > threshold for join_rel_list. Maybe that approach will yield stable
> > > planning time.
>
> I implemented the above idea in attached patches.  I also added the
> following query, inspired from Alvaro's query, to summarise the
> results.
> with master_avgs as
> (select code_tag, num_parts, num_joins, pwj, avg(planning_time_ms)
> avg_pt, stddev(planning_time_ms) stddev_pt
> from msmts where code_tag = 'master'
> group by code_tag, num_parts, num_joins, pwj),
> patched_avgs as
> (select code_tag, num_parts, num_joins, pwj, avg(planning_time_ms)
> avg_pt, stddev(planning_time_ms) stddev_pt
> from msmts where code_tag = 'patched'
> group by code_tag, num_parts, num_joins, pwj)
> select num_parts,
> num_joins,
> format('s=%s%% md=%s%% pd=%s%%',
> ((m.avg_pt - p.avg_pt)/m.avg_pt * 100)::numeric(6, 2),
> (m.stddev_pt/m.avg_pt * 100)::numeric(6, 2),
> (p.stddev_pt/p.avg_pt * 100)::numeric(6, 2))
> from master_avgs m join patched_avgs p using (num_parts, num_joins,
> pwj) where not pwj order by 1, 2, 3;
> \crosstabview 1 2 3
>
> not pwj in the last line should be changed to pwj to get results with
> enable_partitionwise_join = true.
>
> With the attached patches, I observe following results
>
> With PWJ disabled
>  num_parts |               2               |              3
>    |             4              |              5
> -----------+-------------------------------+------------------------------+----------------------------+-----------------------------
>          0 | s=-4.44% md=17.91% pd=23.05%  | s=-0.83% md=11.10%
> pd=19.58% | s=0.87% md=4.04% pd=7.91%  | s=-35.24% md=7.63% pd=9.69%
>         10 | s=30.13% md=118.18% pd=37.44% | s=-3.49% md=0.58%
> pd=0.49%   | s=-0.83% md=0.29% pd=0.35% | s=-0.24% md=0.35% pd=0.32%
>        100 | s=1.94% md=13.19% pd=4.08%    | s=-0.27% md=0.18%
> pd=0.44%   | s=7.04% md=3.05% pd=3.11%  | s=12.75% md=1.69% pd=0.81%
>        500 | s=4.39% md=1.71% pd=1.33%     | s=10.17% md=1.28%
> pd=1.90%   | s=23.04% md=0.24% pd=0.58% | s=30.87% md=0.30% pd=1.11%
>       1000 | s=4.27% md=1.21% pd=1.97%     | s=13.97% md=0.44%
> pd=0.79%   | s=24.05% md=0.63% pd=1.02% | s=30.77% md=0.77% pd=0.17%
>
>
> Each cell is a triple (s, md, pd) where s is improvement in planning
> time using the patches in % as compared to the master (higher the
> better), md = standard deviation as % of the average planning time on
> master, pd = is standard deviation as % of the average planning time
> with patches.
>
> With PWJ enabled
>  num_parts |              2               |              3
>   |              4              |              5
> -----------+------------------------------+------------------------------+-----------------------------+------------------------------
>          0 | s=-94.25% md=6.98% pd=56.03% | s=44.10% md=141.13%
> pd=9.32% | s=42.71% md=46.00% pd=6.55% | s=-26.12% md=6.72% pd=15.20%
>         10 | s=-25.89% md=4.29% pd=63.75% | s=-1.34% md=3.15% pd=3.26%
>   | s=0.31% md=4.13% pd=4.34%   | s=-1.34% md=3.10% pd=6.73%
>        100 | s=-2.83% md=0.94% pd=1.31%   | s=-2.17% md=4.57% pd=4.41%
>   | s=0.98% md=1.59% pd=1.81%   | s=1.87% md=1.10% pd=0.79%
>        500 | s=1.57% md=3.01% pd=1.70%    | s=6.99% md=1.58% pd=1.68%
>   | s=11.11% md=0.24% pd=0.62%  | s=11.65% md=0.18% pd=0.90%
>       1000 | s=3.59% md=0.98% pd=1.78%    | s=10.83% md=0.88% pd=0.46%
>   | s=15.62% md=0.46% pd=0.13%  | s=16.38% md=0.63% pd=0.29%
>
> Same numbers measured for previous set of patches [1], which improves
> both memory consumption as well as planning time.
>
> With PWJ disabled
>  num_parts |               2               |              3
>    |              4               |               5
> -----------+-------------------------------+------------------------------+------------------------------+--------------------------------
>          0 | s=4.68% md=18.17% pd=22.09%   | s=-2.54% md=12.00%
> pd=13.81% | s=-2.02% md=3.84% pd=4.43%   | s=-69.14% md=11.06%
> pd=126.87%
>         10 | s=-24.85% md=20.42% pd=35.69% | s=-4.31% md=0.73%
> pd=1.53%   | s=-14.97% md=0.32% pd=31.90% | s=-0.57% md=0.79% pd=0.50%
>        100 | s=0.27% md=4.69% pd=1.55%     | s=4.16% md=0.29% pd=0.18%
>    | s=11.76% md=0.85% pd=0.49%   | s=15.76% md=1.64% pd=2.32%
>        500 | s=0.54% md=1.88% pd=1.81%     | s=9.36% md=1.17% pd=0.87%
>    | s=21.45% md=0.74% pd=0.88%   | s=30.47% md=0.17% pd=1.17%
>       1000 | s=3.22% md=1.36% pd=0.99%     | s=14.74% md=0.86%
> pd=0.44%   | s=24.50% md=0.36% pd=0.31%   | s=27.97% md=0.27% pd=0.25%
>
> With PWJ enabled
>  num_parts |              2              |             3
> |             4              |              5
> -----------+-----------------------------+----------------------------+----------------------------+-----------------------------
>          0 | s=11.07% md=19.28% pd=8.70% | s=-1.18% md=5.88% pd=4.31%
> | s=-2.25% md=8.42% pd=3.77% | s=25.07% md=11.48% pd=3.87%
>         10 | s=-9.07% md=2.65% pd=14.58% | s=0.55% md=3.10% pd=3.41%
> | s=3.89% md=3.94% pd=3.79%  | s=7.25% md=2.87% pd=3.03%
>        100 | s=-4.53% md=0.49% pd=8.53%  | s=2.24% md=4.24% pd=3.96%
> | s=6.70% md=1.30% pd=2.08%  | s=9.09% md=1.39% pd=1.50%
>        500 | s=-1.65% md=1.59% pd=1.44%  | s=6.31% md=0.89% pd=1.11%
> | s=12.72% md=0.20% pd=0.29% | s=15.02% md=0.28% pd=0.83%
>       1000 | s=1.53% md=1.01% pd=1.66%   | s=11.80% md=0.66% pd=0.71%
> | s=16.23% md=0.58% pd=0.18% | s=17.16% md=0.67% pd=0.68%
>
> There are a few things to notice
> 1. There is not much difference in the planning time improvement by
> both the patchsets. But the patchset attached to [1] improves memory
> consumption as well. So it looks more attractive.
> 2. The performance regressions usually coincide with higher standard
> deviation. This indicates that both the performance gains or losses
> seen at lower numbers of partitions and joins are not real and
> possibly ignored. I have run the script multiple times but some or
> other combination of lower number of partitions and lower number of
> joins shows higher deviation and thus unstable results. I have not be
> able to find a way where all the combinations show a stable result.


Thanks for the patch and the extensive benchmarking.

Would you please also share a simple, self-contained example that I
can use to reproduce and verify the performance improvements? It’s
helpful to first see the patch in action with a representative query,
and then refer to the more extensive benchmark results you've already
shared as needed. I'm also having a hard time telling from the scripts
which query was used to produce the numbers in the report.

Btw, in the commit message, you mention:

===
When there are thousands of partitions in a partitioned table, there
can be thousands of derived clauses in the list making it inefficient
for a lookup.
===

I haven’t found a description of how that many clauses end up in the
ec->ec_derived list. IIUC, it's create_join_clause() where the child
clauses get added, and it would be helpful to mention that, since that
also appears to be the hotspot your patch is addressing.

> I think the first patch in the attached set is worth committing, just
> to tighten things up.

I agree and am happy to commit it if there are no objections.

-- 
Thanks, Amit Langote

Re: Reducing memory consumed by RestrictInfo list translations in partitionwise join planning

Reply via email to