> On Thu, Oct 10, 2024 at 05:36:10PM GMT, Ashutosh Bapat wrote: > > 3. With just patch 0001 applied, planning time usually shows > degradation (column Q and X in planning time sheets) with or without > PWJ enabled. I first thought that it might be because of the increased > size of PlannerInfo. We had seen a similar phenomenon when adding a > new member to WindowAggState [2]. Hence I introduced patch 0002 which > moves two fields around to not increase the size of structure. But > that doesn't fix the regression in the planning time (columns R and > Y). Apart from increasing the PlannerInfo size and may be object file > size, 0002 does not have any other impact. But the regression seen > with just that patch is more than what we saw in [2]. More investigate > is required to decide whether this regression is real or not and if > real, the root cause. Looking at the numbers, it seems that this > regression is causing the planning time regression in rest of the > patches. If we fix regression by 0001, we should not see much > regression in rest of the patches. I am looking for some guidance in > investigating this regression.
Hi, I've tried to reproduce some subset of those results, in case if I would be able to notice anything useful. Strangely enough, I wasn't able to get much boost in planning time e.g. with 4 first patches, 100 partitions and 5 joins -- the results you've posted are showing about 16% in that case, where I'm getting only a couple of percents. Probably I'm doing something differently, but it's turned out to be hard to reconstruct (based only on this thread) how did you exactly benchmark the patch -- could you maybe summarize the benchmark in a reproducible way? >From what I understand you were testing againt an empty partitioned table. Here is what I was doing: create table t1p (c1 int) partition by list(c1); select format('create table %I partition of t1p for values in (%s)', 't1p' || i, i) from generate_series(1, 100) i; \gexec do $x$ declare i record; plan float[]; plan_line text; begin for i in select * from generate_series(1, 1000) i loop for plan_line in execute format($y$ explain analyze select * from t1p t1, t1p t2, t1p t3, t1p t4, t1p t5 where t2.c1 = t1.c1 and t3.c1 = t2.c1 and t4.c1 = t3.c1 and t5.c1 = t4.c1 $y$) loop if plan_line like '%Planning Time%' then plan := array_append(plan, substring(plan_line from '\d+.\d+')::float); end if; end loop; end loop; -- skip the first record as prewarming raise warning 'avg: %', (select avg(v) from unnest(plan[2:]) v); raise warning 'median: %', (select percentile_cont(0.5) within group(order by v) from unnest(plan[2:]) v); end; $x$; As a side note, may I ask to attach benchmark results as files, rather than a link to a google spreadsheet? It feels nice to have something static to work with.