The following tests demonstrate the speedup that may be achieved with my patch:
1. Add to postgresql.conf:

   enable_hashjoin = OFF
   enable_mergejoin = OFF
   enable_material = OFF
   enable_bitmapscan = OFF
   enable_nestloop = ON
   max_parallel_workers_per_gather = 0
   enable_memoize = OFF

2. Create the test tables:

   create table test_part(id int primary key) partition by range(id);
   create table test_part0 partition of test_part for values from (0) to (1000000);
   create table test_part1 partition of test_part for values from (1000000) to (2000000);
   insert into test_part select id from generate_series(0, 1000000-1) as id;

   create table ids(id int, name varchar);
   create index on ids(ascii(name));
   insert into ids select id, 'worst case' as name from generate_series(0, 1000000-1) as id;
   insert into ids select 123456, 'best case' as name from generate_series(0, 1000000-1) as id;

3. Run the tests:

   explain analyze select * from ids join test_part on ids.id=test_part.id
       where ascii(ids.name)=ascii('best case');
   explain analyze select * from ids join test_part on ids.id=test_part.id
       where ascii(ids.name)=ascii('worst case');

The average results on my machine are as follows:

              | vanilla postgres | patched postgres
   best case  | 2286 ms          | 1924 ms
   worst case | 2278 ms          | 2360 ms

So far I haven't refactored the patch as per David's advice. I first want to
understand whether we need such an optimization at all.

On Thu, 23 Mar 2023 at 17:05, David Rowley <dgrowle...@gmail.com> wrote:

> On Thu, 23 Mar 2023 at 19:46, Alexandr Nikulin
> <alexandr.s.niku...@gmail.com> wrote:
> > I propose to slightly improve the performance of nested loop join in
> > the case of a partitioned inner table.
> > As I see in the code, the backend looks for the partition of the inner
> > table each time after fetching a new row from the outer table.
> > These searches can take a significant amount of time.
> > But we can skip this step if the nested loop parameter(s) have not
> > changed since the previous row was fetched from the outer table.
>
> I think if we were to do something like that, then it should be done
> in nodeAppend.c and nodeMergeAppend.c. That way you can limit it to
> only checking parameters that partition pruning needs to care about.
> That does mean you'd need to find somewhere to cache the parameter
> values, however. Doing it in nodeNestloop.c means you're doing it when
> the inner subplan is something that does not suffer from the
> additional overhead you want to avoid, e.g. an Index Scan.
>
> Also, generally, if you want to get anywhere with a performance patch,
> you should post performance results from before and after your change.
> Also include your benchmark setup and relevant settings for how you
> got those results. For this case, you'll want a best case (parameter
> value stays the same) and a worst case, where the parameter value
> changes on each outer row. I expect you're going to add overhead to
> this case, as your additional checks will always detect that the
> parameter has changed, which will always require partition pruning to
> be executed again. We'll want to know if that overhead is high enough
> for us not to want to do this.
>
> I'll be interested to see a test that has some varlena parameter of,
> say, a few hundred bytes, to see how much overhead testing if that
> parameter has changed adds when the pruning is being done on a HASH
> partitioned table. HASH partitioning should prune quite a bit faster
> than both LIST and RANGE as the hashing is effectively O(1) vs
> O(log2 N) (N being the number of Datums in the partition bounds). I'd
> expect a measurable additional overhead with the patch when the
> parameter changes on each outer row.
>
> David
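
For reference, a rough sketch of the HASH-partitioning test with a varlena
join key of a few hundred bytes could look like the following. All object
names, the modulus, and the row counts are only illustrative; this is not
taken from the patch, and it assumes the same planner settings as in step 1
above so that a nested loop with run-time pruning on the inner Append is
chosen.

   -- Illustrative setup for the suggested HASH-partitioning / varlena test.
   create table hash_part(key text primary key) partition by hash(key);
   create table hash_part0 partition of hash_part for values with (modulus 4, remainder 0);
   create table hash_part1 partition of hash_part for values with (modulus 4, remainder 1);
   create table hash_part2 partition of hash_part for values with (modulus 4, remainder 2);
   create table hash_part3 partition of hash_part for values with (modulus 4, remainder 3);

   -- Outer table whose join key is a ~300-byte text value.
   create table outer_keys(key text, name varchar);
   create index on outer_keys(ascii(name));

   -- Worst case: the pruning parameter changes on every outer row.
   insert into outer_keys
     select repeat('x', 300) || id, 'worst case' from generate_series(0, 1000000-1) as id;
   -- Best case: the pruning parameter stays the same for every outer row.
   insert into outer_keys
     select repeat('x', 300) || '123456', 'best case' from generate_series(0, 1000000-1) as id;

   insert into hash_part select distinct key from outer_keys;

   explain analyze select * from outer_keys join hash_part on outer_keys.key = hash_part.key
       where ascii(outer_keys.name) = ascii('best case');
   explain analyze select * from outer_keys join hash_part on outer_keys.key = hash_part.key
       where ascii(outer_keys.name) = ascii('worst case');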