Hello Thomas,

27.08.2024 09:52, Thomas Munro wrote:
Here's a really simple way to see the new unfairness at the end of a
parallel scan:

drop table if exists t;
create table t (i int);
insert into t select generate_series(1, 100000);
alter table t set (parallel_workers = 2);
set parallel_setup_cost = 0;
set parallel_leader_participation = off;
explain (analyze, buffers, verbose) select count(*) from t;

On my machine, unpatched master shows:

                      Worker 0:  actual time=0.036..12.452 rows=51076 loops=1
                        Buffers: shared hit=226
                      Worker 1:  actual time=0.037..12.003 rows=48924 loops=1
                        Buffers: shared hit=217

The attached patch, which I'd like to push, is effectively what
Alexander tested (blocknums[16] -> blocknums[1]).  There's no point in
using an array of size 1, so I've turned it into a simple variable and
deleted the relevant comments.  My machine shows:

                      Worker 0:  actual time=0.038..12.115 rows=49946 loops=1
                        Buffers: shared hit=221
                      Worker 1:  actual time=0.038..12.109 rows=50054 loops=1
                        Buffers: shared hit=222

That difference may not seem huge, but other pre-existing things are
going pathologically wrong in the reported query that magnify it (see
my earlier analysis).  It's an interesting problem that will require
more study (my earlier analysis missed a detail that I'll write about
separately), but it doesn't seem to be new or have easy fixes, so that
will have to be for later work.

I've tried your query and could not get sustainable results, unfortunately.
The following script:
rm -rf "$PGDATA"; initdb -D "$PGDATA" >initdb.log 2>&1

pg_ctl -s -l server.log start

cat << EOF | psql | grep 'Parallel Seq Scan' -A10 | grep 'Worker ' -A1
create table t (i int);
insert into t select generate_series(1, 100000);
alter table t set (parallel_workers = 2);
set parallel_setup_cost = 0;
set parallel_leader_participation = off;
explain (analyze, buffers, verbose) select count(*) from t;
EOF

pg_ctl -s stop

gives me unstable numbers on unpatched master:
                     Worker 0:  actual time=0.024..5.814 rows=51076 loops=1
                       Buffers: shared hit=226
                     Worker 1:  actual time=0.023..5.614 rows=48924 loops=1
                       Buffers: shared hit=217
---
                     Worker 0:  actual time=0.027..5.130 rows=36612 loops=1
                       Buffers: shared hit=162
                     Worker 1:  actual time=0.013..5.605 rows=63388 loops=1
                       Buffers: shared hit=281
---
                     Worker 0:  actual time=0.025..5.447 rows=47460 loops=1
                       Buffers: shared hit=210
                     Worker 1:  actual time=0.019..5.688 rows=52540 loops=1
                       Buffers: shared hit=233

and also with the patch applied:
                     Worker 0:  actual time=0.012..4.486 rows=55478 loops=1
                       Buffers: shared hit=246
                     Worker 1:  actual time=0.014..4.430 rows=44522 loops=1
                       Buffers: shared hit=197
---
                     Worker 0:  actual time=0.013..4.269 rows=55822 loops=1
                       Buffers: shared hit=247
                     Worker 1:  actual time=0.017..4.238 rows=44178 loops=1
                       Buffers: shared hit=196
---
                     Worker 0:  actual time=0.014..4.974 rows=50624 loops=1
                       Buffers: shared hit=224
                     Worker 1:  actual time=0.016..4.932 rows=49376 loops=1
                       Buffers: shared hit=219
---
                     Worker 0:  actual time=0.012..5.459 rows=65648 loops=1
                       Buffers: shared hit=291
                     Worker 1:  actual time=0.022..5.109 rows=34352 loops=1
                       Buffers: shared hit=152

Please correct me, if I'm doing something wrong.

Best regards,
Alexander


Reply via email to