On Mon, Jun 22, 2020 at 02:53, David Rowley <dgrowle...@gmail.com> wrote:
> On Mon, 22 Jun 2020 at 16:54, David Rowley <dgrowle...@gmail.com> wrote:
> > I also tested this on an AMD machine running Ubuntu 20.04 on kernel
> > version 5.4.0-37. I used the same 100GB table I mentioned in [1], but
> > with the query "select * from t where a < 0;", which saves having to
> > do any aggregate work.
>
> I just wanted to add a note here that Thomas and I just discussed this
> a bit offline. He recommended I try setting the kernel readahead a bit
> higher.
>
> It was set to 128kB, so I cranked it up to 2MB with:
>
> sudo blockdev --setra 4096 /dev/nvme0n1p2
>
> I didn't want to run the full test again as it took quite a long time,
> so I just tried with 32 workers.
>
> The first two results here are taken from the test results I just
> posted 1 hour ago.
>
> Master readahead=128kB = 89921.283 ms
> v2 patch readahead=128kB = 36085.642 ms
> master readahead=2MB = 60984.905 ms
> v2 patch readahead=2MB = 22611.264 ms

Hi, redoing the tests with v2 here.

Notebook with i5, 8GB RAM, 256GB SSD
Windows 10 64 bits (2004), msvc 2019 64 bits
PostgreSQL head (with v2 patch)
Configuration: none
Connection: local ipv4 (not localhost)

create table t (a int, b text);
insert into t select x,md5(x::text) from generate_series(1,1000000*1572.7381809)x;
vacuum freeze t;

set max_parallel_workers_per_gather = 0;
Time: 354211,826 ms (05:54,212)
set max_parallel_workers_per_gather = 1;
Time: 332805,773 ms (05:32,806)
set max_parallel_workers_per_gather = 2;
Time: 282566,711 ms (04:42,567)
set max_parallel_workers_per_gather = 3;
Time: 263383,945 ms (04:23,384)
set max_parallel_workers_per_gather = 4;
Time: 255728,259 ms (04:15,728)
set max_parallel_workers_per_gather = 5;
Time: 238288,720 ms (03:58,289)
set max_parallel_workers_per_gather = 6;
Time: 238647,792 ms (03:58,648)
set max_parallel_workers_per_gather = 7;
Time: 231295,763 ms (03:51,296)
set max_parallel_workers_per_gather = 8;
Time: 232502,828 ms (03:52,503)
set max_parallel_workers_per_gather = 9;
Time: 230970,604 ms (03:50,971)
set max_parallel_workers_per_gather = 10;
Time: 232104,182 ms (03:52,104)

set max_parallel_workers_per_gather = 8;
postgres=# explain select count(*) from t;
                                         QUERY PLAN
-------------------------------------------------------------------------------------------
 Finalize Aggregate  (cost=15564556.43..15564556.44 rows=1 width=8)
   ->  Gather  (cost=15564555.60..15564556.41 rows=8 width=8)
         Workers Planned: 8
         ->  Partial Aggregate  (cost=15563555.60..15563555.61 rows=1 width=8)
               ->  Parallel Seq Scan on t  (cost=0.00..15072074.88 rows=196592288 width=0)
(5 rows)

Questions:

1. Why acquire and release the lock inside the retry: loop? Wouldn't this be better:

    /* Grab the spinlock. */
    SpinLockAcquire(&pbscan->phs_mutex);

retry:
    /*
     * If the scan's startblock has not yet been initialized, we must do so
     * now.  If this is not a synchronized scan, we just start at block 0,
     * but if it is a synchronized scan, we must get the starting position
     * from the synchronized scan machinery.  We can't hold the spinlock
     * while doing that, though, so release the spinlock, get the
     * information we need, and retry.  If nobody else has initialized the
     * scan in the meantime, we'll fill in the value we fetched on the
     * second time through.
     */
    if (pbscan->phs_startblock == InvalidBlockNumber)
    {
        if (!pbscan->base.phs_syncscan)
            pbscan->phs_startblock = 0;
        else if (sync_startpage != InvalidBlockNumber)
            pbscan->phs_startblock = sync_startpage;
        else
        {
            sync_startpage = ss_get_location(rel, pbscan->phs_nblocks);
            goto retry;
        }
    }
    SpinLockRelease(&pbscan->phs_mutex);
}

That is, acquire the lock once, before retry?
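For comparison, my reading of the current coding of
table_block_parallelscan_startblock_init() in
src/backend/access/table/tableam.c is roughly the simplified sketch below
(from memory of the tree; the v2 patch may add more initialization here).
The difference is that the lock is acquired inside the retry loop and
dropped before ss_get_location() is called:

    void
    table_block_parallelscan_startblock_init(Relation rel,
                                             ParallelBlockTableScanDesc pbscan)
    {
        BlockNumber sync_startpage = InvalidBlockNumber;

    retry:
        /* Grab the spinlock. */
        SpinLockAcquire(&pbscan->phs_mutex);

        if (pbscan->phs_startblock == InvalidBlockNumber)
        {
            if (!pbscan->base.phs_syncscan)
                pbscan->phs_startblock = 0;
            else if (sync_startpage != InvalidBlockNumber)
                pbscan->phs_startblock = sync_startpage;
            else
            {
                /* drop the spinlock before calling the syncscan machinery */
                SpinLockRelease(&pbscan->phs_mutex);
                sync_startpage = ss_get_location(rel, pbscan->phs_nblocks);
                goto retry;
            }
        }
        SpinLockRelease(&pbscan->phs_mutex);
    }

So, per the comment, the acquire/release on each pass seems to exist so that
ss_get_location() is never called with the spinlock held.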
2. Is there any configuration to improve performance?

regards,
Ranier Vilela