On Fri, Aug 30, 2024 at 1:00 AM Alexander Lakhin wrote:
> I looked at two perf profiles of such out-of-sync processes and found no
> extra calls or whatsoever in the slow one, it just has the number of perf
> samples increased proportionally. It made me suspect CPU frequency
> scaling... Indeed, w
Hello Thomas,
29.08.2024 01:16, Thomas Munro wrote:
Yeah. That's quite interesting, and must destabilise that
simple-minded demo. I'm curious to know exactly what contention is
causing that (about 3/4 of a millisecond that I don't see and now I
want to know what it's waiting for), but it's a v
On Wed, Aug 28, 2024 at 9:00 PM Alexander Lakhin wrote:
> 2024-08-28 08:41:59.304 UTC [338251] DEBUG: starting background worker
> process "parallel worker for PID 338262"
> 2024-08-28 08:41:59.304 UTC [338251] DEBUG: starting background worker
> process "parallel worker for PID 338262"
> 202
28.08.2024 07:45, Thomas Munro wrote:
Huh. I can reproduce what I showed with pretty low variance, on my
FreeBSD, Linux and macOS systems. I included
parallel_leader_participation=off so that the workers would hopefully
start as closely together in time as possible, and hopefully allow
only a b
On Wed, Aug 28, 2024 at 1:00 AM Alexander Lakhin wrote:
> gives me unstable numbers on unpatched master:
> Buffers: shared hit=226
> Buffers: shared hit=217
> Buffers: shared hit=162
> Buffers: shared
Hello Thomas,
27.08.2024 09:52, Thomas Munro wrote:
Here's a really simple way to see the new unfairness at the end of a
parallel scan:
drop table if exists t;
create table t (i int);
insert into t select generate_series(1, 10);
alter table t set (parallel_workers = 2);
set parallel_setup_c
Here's a really simple way to see the new unfairness at the end of a
parallel scan:
drop table if exists t;
create table t (i int);
insert into t select generate_series(1, 10);
alter table t set (parallel_workers = 2);
set parallel_setup_cost = 0;
set parallel_leader_participation = off;
expla
On Tue, May 21, 2024 at 9:11 AM Melanie Plageman
wrote:
> So, if you are seeing the slow-down mostly go away by reducing
> blocknums array size, does the regression only appear when the scan
> data is fully in shared buffers? Or is this blocknums other use
> (dealing with short reads)?
That must
Thank you to all of you for looking into this.
On Sat, May 18, 2024 at 12:47 AM Thomas Munro wrote:
>
> On Sat, May 18, 2024 at 11:30 AM Thomas Munro wrote:
> > Andres happened to have TPC-DS handy, and reproduced that regression
> > in q15. We tried some stuff and figured out that it requires
On Sun, May 19, 2024 at 7:00 AM Alexander Lakhin wrote:
> With blocknums[1], timing is changed, but the effect is not persistent.
> 10 query15 executions in a row, b7b0f3f27:
> 277.932 ms
> 281.805 ms
> 278.335 ms
> 281.565 ms
> 284.167 ms
> 283.171 ms
> 281.165 ms
> 281.615 ms
> 285.394 ms
> 277.
Hello Thomas,
18.05.2024 07:47, Thomas Munro wrote:
After more debugging, we learned a lot more things...
1. That query produces spectacularly bad estimates, so we finish up
having to increase the number of buckets in a parallel hash join many
times. That is quite interesting, but unrelated t
On Sat, May 18, 2024 at 11:30 AM Thomas Munro wrote:
> Andres happened to have TPC-DS handy, and reproduced that regression
> in q15. We tried some stuff and figured out that it requires
> parallel_leader_participation=on, ie that this looks like some kind of
> parallel fairness and/or timing pro
On Sat, May 18, 2024 at 8:09 AM Thomas Munro wrote:
> On Sat, May 18, 2024 at 1:00 AM Alexander Lakhin wrote:
> > I decided to compare v17 vs v16 performance (as I did the last year [1])
> > and discovered that v17 loses to v16 in the pg_tpcds (s64da_tpcds)
> > benchmark, query15 (and several oth
On Sat, May 18, 2024 at 1:00 AM Alexander Lakhin wrote:
> I decided to compare v17 vs v16 performance (as I did the last year [1])
> and discovered that v17 loses to v16 in the pg_tpcds (s64da_tpcds)
> benchmark, query15 (and several others, but I focused on this one):
> Best pg-src-master--.* wor
Hello,
I decided to compare v17 vs v16 performance (as I did the last year [1])
and discovered that v17 loses to v16 in the pg_tpcds (s64da_tpcds)
benchmark, query15 (and several others, but I focused on this one):
Best pg-src-master--.* worse than pg-src-16--.* by 52.2 percents (229.84 >
151.03
Hi,
On 2024-04-08 09:36:59 +1200, Thomas Munro wrote:
> I've been on the fence about that flag for sequential scan... Some
> days I want to consider changing to READ_STREAM_DEFAULT and relying on
> our "anti-heuristics" to suppress advice, which would work out the
> same in most cases but might o
On Sun, Apr 7, 2024 at 1:34 PM Melanie Plageman
wrote:
> Attached v13 0001 is your fix and 0002 is a new version of the
> sequential scan streaming read user. Off-list Andres mentioned that I
> really ought to separate the parallel and serial sequential scan users
> into two different callbacks. I
On Sat, Apr 6, 2024 at 9:25 PM Thomas Munro wrote:
>
> I found a bug in read_stream.c that could be hit with Melanie's
> streaming seq scan patch with parallelism enabled and certain buffer
> pool conditions. Short version: there is an edge case where an "if"
> needed to be a "while", or we could
I found a bug in read_stream.c that could be hit with Melanie's
streaming seq scan patch with parallelism enabled and certain buffer
pool conditions. Short version: there is an edge case where an "if"
needed to be a "while", or we could lose a few blocks. Here's the fix
for that, longer explanati
On Fri, Apr 5, 2024 at 7:28 PM Thomas Munro wrote:
>
> On Sat, Apr 6, 2024 at 6:55 AM Melanie Plageman
> wrote:
> > I haven't looked into or reviewed the memory prefetching part.
> >
> > While reviewing 0002, I realized that I don't quite see how
> > read_stream_get_block() will be used in the fa
On Sat, Apr 6, 2024 at 6:55 AM Melanie Plageman
wrote:
> On Fri, Apr 5, 2024 at 12:15 AM Thomas Munro wrote:
> > The interesting column is hot. The 200ms->211ms regression is due to
> > the extra bookkeeping in the slow path. The rejiggered fastpath code
> > fixes it for me, or maybe sometimes
On Fri, Apr 5, 2024 at 12:15 AM Thomas Munro wrote:
>
> Yeah, I plead benchmarking myopia, sorry. The fastpath as committed
> is only reached when distance goes 2->1, as pg_prewarm does. Oops.
> With the attached minor rearrangement, it works fine. I also poked
> some more at that memory prefet
On Thu, Apr 4, 2024 at 12:39 PM Andres Freund wrote:
>
> On 2024-04-04 22:37:39 +1300, Thomas Munro wrote:
> > On Thu, Apr 4, 2024 at 10:31 PM Thomas Munro wrote:
> > > Alright what about this?
>
> I think it's probably worth adding a bit more of the commit message to the
> function comment. Yes,
Yeah, I plead benchmarking myopia, sorry. The fastpath as committed
is only reached when distance goes 2->1, as pg_prewarm does. Oops.
With the attached minor rearrangement, it works fine. I also poked
some more at that memory prefetcher. Here are the numbers I got on a
desktop system (Intel i9
On Fri, Apr 5, 2024 at 4:20 AM Melanie Plageman
wrote:
> So, sequential scan does not have per-buffer data. I did some logging
> and the reason most fully-in-SB sequential scans don't use the fast
> path is because read_stream->pending_read_nblocks is always 0.
Hnghghghgh... right, sorry I guesse
Hi,
On 2024-04-04 22:37:39 +1300, Thomas Munro wrote:
> On Thu, Apr 4, 2024 at 10:31 PM Thomas Munro wrote:
> > Alright what about this?
I think it's probably worth adding a bit more of the commit message to the
function comment. Yes, there's a bit in one of the return branches, but that's
not w
On Thu, Apr 4, 2024 at 7:45 AM Thomas Munro wrote:
>
> On Thu, Apr 4, 2024 at 8:02 PM David Rowley wrote:
> > 3a4a3537a
> > latency average = 34.497 ms
> > latency average = 34.538 ms
> >
> > 3a4a3537a + read_stream_for_seqscans.patch
> > latency average = 40.923 ms
> > latency average = 41.415 m
On Thu, Apr 4, 2024 at 3:02 AM David Rowley wrote:
>
> On Thu, 4 Apr 2024 at 16:45, David Rowley wrote:
> > I've pushed the v9-0001 with that rename done.
>
> I've now just pushed the 0002 patch with some revisions:
Thanks!
> 1. The function declarations you added for heapgettup_advance_block()
On Thu, Apr 4, 2024 at 8:02 PM David Rowley wrote:
> 3a4a3537a
> latency average = 34.497 ms
> latency average = 34.538 ms
>
> 3a4a3537a + read_stream_for_seqscans.patch
> latency average = 40.923 ms
> latency average = 41.415 ms
>
> i.e. no meaningful change from the refactor, but a regression fr
On Thu, Apr 4, 2024 at 10:31 PM Thomas Munro wrote:
> Alright what about this?
Forgot to git add a change, new version.
From 6dea2983abf8d608c34e02351d70694de99f25f2 Mon Sep 17 00:00:00 2001
From: Thomas Munro
Date: Thu, 4 Apr 2024 20:31:26 +1300
Subject: [PATCH v2 1/2] Allow BufferAccessStrateg
On Thu, Apr 4, 2024 at 11:13 AM Andres Freund wrote:
> The / 2 is to avoid causing unnecessarily frequent WAL flushes, right? If so,
> should we apply that only if the ring the strategy doesn't use the
> StrategyRejectBuffer() logic?
Hmm, I don't really know, but that sounds plausible. What do y
On Thu, 4 Apr 2024 at 16:45, David Rowley wrote:
> I've pushed the v9-0001 with that rename done.
I've now just pushed the 0002 patch with some revisions:
1. The function declarations you added for heapgettup_advance_block()
and heapgettup_initial_block() didn't match the properties of their
def
On Thu, 4 Apr 2024 at 14:38, Melanie Plageman wrote:
> Looking at it in the code, I am wondering if we should call
> heap_page_prep() heap_scan_page_prep(). Just wondering if it is clear
> that it is prepping a page to be scanned. You choose whatever you
> think is best.
I ended up calling it hea
On Wed, Apr 3, 2024 at 9:08 PM David Rowley wrote:
>
> On Thu, 4 Apr 2024 at 06:03, Melanie Plageman
> wrote:
> > Attached v8 is rebased over current master (which now has the
> > streaming read API).
>
> I've looked at the v8-0001 patch.
Thanks for taking a look!
> I wasn't too keen on heapbu
On Thu, 4 Apr 2024 at 06:03, Melanie Plageman wrote:
> Attached v8 is rebased over current master (which now has the
> streaming read API).
I've looked at the v8-0001 patch.
I wasn't too keen on heapbuildvis() as a function name for an external
function. Since it also does pruning work, it seem
Hi,
On 2024-04-04 09:27:35 +1300, Thomas Munro wrote:
> Anecdotally by all reports I've seen so far, all-in-cache seems to be
> consistently a touch faster than master if anything, for streaming seq
> scan and streaming ANALYZE. That's great!, but it doesn't seem to be
> due to intentional change
On Wed, Apr 3, 2024 at 4:28 PM Thomas Munro wrote:
>
> On Thu, Apr 4, 2024 at 6:03 AM Melanie Plageman
> wrote:
> > On the topic of BAS_BULKREAD buffer access strategy, I think the least
> > we could do is add an assert like this to read_stream_begin_relation()
> > after calculating max_pinned_bu
On Thu, Apr 4, 2024 at 6:03 AM Melanie Plageman
wrote:
> On Tue, Apr 2, 2024 at 1:10 PM Heikki Linnakangas wrote:
> > On 01/04/2024 22:58, Melanie Plageman wrote:
> > > Attached v7 has version 14 of the streaming read API as well as a few
> > > small tweaks to comments and code.
> >
> > I saw ben
On Tue, Apr 2, 2024 at 1:10 PM Heikki Linnakangas wrote:
>
> On 01/04/2024 22:58, Melanie Plageman wrote:
> > Attached v7 has version 14 of the streaming read API as well as a few
> > small tweaks to comments and code.
>
> I saw benchmarks in this thread to show that there's no regression when
> t
On 01/04/2024 22:58, Melanie Plageman wrote:
Attached v7 has version 14 of the streaming read API as well as a few
small tweaks to comments and code.
I saw benchmarks in this thread to show that there's no regression when
the data is in cache, but I didn't see any benchmarks demonstrating the
On Wed, Mar 27, 2024 at 08:47:03PM -0400, Melanie Plageman wrote:
> On Fri, Mar 8, 2024 at 4:56 PM Melanie Plageman
> wrote:
> >
> > On Sat, Mar 02, 2024 at 06:07:48PM -0500, Melanie Plageman wrote:
> > > On Wed, Feb 28, 2024 at 12:30 PM Melanie Plageman
> > > wrote:
> > > >
> > > > On Mon, Feb 2
On Fri, Mar 8, 2024 at 4:56 PM Melanie Plageman
wrote:
>
> On Sat, Mar 02, 2024 at 06:07:48PM -0500, Melanie Plageman wrote:
> > On Wed, Feb 28, 2024 at 12:30 PM Melanie Plageman
> > wrote:
> > >
> > > On Mon, Feb 26, 2024 at 03:56:57PM -0500, Melanie Plageman wrote:
> > > > On Mon, Feb 19, 2024
On Sat, Mar 02, 2024 at 06:07:48PM -0500, Melanie Plageman wrote:
> On Wed, Feb 28, 2024 at 12:30 PM Melanie Plageman
> wrote:
> >
> > On Mon, Feb 26, 2024 at 03:56:57PM -0500, Melanie Plageman wrote:
> > > On Mon, Feb 19, 2024 at 6:05 PM Melanie Plageman
> > > wrote:
> > > >
> > > > On Mon, Jan
On Wed, Feb 28, 2024 at 12:30 PM Melanie Plageman
wrote:
>
> On Mon, Feb 26, 2024 at 03:56:57PM -0500, Melanie Plageman wrote:
> > On Mon, Feb 19, 2024 at 6:05 PM Melanie Plageman
> > wrote:
> > >
> > > On Mon, Jan 29, 2024 at 4:17 PM Melanie Plageman
> > > wrote:
> > > >
> > > > There is an out
On Mon, Feb 26, 2024 at 03:56:57PM -0500, Melanie Plageman wrote:
> On Mon, Feb 19, 2024 at 6:05 PM Melanie Plageman
> wrote:
> >
> > On Mon, Jan 29, 2024 at 4:17 PM Melanie Plageman
> > wrote:
> > >
> > > There is an outstanding question about where to allocate the
> > > PgStreamingRead object f
On Mon, Feb 19, 2024 at 6:05 PM Melanie Plageman
wrote:
>
> On Mon, Jan 29, 2024 at 4:17 PM Melanie Plageman
> wrote:
> >
> > There is an outstanding question about where to allocate the
> > PgStreamingRead object for sequential scans
>
> I've written three alternative implementations of the actu
On Tue, Feb 20, 2024 at 6:13 AM Robert Haas wrote:
>
> On Tue, Feb 20, 2024 at 4:35 AM Melanie Plageman
> wrote:
> > I've written three alternative implementations of the actual streaming
> > read user for sequential scan which handle the question of where to
> > allocate the streaming read objec
On Tue, Feb 20, 2024 at 4:35 AM Melanie Plageman
wrote:
> I've written three alternative implementations of the actual streaming
> read user for sequential scan which handle the question of where to
> allocate the streaming read object and how to handle changing scan
> direction in different ways.
On Mon, Jan 29, 2024 at 4:17 PM Melanie Plageman
wrote:
>
> There is an outstanding question about where to allocate the
> PgStreamingRead object for sequential scans
I've written three alternative implementations of the actual streaming
read user for sequential scan which handle the question of
On Mon, Jan 29, 2024 at 4:24 PM David Rowley wrote:
>
> On Tue, 30 Jan 2024 at 10:17, Melanie Plageman
> wrote:
> > Though logically the performance with 0001 and 0002 should be the same
> > as master (no new non-inline function calls, no additional looping),
> > I've done a bit of profiling anyw
On Tue, 30 Jan 2024 at 10:17, Melanie Plageman
wrote:
> Though logically the performance with 0001 and 0002 should be the same
> as master (no new non-inline function calls, no additional looping),
> I've done a bit of profiling anyway. I created a large multi-GB table,
> read it all into shared b
51 matches
Mail list logo