Re: index prefetching

2025-12-08 Thread Amit Langote
On Fri, Dec 5, 2025 at 6:11 AM Peter Geoghegan wrote: > On Thu, Dec 4, 2025 at 12:54 AM Amit Langote wrote: > > I want to acknowledge that figuring out the right layering to make I/O > > prefetching and perhaps other optimizations internal to IndexNext() > > work is obviously the priority right n

Re: index prefetching

2025-12-04 Thread Peter Geoghegan
Hi Amit, On Thu, Dec 4, 2025 at 12:54 AM Amit Langote wrote: > I want to acknowledge that figuring out the right layering to make I/O > prefetching and perhaps other optimizations internal to IndexNext() > work is obviously the priority right now, regardless of the output > format used to populat

Re: index prefetching

2025-12-03 Thread Amit Langote
Hi Peter, On Mon, Dec 1, 2025 at 10:24 AM Peter Geoghegan wrote: > On Mon, Nov 10, 2025 at 6:59 PM Peter Geoghegan wrote: > > The new tentative plan is to cut scope by focussing on switching over > > to the new index AM + table AM interface from the patch in the short > > term, for Postgres 19.

Re: index prefetching

2025-12-01 Thread Peter Geoghegan
On Mon, Dec 1, 2025 at 11:32 AM Tomas Vondra wrote: > Thanks for the new version! I like the layering in this patch, moving > some of the stuff from indexam.c/executor to table AM. It makes some of > the code much cleaner, I think. Yeah, I think so too. Clearly the main way that this improves the

Re: index prefetching

2025-12-01 Thread Tomas Vondra
On 12/1/25 02:23, Peter Geoghegan wrote: > On Mon, Nov 10, 2025 at 6:59 PM Peter Geoghegan wrote: >> The new tentative plan is to cut scope by focussing on switching over >> to the new index AM + table AM interface from the patch in the short >> term, for Postgres 19. > > Attached patch makes the

Re: index prefetching

2025-11-24 Thread Peter Geoghegan
On Mon, Nov 24, 2025 at 3:48 PM Andres Freund wrote: > Huh. I wouldn't have expected -march=native to make a huge difference... Me neither. On the other hand I find that this area is quite sensitive to icache misses and branch misprediction penalties. This is partly due to my holding the patch to

Re: index prefetching

2025-11-24 Thread Andres Freund
Hi, On 2025-11-23 19:03:44 -0500, Peter Geoghegan wrote: > On Fri, Nov 21, 2025 at 6:31 PM Andres Freund wrote: > > On 2025-11-21 18:14:56 -0500, Peter Geoghegan wrote: > > > On Fri, Nov 21, 2025 at 5:38 PM Andres Freund wrote: > > > > Another benfit is that it helps even more when there multipl

Re: index prefetching

2025-11-23 Thread Peter Geoghegan
On Fri, Nov 21, 2025 at 6:31 PM Andres Freund wrote: > On 2025-11-21 18:14:56 -0500, Peter Geoghegan wrote: > > On Fri, Nov 21, 2025 at 5:38 PM Andres Freund wrote: > > > Another benfit is that it helps even more when there multiple queries > > > running > > > concurrently - the high rate of loc

Re: index prefetching

2025-11-21 Thread Andres Freund
Hi, On 2025-11-21 18:14:56 -0500, Peter Geoghegan wrote: > On Fri, Nov 21, 2025 at 5:38 PM Andres Freund wrote: > > Another benfit is that it helps even more when there multiple queries > > running > > concurrently - the high rate of lock/unlock on the buffer rather badly hurts > > scalability.

Re: index prefetching

2025-11-21 Thread Peter Geoghegan
On Fri, Nov 21, 2025 at 5:38 PM Andres Freund wrote: > Another benfit is that it helps even more when there multiple queries running > concurrently - the high rate of lock/unlock on the buffer rather badly hurts > scalability. I haven't noticed that effect myself. In fact, it seemed to be the oth

Re: index prefetching

2025-11-21 Thread Andres Freund
Hi, On 2025-11-10 18:59:07 -0500, Peter Geoghegan wrote: > Tomas and I had a meeting on Friday to discuss a way forward with this > project. Progress has stalled, and we feel that now is a good time to > pivot by refactoring the patch into smaller, independently > useful/committable pieces. Makes

Re: index prefetching

2025-11-14 Thread Peter Geoghegan
On Wed, Nov 12, 2025 at 12:39 PM Tomas Vondra wrote: > I think I generally agree with what you said here about the challenges, > although it's a bit too abstract to respond to individual parts. I just > don't know how to rework the design to resolve this ... I'm trying to identify which subsets o

Re: index prefetching

2025-11-12 Thread Tomas Vondra
On 11/11/25 00:59, Peter Geoghegan wrote: > On Sun, Nov 2, 2025 at 6:49 PM Peter Geoghegan wrote: >> Nothing really new here (I've been >> working on batching on the table AM side, but nothing to show on that >> just yet). > > Tomas and I had a meeting on Friday to discuss a way forward with this

Re: index prefetching

2025-11-10 Thread Peter Geoghegan
On Sun, Nov 2, 2025 at 6:49 PM Peter Geoghegan wrote: > Nothing really new here (I've been > working on batching on the table AM side, but nothing to show on that > just yet). Tomas and I had a meeting on Friday to discuss a way forward with this project. Progress has stalled, and we feel that no

Re: index prefetching

2025-11-02 Thread Tomas Vondra
On 11/3/25 00:49, Peter Geoghegan wrote: > On Sun, Oct 12, 2025 at 2:52 PM Peter Geoghegan wrote: >> Attached in a new revision, mostly just to keep CFBot happy following >> a recent trivial conflict introduced on master. > > Attached is another revision, also just to keep CFBot happy followin

Re: index prefetching

2025-09-15 Thread Tomas Vondra
On 9/15/25 17:12, Peter Geoghegan wrote: > On Mon, Sep 15, 2025 at 9:00 AM Tomas Vondra wrote: >> Yeah, this heuristics seems very effective in eliminating the regression >> (at least judging by the test results I've seen so far). Two or three >> question bother me about it, though: > > I more or

Re: index prefetching

2025-09-15 Thread Peter Geoghegan
On Mon, Sep 15, 2025 at 9:00 AM Tomas Vondra wrote: > Yeah, this heuristics seems very effective in eliminating the regression > (at least judging by the test results I've seen so far). Two or three > question bother me about it, though: I more or less agree with all of your concerns about the IN

Re: index prefetching

2025-09-07 Thread Peter Geoghegan
On Wed, Sep 3, 2025 at 4:06 PM Andres Freund wrote: > The issue to me is that this kind of query actually *can* substantially > benefit from prefetching, no? As far as I can tell, not really, no. > Afaict the performance without prefetching is > rather atrocious as soon as a) storage has a tad h

Re: index prefetching

2025-09-04 Thread Tomas Vondra
On 9/3/25 22:06, Andres Freund wrote: > ... > > I continue to be worried that we're optimizing for queries that have no > real-world relevance. The regression afaict is contingent on > > 1) An access pattern that is unpredictable to the CPU (due to the use of >random() as part of ORDER BY duri

Re: index prefetching

2025-09-03 Thread Peter Geoghegan
On Wed, Sep 3, 2025 at 8:16 PM Andres Freund wrote: > > I don't see that level of improvement with DIO. For me it's 6054.921 > > ms with prefetching, 8766.287 ms without it. > > I guess your SSD has lower latency than mine... It's nothing special: a 4 year old Samsung 980 pro. > This actually mi

Re: index prefetching

2025-09-03 Thread Andres Freund
Hi, On 2025-09-03 16:25:56 -0400, Peter Geoghegan wrote: > On Wed, Sep 3, 2025 at 4:06 PM Andres Freund wrote: > > The issue to me is that this kind of query actually *can* substantially > > benefit from prefetching, no? > > As far as I can tell, not really, no. It seems to here - I see small wi

Re: index prefetching

2025-09-03 Thread Peter Geoghegan
On Wed, Sep 3, 2025 at 2:47 PM Andres Freund wrote: > I still don't think I fully understand why the impact of this is so large. The > branch misses appear to be the only thing differentiating the two cases, but > with resowners neutralized, the remaining difference in branch misses seems > too la

Re: index prefetching

2025-09-03 Thread Andres Freund
Hi, On 2025-09-03 15:33:30 -0400, Peter Geoghegan wrote: > On Wed, Sep 3, 2025 at 2:47 PM Andres Freund wrote: > > I still don't think I fully understand why the impact of this is so large. > > The > > branch misses appear to be the only thing differentiating the two cases, but > > with resowner

Re: index prefetching

2025-09-03 Thread Andres Freund
Hi, I spent a fair bit more time analyzing this issue. On 2025-08-28 21:10:48 -0400, Andres Freund wrote: > On 2025-08-28 19:57:17 -0400, Peter Geoghegan wrote: > > On Thu, Aug 28, 2025 at 7:52 PM Tomas Vondra wrote: > > I'm not sure that Thomas'/your patch to ameliorate the problem on the > >

Re: index prefetching

2025-08-28 Thread Thomas Munro
On Fri, Aug 29, 2025 at 11:52 AM Tomas Vondra wrote: > True. But one worker did show up in top, using a fair amount of CPU, so > why wouldn't the others (if they process the same stream)? It deliberately concentrates wakeups into the lowest numbered workers that are marked idle in a bitmap. * hi

Re: index prefetching

2025-08-28 Thread Peter Geoghegan
On Thu, Aug 28, 2025 at 9:10 PM Andres Freund wrote: > Same. Tomas, could you share what you applied? Tomas posted a self-contained patch to the list about an hour ago? > > I'm not sure that Thomas'/your patch to ameliorate the problem on the > > read stream side is essential here. Perhaps Andr

Re: index prefetching

2025-08-28 Thread Andres Freund
Hi, On 2025-08-28 19:57:17 -0400, Peter Geoghegan wrote: > On Thu, Aug 28, 2025 at 7:52 PM Tomas Vondra wrote: > > Use this branch: > > > > https://github.com/tvondra/postgres/commits/index-prefetch-master/ > > > > and then Thomas' patch that increases the prefetch distance: > > > > > > https:/

Re: index prefetching

2025-08-28 Thread Tomas Vondra
On 8/29/25 01:57, Peter Geoghegan wrote: > On Thu, Aug 28, 2025 at 7:52 PM Tomas Vondra wrote: >> Use this branch: >> >> https://github.com/tvondra/postgres/commits/index-prefetch-master/ >> >> and then Thomas' patch that increases the prefetch distance: >> >> >> https://www.postgresql.org/mes

Re: index prefetching

2025-08-28 Thread Peter Geoghegan
On Thu, Aug 28, 2025 at 7:52 PM Tomas Vondra wrote: > Use this branch: > > https://github.com/tvondra/postgres/commits/index-prefetch-master/ > > and then Thomas' patch that increases the prefetch distance: > > > https://www.postgresql.org/message-id/CA%2BhUKGL2PhFyDoqrHefqasOnaXhSg48t1phs3VM8BA

Re: index prefetching

2025-08-28 Thread Tomas Vondra
On 8/29/25 01:27, Andres Freund wrote: > Hi, > > On 2025-08-29 01:00:58 +0200, Tomas Vondra wrote: >> I'm not sure how to determine what concurrency it "wants". All I know is >> that for "warm" runs [1], the basic index prefetch patch uses distance >> ~2.0 on average, and is ~2x slower than master

Re: index prefetching

2025-08-28 Thread Tomas Vondra
On 8/28/25 21:52, Andres Freund wrote: > Hi, > > On 2025-08-28 19:08:40 +0200, Tomas Vondra wrote: >> On 8/28/25 18:16, Andres Freund wrote: So I think the IPC overhead with "worker" can be quite significant, especially for cases with distance=1. I don't think it's a major issue

Re: index prefetching

2025-08-28 Thread Andres Freund
Hi, On 2025-08-29 01:00:58 +0200, Tomas Vondra wrote: > I'm not sure how to determine what concurrency it "wants". All I know is > that for "warm" runs [1], the basic index prefetch patch uses distance > ~2.0 on average, and is ~2x slower than master. And with the patches the > distance is ~270, a

Re: index prefetching

2025-08-28 Thread Peter Geoghegan
On Thu, Aug 28, 2025 at 7:01 PM Tomas Vondra wrote: > I'm not sure how to determine what concurrency it "wants". All I know is > that for "warm" runs [1], the basic index prefetch patch uses distance > ~2.0 on average, and is ~2x slower than master. And with the patches the > distance is ~270, and

Re: index prefetching

2025-08-28 Thread Tomas Vondra
On 8/28/25 23:50, Thomas Munro wrote: > On Fri, Aug 29, 2025 at 7:52 AM Andres Freund wrote: >> On 2025-08-28 19:08:40 +0200, Tomas Vondra wrote: >>> From the 2x regression (compared to master) it might seem like that, but >>> even with the increased distance it's still slower than master (by 2

Re: index prefetching

2025-08-28 Thread Thomas Munro
On Fri, Aug 29, 2025 at 7:52 AM Andres Freund wrote: > On 2025-08-28 19:08:40 +0200, Tomas Vondra wrote: > > From the 2x regression (compared to master) it might seem like that, but > > even with the increased distance it's still slower than master (by 25%). So > > maybe the "error" is to use AIO

Re: index prefetching

2025-08-28 Thread Andres Freund
Hi, On 2025-08-28 19:08:40 +0200, Tomas Vondra wrote: > On 8/28/25 18:16, Andres Freund wrote: > >> So I think the IPC overhead with "worker" can be quite significant, > >> especially for cases with distance=1. I don't think it's a major issue > >> for PG18, because seq/bitmap scans are unlikely t

Re: index prefetching

2025-08-28 Thread Tomas Vondra
On 8/28/25 18:16, Andres Freund wrote: > Hi, > > On 2025-08-28 14:45:24 +0200, Tomas Vondra wrote: >> On 8/26/25 17:06, Tomas Vondra wrote: >> I kept thinking about this, and in the end I decided to try to measure >> this IPC overhead. The backend/ioworker communicate by sending signals, >> so I w

Re: index prefetching

2025-08-28 Thread Andres Freund
Hi, On 2025-08-28 14:45:24 +0200, Tomas Vondra wrote: > On 8/26/25 17:06, Tomas Vondra wrote: > I kept thinking about this, and in the end I decided to try to measure > this IPC overhead. The backend/ioworker communicate by sending signals, > so I wrote a simple C program that does "signal echo" w

Re: index prefetching

2025-08-28 Thread Andres Freund
Hi, On 2025-08-26 17:06:11 +0200, Tomas Vondra wrote: > On 8/26/25 01:48, Andres Freund wrote: > > Hi, > > > > On 2025-08-25 15:00:39 +0200, Tomas Vondra wrote: > >> Thanks. Based on the testing so far, the patch seems to be a substantial > >> improvement. What's needed to make this prototype com

Re: index prefetching

2025-08-28 Thread Tomas Vondra
On 8/26/25 17:06, Tomas Vondra wrote: > > > On 8/26/25 01:48, Andres Freund wrote: >> Hi, >> >> On 2025-08-25 15:00:39 +0200, Tomas Vondra wrote: >>> >>> ... >>> >>> I'm not sure what's causing this, but almost all regressions my script >>> is finding look like this - always io_method=worker, wi

Re: index prefetching

2025-08-26 Thread Tomas Vondra
On 8/26/25 01:48, Andres Freund wrote: > Hi, > > On 2025-08-25 15:00:39 +0200, Tomas Vondra wrote: >> Thanks. Based on the testing so far, the patch seems to be a substantial >> improvement. What's needed to make this prototype committable? > > Mainly some testing infrastructure that can trigg

Re: index prefetching

2025-08-26 Thread Tomas Vondra
On 8/26/25 03:08, Peter Geoghegan wrote: > On Mon Aug 25, 2025 at 10:18 AM EDT, Tomas Vondra wrote: >> The attached patch is a PoC implementing this. The core idea is that if >> we measure "miss probability" for a chunk of requests, we can use that >> to estimate the distance needed to generate e_i

Re: index prefetching

2025-08-25 Thread Peter Geoghegan
On Mon Aug 25, 2025 at 10:18 AM EDT, Tomas Vondra wrote: > The attached patch is a PoC implementing this. The core idea is that if > we measure "miss probability" for a chunk of requests, we can use that > to estimate the distance needed to generate e_i_c IOs. I noticed an assertion failure when t

Re: index prefetching

2025-08-25 Thread Andres Freund
Hi, On 2025-08-25 15:00:39 +0200, Tomas Vondra wrote: > Thanks. Based on the testing so far, the patch seems to be a substantial > improvement. What's needed to make this prototype committable? Mainly some testing infrastructure that can trigger this kind of stream. The logic is too finnicky for

Re: index prefetching

2025-08-25 Thread Peter Geoghegan
On Mon, Aug 25, 2025 at 2:33 PM Tomas Vondra wrote: > Right. I might have expressed it more clearly, but this is what I meant > when I said priorbatch is not causing this. Cool. > As for priorbatch, I'd still like to know where does the overhead come > from. I mean, what's the expensive part of

Re: index prefetching

2025-08-25 Thread Tomas Vondra
On 8/25/25 19:57, Peter Geoghegan wrote: > On Mon, Aug 25, 2025 at 10:18 AM Tomas Vondra wrote: >> Almost all regressions (at least the top ones) now look like this, i.e. >> distance collapses to ~2.0, which essentially disables prefetching. > > Good to know. > >> But I no longer think it's c

Re: index prefetching

2025-08-25 Thread Peter Geoghegan
On Mon, Aug 25, 2025 at 10:18 AM Tomas Vondra wrote: > Almost all regressions (at least the top ones) now look like this, i.e. > distance collapses to ~2.0, which essentially disables prefetching. Good to know. > But I no longer think it's caused by the "priorbatch" optimization, > which delays

Re: index prefetching

2025-08-25 Thread Tomas Vondra
On 8/25/25 17:43, Thomas Munro wrote: > On Tue, Aug 26, 2025 at 2:18 AM Tomas Vondra wrote: >> Of course, this can happen even with other hit ratios, there's nothing >> special about 50%. > > Right, that's what this patch was attacking directly, basically only > giving up when misses are so spars

Re: index prefetching

2025-08-25 Thread Tomas Vondra
On 8/25/25 16:18, Tomas Vondra wrote: > ... > > But with more hits, the hit/miss ratio simply determines the "stable" > distance. Let's say there's 80% hits, so 4 hits to 1 miss. Then the > stable distance is ~4, because we get a miss, double to 8, and then 4 > hits, so the distance drops back

Re: index prefetching

2025-08-25 Thread Thomas Munro
On Tue, Aug 26, 2025 at 2:18 AM Tomas Vondra wrote: > Of course, this can happen even with other hit ratios, there's nothing > special about 50%. Right, that's what this patch was attacking directly, basically only giving up when misses are so sparse we can't do anything about it for an ordered s

Re: index prefetching

2025-08-25 Thread Tomas Vondra
On 8/20/25 00:27, Peter Geoghegan wrote: > On Tue, Aug 19, 2025 at 2:22 PM Peter Geoghegan wrote: >> That definitely seems like a problem. I think that you're saying that >> this problem happens because we have extra buffer hits earlier on, >> which is enough to completely change the ramp-up behav

Re: index prefetching

2025-08-25 Thread Tomas Vondra
On 8/15/25 17:09, Andres Freund wrote: > Hi, > > On 2025-08-14 19:36:49 -0400, Andres Freund wrote: >> On 2025-08-14 17:55:53 -0400, Peter Geoghegan wrote: >>> On Thu, Aug 14, 2025 at 5:06 PM Peter Geoghegan wrote: > We can optimize that by deferring the StartBufferIO() if we're > encoun

Re: index prefetching

2025-08-19 Thread Peter Geoghegan
On Tue, Aug 19, 2025 at 2:22 PM Peter Geoghegan wrote: > That definitely seems like a problem. I think that you're saying that > this problem happens because we have extra buffer hits earlier on, > which is enough to completely change the ramp-up behavior. This seems > to be all it takes to dramat

Re: index prefetching

2025-08-19 Thread Peter Geoghegan
On Tue, Aug 19, 2025 at 1:23 PM Tomas Vondra wrote: > Thanks for investigating this. I think it's the right direction - simple > OLTP queries should not be paying for building read_stream when there's > little chance of benefit. > > Unfortunately, this seems to be causing regressions, both compare

Re: index prefetching

2025-08-19 Thread Tomas Vondra
On 8/17/25 19:30, Peter Geoghegan wrote: > On Thu, Aug 14, 2025 at 10:12 PM Peter Geoghegan wrote: >> As far as I know, we only have the following unambiguous performance >> regressions (that clearly need to be fixed): >> >> 1. This issue. >> >> 2. There's about a 3% loss of throughput on pgbench

Re: index prefetching

2025-08-17 Thread Peter Geoghegan
On Thu, Aug 14, 2025 at 10:12 PM Peter Geoghegan wrote: > As far as I know, we only have the following unambiguous performance > regressions (that clearly need to be fixed): > > 1. This issue. > > 2. There's about a 3% loss of throughput on pgbench SELECT. Update: I managed to fix the performance

Re: index prefetching

2025-08-15 Thread Peter Geoghegan
On Fri, Aug 15, 2025 at 3:45 PM Andres Freund wrote: > > My shared_buffers is 16GB, with pgbench scale 300. > > So there's actually no IO, given that a scale 300 is something like 4.7GB? In > that case my patch could really not make a difference, neither of the changed > branches would ever be rea

Re: index prefetching

2025-08-15 Thread Andres Freund
Hi, On 2025-08-15 15:42:10 -0400, Peter Geoghegan wrote: > On Fri, Aug 15, 2025 at 3:38 PM Andres Freund wrote: > > I see absolutely no effect of the patch with shared_buffers=1GB and a > > read-only scale 200 pgbench at 40 clients. What data sizes, shared buffers > > etc. were you testing? > >

Re: index prefetching

2025-08-15 Thread Peter Geoghegan
On Fri, Aug 15, 2025 at 3:38 PM Andres Freund wrote: > I see absolutely no effect of the patch with shared_buffers=1GB and a > read-only scale 200 pgbench at 40 clients. What data sizes, shared buffers > etc. were you testing? Just to be clear: you are testing with both the index prefetching patc

Re: index prefetching

2025-08-15 Thread Andres Freund
Hi, On 2025-08-15 15:31:47 -0400, Peter Geoghegan wrote: > On Fri, Aug 15, 2025 at 3:28 PM Andres Freund wrote: > > >I'm not worried about it. Andres' "not waiting for already-in-progress > > >IO" patch was clearly just a prototype. Just thought it was worth > > >noting here. > > > > Are you conf

Re: index prefetching

2025-08-15 Thread Andres Freund
Hi, On August 15, 2025 3:25:50 PM EDT, Peter Geoghegan wrote: >On Thu, Aug 14, 2025 at 10:12 PM Peter Geoghegan wrote: >> As far as I know, we only have the following unambiguous performance >> regressions (that clearly need to be fixed): >> >> 1. This issue. >> >> 2. There's about a 3% loss of

Re: index prefetching

2025-08-15 Thread Peter Geoghegan
On Fri, Aug 15, 2025 at 3:28 PM Andres Freund wrote: > >I'm not worried about it. Andres' "not waiting for already-in-progress > >IO" patch was clearly just a prototype. Just thought it was worth > >noting here. > > Are you confident in that? Because the patch should be extremely cheap in > that

Re: index prefetching

2025-08-15 Thread Peter Geoghegan
On Thu, Aug 14, 2025 at 10:12 PM Peter Geoghegan wrote: > As far as I know, we only have the following unambiguous performance > regressions (that clearly need to be fixed): > > 1. This issue. > > 2. There's about a 3% loss of throughput on pgbench SELECT. I did a quick pgbench SELECT benchmark a

Re: index prefetching

2025-08-15 Thread Peter Geoghegan
On Fri, Aug 15, 2025 at 1:23 PM Andres Freund wrote: > Somewhat random note about I/O waits: > > Unfortunately the I/O wait time we measure often massively *over* estimate the > actual I/O time. If I execute the above query with the patch applied, we > actually barely ever wait for I/O to complete

Re: index prefetching

2025-08-15 Thread Peter Geoghegan
On Fri, Aug 15, 2025 at 1:09 PM Andres Freund wrote: > On 2025-08-15 12:29:25 -0400, Peter Geoghegan wrote: > > FWIW, this development probably completely changes the results of many > > (all?) of your benchmark queries. My guess is that with Andres' patch, > > things will be better across the boa

Re: index prefetching

2025-08-15 Thread Andres Freund
Hi, On 2025-08-15 12:24:40 -0400, Peter Geoghegan wrote: > With bufmgr patch > - > > ┌─┐ > │ QUERY PLAN │ > ├─┤

Re: index prefetching

2025-08-15 Thread Andres Freund
Hi, Glad to see that the prototype does fix the issue for you. On 2025-08-15 12:29:25 -0400, Peter Geoghegan wrote: > FWIW, this development probably completely changes the results of many > (all?) of your benchmark queries. My guess is that with Andres' patch, > things will be better across the

Re: index prefetching

2025-08-15 Thread Peter Geoghegan
On Fri, Aug 15, 2025 at 12:24 PM Peter Geoghegan wrote: > Good news here: with Andres' bufmgr patch applied, the similar forwards scan > query does indeed get more than 2x faster. And I don't mean that it gets > faster on the randomized table -- it actually gets 2x faster with your > original (al

Re: index prefetching

2025-08-15 Thread Peter Geoghegan
On Thu Aug 14, 2025 at 7:26 PM EDT, Tomas Vondra wrote: >> My guess is that once we fix the underlying problem, we'll see >> improved performance for many different types of queries. Not as big >> of a benefit as the one that the broken query will get, but still >> enough to matter. >> > > Hopeful

Re: index prefetching

2025-08-15 Thread Andres Freund
Hi, On 2025-08-14 19:36:49 -0400, Andres Freund wrote: > On 2025-08-14 17:55:53 -0400, Peter Geoghegan wrote: > > On Thu, Aug 14, 2025 at 5:06 PM Peter Geoghegan wrote: > > > > We can optimize that by deferring the StartBufferIO() if we're > > > > encountering a > > > > buffer that is undergoing

Re: index prefetching

2025-08-14 Thread Peter Geoghegan
On Thu, Aug 14, 2025 at 7:26 PM Tomas Vondra wrote: > Good. I admit I lost track of which the various regressions may affect > existing plans, and which are specific to the prefetch patch. As far as I know, we only have the following unambiguous performance regressions (that clearly need to be fi

Re: index prefetching

2025-08-14 Thread Thomas Munro
On Fri, Aug 15, 2025 at 1:47 PM Thomas Munro wrote: > (rather than introducing a secondary reference > counting scheme in the WAL that I think you might be describing?), and s/WAL/read stream/

Re: index prefetching

2025-08-14 Thread Thomas Munro
On Fri, Aug 15, 2025 at 11:21 AM Tomas Vondra wrote: > I don't recall all the details, but IIRC my impression was it'd be best > to do this "caching" entirely in the read_stream.c (so the next_block > callbacks would probably not need to worry about lastBlock at all), > enabled when creating the s

Re: index prefetching

2025-08-14 Thread Andres Freund
Hi, On 2025-08-14 17:55:53 -0400, Peter Geoghegan wrote: > On Thu, Aug 14, 2025 at 5:06 PM Peter Geoghegan wrote: > > > We can optimize that by deferring the StartBufferIO() if we're > > > encountering a > > > buffer that is undergoing IO, at the cost of some complexity. I'm not > > > sure > >

Re: index prefetching

2025-08-14 Thread Tomas Vondra
On 8/15/25 01:05, Peter Geoghegan wrote: > On Thu, Aug 14, 2025 at 6:24 PM Tomas Vondra wrote: >> FWIW I'm not claiming this explains all odd things we're investigating >> in this thread, it's more a confirmation that the scan direction may >> matter if it translates to direction at the device lev

Re: index prefetching

2025-08-14 Thread Tomas Vondra
On 8/14/25 23:55, Peter Geoghegan wrote: > On Thu, Aug 14, 2025 at 5:06 PM Peter Geoghegan wrote: >> If this same mechanism remembered (say) the last 2 heap blocks it >> requested, that might be enough to totally fix this particular >> problem. This isn't a serious proposal, but it'll be simple en

Re: index prefetching

2025-08-14 Thread Peter Geoghegan
On Thu, Aug 14, 2025 at 6:24 PM Tomas Vondra wrote: > FWIW I'm not claiming this explains all odd things we're investigating > in this thread, it's more a confirmation that the scan direction may > matter if it translates to direction at the device level. I don't think > it can explain the strange

Re: index prefetching

2025-08-14 Thread Tomas Vondra
On 8/14/25 01:19, Andres Freund wrote: > Hi, > > On 2025-08-14 01:11:07 +0200, Tomas Vondra wrote: >> On 8/13/25 23:57, Peter Geoghegan wrote: >>> On Wed, Aug 13, 2025 at 5:19 PM Tomas Vondra wrote: It's also not very surprising this happens with backwards scans more. The I/O is apparen

Re: index prefetching

2025-08-14 Thread Peter Geoghegan
On Thu, Aug 14, 2025 at 5:06 PM Peter Geoghegan wrote: > If this same mechanism remembered (say) the last 2 heap blocks it > requested, that might be enough to totally fix this particular > problem. This isn't a serious proposal, but it'll be simple enough to > implement. Hopefully when I do that

Re: index prefetching

2025-08-14 Thread Peter Geoghegan
On Thu, Aug 14, 2025 at 4:44 PM Andres Freund wrote: > Interesting. In the sequential case I see some waits that are not attributed > in explain, due to the waits happening within WaitIO(), not WaitReadBuffers(). > Which indicates that the read stream is trying to re-read a buffer that > previousl

Re: index prefetching

2025-08-14 Thread Andres Freund
Hi, On 2025-08-14 15:45:26 -0400, Peter Geoghegan wrote: > On Thu, Aug 14, 2025 at 3:15 PM Peter Geoghegan wrote: > > Then why does the exact same pair of runs show "I/O Timings: shared > > read=194.629" for the sequential table backwards scan (with total > > execution time 1132.360 ms), versus "

Re: index prefetching

2025-08-14 Thread Peter Geoghegan
On Thu Aug 14, 2025 at 3:41 PM EDT, Andres Freund wrote: > Hm, that is somewhat curious. > > I wonder if there's some wait time that's not being captured by "I/O > Timings". A first thing to do would be to just run strace --summary-only while > running the query, and see if there are syscall wait t

Re: index prefetching

2025-08-14 Thread Peter Geoghegan
On Thu, Aug 14, 2025 at 3:15 PM Peter Geoghegan wrote: > Then why does the exact same pair of runs show "I/O Timings: shared > read=194.629" for the sequential table backwards scan (with total > execution time 1132.360 ms), versus "I/O Timings: shared read=352.88" > (with total execution time 697.

Re: index prefetching

2025-08-14 Thread Andres Freund
Hi, On 2025-08-14 15:15:02 -0400, Peter Geoghegan wrote: > On Thu, Aug 14, 2025 at 2:53 PM Andres Freund wrote: > > I think this is just an indicator of being IO bound. > > Then why does the exact same pair of runs show "I/O Timings: shared > read=194.629" for the sequential table backwards scan

Re: index prefetching

2025-08-14 Thread Andres Freund
Hi, On 2025-08-14 15:30:16 -0400, Peter Geoghegan wrote: > On Thu Aug 14, 2025 at 3:15 PM EDT, Peter Geoghegan wrote: > > On Thu, Aug 14, 2025 at 2:53 PM Andres Freund wrote: > >> I think this is just an indicator of being IO bound. > > > > Then why does the exact same pair of runs show "I/O Timi

Re: index prefetching

2025-08-14 Thread Peter Geoghegan
On Thu Aug 14, 2025 at 3:15 PM EDT, Peter Geoghegan wrote: > On Thu, Aug 14, 2025 at 2:53 PM Andres Freund wrote: >> I think this is just an indicator of being IO bound. > > Then why does the exact same pair of runs show "I/O Timings: shared > read=194.629" for the sequential table backwards scan

Re: index prefetching

2025-08-14 Thread Peter Geoghegan
On Thu, Aug 14, 2025 at 2:53 PM Andres Freund wrote: > I think this is just an indicator of being IO bound. Then why does the exact same pair of runs show "I/O Timings: shared read=194.629" for the sequential table backwards scan (with total execution time 1132.360 ms), versus "I/O Timings: share

Re: index prefetching

2025-08-14 Thread Andres Freund
Hi, On 2025-08-14 14:44:44 -0400, Peter Geoghegan wrote: > On Thu Aug 14, 2025 at 1:57 PM EDT, Peter Geoghegan wrote: > > The only interesting thing about the flame graph is just how little > > difference there seems to be (at least for this particular perf event > > type). > > I captured method_i

Re: index prefetching

2025-08-14 Thread Peter Geoghegan
On Thu Aug 14, 2025 at 1:57 PM EDT, Peter Geoghegan wrote: > The only interesting thing about the flame graph is just how little > difference there seems to be (at least for this particular perf event > type). I captured method_io_uring.c DEBUG output from running each query in the server log, in

Re: index prefetching

2025-08-14 Thread Peter Geoghegan
On Wed Aug 13, 2025 at 8:59 PM EDT, Tomas Vondra wrote: > On 8/14/25 01:50, Peter Geoghegan wrote: >> I first made the order of the table random, except among groups of index >> tuples >> that have exactly the same value. Those will still point to the same 1 or 2 >> heap >> blocks in virtually al

Re: index prefetching

2025-08-13 Thread Peter Geoghegan
On Wed Aug 13, 2025 at 7:50 PM EDT, Peter Geoghegan wrote: > pg@regression:5432 [2476413]=# EXPLAIN (ANALYZE ,costs off, timing off) > SELECT * FROM t WHERE a BETWEEN 16336 AND 49103 ORDER BY a desc; > ┌─┐ > │

Re: index prefetching

2025-08-13 Thread Peter Geoghegan
On Wed, Aug 13, 2025 at 8:59 PM Tomas Vondra wrote: > I investigated this from a different angle, by tracing the I/O request > generated. using perf-trace. And the patterns are massively different. I tried a similar approach myself, using a variety of tools. That didn't get me very far. > So, Q1

Re: index prefetching

2025-08-13 Thread Tomas Vondra
On 8/14/25 01:50, Peter Geoghegan wrote: > On Wed Aug 13, 2025 at 5:19 PM EDT, Tomas Vondra wrote: >> I did investigate this, and I don't think there's anything broken in >> read_stream. It happens because ReadStream has a concept of "ungetting" >> a block, which can happen after hitting some I/

Re: index prefetching

2025-08-13 Thread Thomas Munro
On Thu, Aug 14, 2025 at 9:19 AM Tomas Vondra wrote: > I did investigate this, and I don't think there's anything broken in > read_stream. It happens because ReadStream has a concept of "ungetting" > a block, which can happen after hitting some I/O limits. > > In that case we "remember" the last bl

Re: index prefetching

2025-08-13 Thread Peter Geoghegan
On Wed, Aug 13, 2025 at 7:51 PM Peter Geoghegan wrote: > Apparently random I/O is twice as fast as sequential I/O in descending order! > In > fact, this test case creates the appearance of random I/O being at least > slightly faster than sequential I/O for pages read in _ascending_ order! > > Obv

Re: index prefetching

2025-08-13 Thread Peter Geoghegan
On Wed Aug 13, 2025 at 5:19 PM EDT, Tomas Vondra wrote: > I did investigate this, and I don't think there's anything broken in > read_stream. It happens because ReadStream has a concept of "ungetting" > a block, which can happen after hitting some I/O limits. > > In that case we "remember" the last

Re: index prefetching

2025-08-13 Thread Tomas Vondra
On 8/13/25 23:36, Peter Geoghegan wrote: > On Wed, Aug 13, 2025 at 1:01 PM Tomas Vondra wrote: >> This seems rather bizarre, considering the two tables are exactly the >> same, except that in t2 the first column is negative, and the rows are >> fixed-length. Even heap_page_items says the tables ar

Re: index prefetching

2025-08-13 Thread Andres Freund
Hi, On 2025-08-14 01:11:07 +0200, Tomas Vondra wrote: > On 8/13/25 23:57, Peter Geoghegan wrote: > > On Wed, Aug 13, 2025 at 5:19 PM Tomas Vondra wrote: > >> It's also not very surprising this happens with backwards scans more. > >> The I/O is apparently much slower (due to missing OS prefetch),

Re: index prefetching

2025-08-13 Thread Tomas Vondra
On 8/13/25 23:57, Peter Geoghegan wrote: > On Wed, Aug 13, 2025 at 5:19 PM Tomas Vondra wrote: >> It's also not very surprising this happens with backwards scans more. >> The I/O is apparently much slower (due to missing OS prefetch), so we're >> much more likely to hit the I/O limits (max_ios

Re: index prefetching

2025-08-13 Thread Andres Freund
Hi, On 2025-08-14 00:23:49 +0200, Tomas Vondra wrote: > On 8/13/25 23:37, Andres Freund wrote: > > On 2025-08-13 23:07:07 +0200, Tomas Vondra wrote: > >> On 8/13/25 16:44, Andres Freund wrote: > >>> On 2025-08-13 14:15:37 +0200, Tomas Vondra wrote: > In fact, I believe this is about io_method

  1   2   3   4   >