On 9/20/24 16:21, Peter Geoghegan wrote: > On Fri, Sep 20, 2024 at 9:45 AM Tomas Vondra <to...@vondra.me> wrote: >> 3) restart cluster, drop caches >> >> 4) run the query from the SQL script >> >> I suspect you don't do (3). I didn't mention this explicitly, my message >> only said "with uncached data", so maybe that's the problem? > > You're right that I didn't do step 3 here. I'm generally in the habit > of using fully cached data when testing this kind of work. > > The only explanation I can think of is that (at least on your > hardware) OS readahead helps the master branch more than skipping > helps the patch. That's surprising, but I guess it's possible here > because skip scan only needs to access about every third page. And > because this particular index was generated by CREATE INDEX, and so > happens to have a strong correlation between key space order and > physical block order. And probably because this is an index-only scan. >
Good idea. Yes, it does seem to be due to readahead - if I disable that, the query takes ~320ms on master and ~280ms with the patch. >> I wasn't suggesting it's a sympathetic case for skipscan. My point is >> that it perfectly matches the costing assumptions, i.e. columns are >> independent etc. But if it's not sympathetic, maybe the cost shouldn't >> be 1/5 of cost from master? > > The costing is pretty accurate if we assume cached data, though -- > which is what the planner will actually assume. In any case, is that > really the only problem you see here? That the costing might be > inaccurate because it fails to account for some underlying effect, > such as the influence of OS readhead? > > Let's assume for a moment that the regression is indeed due to > readahead effects, and that we deem it to be unacceptable. What can be > done about it? I have a really hard time thinking of a fix, since by > most conventional measures skip scan is indeed much faster here. > It does seem to be due to readahead, and the costing not accounting for these effects. And I don't think it's unacceptable - I don't think we consider readahead elsewhere, and it certainly is not something I'd expect this patch to fix. So I think it's fine. Ultimately, I think this should be "fixed" by explicitly prefetching pages. My index prefetching patch won't really help, because AFAIK this is about index pages. And I don't know how feasible it is. regards -- Tomas Vondra