Gregory Stark wrote:
I want to know if we're interested in the more invasive patch restructuring
the buffer manager. My feeling is that we probably are, eventually. But I
wonder if people wouldn't feel more comfortable taking baby steps at first,
which would have less impact in cases where the feature isn't being heavily used.

It's a lot more work to do the invasive patch and there's not much point in
doing it if we're not interested in taking it when it's ready.

I like the simplicity of this patch. My biggest worry is the impact on performance when the posix_fadvise calls are not helping. I'd like to see some testing of that. How expensive are the extra bufmgr hash table lookups, compared to all the other stuff that's involved in a bitmap heap scan?
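
To be concrete about where that cost sits, this is roughly the shape of call I have in mind. It's only a sketch: buffer_table_contains() is my stand-in for the real shared-buffer hash probe, not anything from the patch, and the function name and signature are made up.

#define _XOPEN_SOURCE 600
#include <fcntl.h>              /* posix_fadvise(), POSIX_FADV_WILLNEED */
#include <stdbool.h>

/* Stand-in for a probe of the shared-buffer hash table; not from the patch. */
extern bool buffer_table_contains(long block_num);

/*
 * Sketch of an AdviseBuffer-style helper: probe the buffer table first,
 * and only ask the kernel to prefetch the block if it isn't already
 * cached.  The hash probe is the fixed cost we pay even when the advice
 * never ends up helping.
 */
static void
advise_block(int fd, long block_size, long block_num)
{
    if (buffer_table_contains(block_num))
        return;                 /* already in shared buffers, nothing to do */

    /* Hint the OS to start reading the block in the background. */
    (void) posix_fadvise(fd,
                         (off_t) block_num * block_size,
                         (off_t) block_size,
                         POSIX_FADV_WILLNEED);
}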

Another thing I wonder is whether this approach can easily be extended to other types of scans besides bitmap heap scans. Normal index scans seem the most interesting; they can generate a lot of random I/O. I don't see any big problems there, except again the worst-case performance: if you naively issue AdviseBuffer calls for all the heap tuples on an index page, you can end up issuing hundreds of posix_fadvise calls. But what if they all actually point to the same heap page? In that case you surely want to go through the buffer manager only once (like we already do in the scan itself, thanks to ReleaseAndReadBuffer()), and call posix_fadvise only once. But again, maybe you can convince me that it's a non-issue with some benchmarking.
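
Something along these lines is what I mean for the index scan case; again just a sketch, where advise_block() is the stand-in helper from above and skipping consecutive duplicates is merely the cheapest way to catch the common clustered case.

#include <stdint.h>
#include <stddef.h>

void advise_block(int fd, long block_size, long block_num);  /* sketch above */

/*
 * Sketch only: given the heap block numbers referenced by the tuples on
 * one index page, in index order, advise each distinct block once rather
 * than once per tuple.
 */
static void
advise_heap_blocks(int fd, long block_size,
                   const uint32_t *blocks, size_t nblocks)
{
    uint32_t last = UINT32_MAX;         /* sentinel: nothing advised yet */

    for (size_t i = 0; i < nblocks; i++)
    {
        if (blocks[i] == last)
            continue;                   /* same heap page as previous tuple */
        last = blocks[i];

        advise_block(fd, block_size, (long) blocks[i]);
    }
}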

Here's an updated patch. It's mainly just a CVS update but it also includes
some window dressing: an autoconf test, some documentation, and some fancy
arithmetic to auto-tune the amount of prefetching based on a more real-world
parameter "effective_spindle_count".

I like the "effective_spindle_count". That's very intuitive.

The formula to turn that into preread_pages looks complex, though. I can see the reasoning behind it, but I doubt things behave that beautifully in the real world. (That's not important right now; we have plenty of time to tweak it after more testing.)
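
For the archives, my rough reading of the arithmetic is something like the following. I may well be misreading the patch, so treat the harmonic-sum idea as my guess at the intent rather than what the code actually does.

/*
 * My reading (possibly wrong) of the spindle-count -> prefetch-distance
 * arithmetic: to keep n spindles busy, prefetch about n times the n-th
 * harmonic number of pages, so 1 spindle -> 1 page, 4 -> ~8, 10 -> ~29.
 */
static int
preread_pages_for_spindles(int spindles)
{
    double target = 0.0;

    for (int i = 1; i <= spindles; i++)
        target += (double) spindles / (double) i;   /* n * (1 + 1/2 + ... + 1/n) */

    return (int) (target + 0.5);        /* round to the nearest whole page */
}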

It also includes printing out how much the prefetching target got ramped up
to in the explain output -- I'm not sure we really want that in the tree,
but it's nice for performance testing.

I don't understand the ramp up logic. Can you explain what that's for and how it works?

--
  Heikki Linnakangas
  EnterpriseDB   http://www.enterprisedb.com
