On 06/19/2018 11:08 AM, Konstantin Knizhnik wrote:


On 18.06.2018 23:47, Andres Freund wrote:
On 2018-06-18 16:44:09 -0400, Robert Haas wrote:
On Sat, Jun 16, 2018 at 3:41 PM, Andres Freund <and...@anarazel.de> wrote:
The posix_fadvise approach is not perfect, no doubt about that. But it
works pretty well for bitmap heap scans, and it's about 13249x better
(rough estimate) than the current solution (no prefetching).
Sure, but investing in an architecture we know might not live long also
has its cost. Especially if it's not that complicated to do better.
My guesses are:

- Using OS prefetching is a very small patch.
- Prefetching into shared buffers is a much bigger patch.
Why?  The majority of the work is standing up a bgworker that does
prefetching (i.e. reads WAL, figures out reads not in s_b, does
prefetch). Allowing a configurable number + some synchronization between
them isn't that much more work.

I do not think that prefetching into shared buffers requires much more effort or makes the patch more invasive... It even simplifies it somewhat, because there is no need to maintain a separate cache of prefetched pages... But it will definitely have a much bigger impact on Postgres performance: contention for buffer locks, eviction of pages accessed by read-only queries,...

Also, there are two points which make prefetching into shared buffers more complex: 1. We need to spawn multiple workers to prefetch in parallel and somehow distribute the work between them. 2. We need to synchronize the recovery process with the prefetch, to prevent the prefetch from running too far ahead and doing useless work. The same problem exists for prefetching into the OS cache, but there the risk of a false prefetch is less critical.
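
To make point 2 concrete, here is a minimal sketch of a look-ahead limit, assuming the prefetcher and the recovery process each publish a WAL position somewhere both can read it. The names WalPos, prefetch_may_advance and max_distance are hypothetical and do not come from any patch in this thread; a real implementation would also need proper synchronization around the shared positions.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Assumption: a WAL position is a plain byte offset into the WAL stream. */
typedef uint64_t WalPos;

/*
 * Hypothetical check: may the prefetcher process the next record?
 * It may whenever it is level with or behind replay, and otherwise only
 * while it is less than max_distance bytes ahead of the replay position.
 */
bool
prefetch_may_advance(WalPos replay_pos, WalPos prefetch_pos,
                     uint64_t max_distance)
{
    assert(max_distance > 0);

    if (prefetch_pos <= replay_pos)
        return true;            /* not ahead at all, so nothing to limit */

    return (prefetch_pos - replay_pos) < max_distance;
}
```

A prefetch worker would call this before each record and sleep (or wait on a latch) when it returns false. Choosing max_distance is the trade-off mentioned above: too small and recovery waits on I/O, too large and prefetched pages may be evicted before they are used.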


I think the main challenge here is that all buffer reads are currently synchronous (correct me if I'm wrong), while posix_fadvise() allows us to prefetch the buffers asynchronously.
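
For illustration, a rough sketch of what such an asynchronous hint looks like at the system-call level. It assumes Linux or another platform that provides posix_fadvise(), and an 8kB block size; the helper names prefetch_block and prefetch_blocks and the BLOCK_SIZE constant are made up for this example and are not PostgreSQL functions.

```c
#define _XOPEN_SOURCE 600       /* expose posix_fadvise() in <fcntl.h> */
#include <assert.h>
#include <fcntl.h>
#include <sys/types.h>

#define BLOCK_SIZE 8192         /* assumed block size for this sketch */

/*
 * Hypothetical helper: tell the kernel that block blkno of the file behind
 * fd will be needed soon. The call only queues a hint and does not wait for
 * the data. Returns 0 on success, or an error number on failure
 * (posix_fadvise() returns the error directly instead of setting errno).
 */
int
prefetch_block(int fd, long blkno)
{
    assert(blkno >= 0);
    return posix_fadvise(fd, (off_t) blkno * BLOCK_SIZE, BLOCK_SIZE,
                         POSIX_FADV_WILLNEED);
}

/*
 * Hypothetical helper: issue hints for n blocks back to back, so that many
 * requests can be outstanding before the first read() is issued.
 * Returns the number of hints the kernel accepted.
 */
int
prefetch_blocks(int fd, const long *blknos, int n)
{
    int issued = 0;

    for (int i = 0; i < n; i++)
    {
        if (prefetch_block(fd, blknos[i]) == 0)
            issued++;
    }
    return issued;
}
```

The point of the sketch is that a single process can have many hints outstanding at once, whereas a process doing synchronous reads has exactly one request in flight at a time.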

I don't think simply spawning a couple of bgworkers to prefetch buffers is going to be equal to async prefetch, unless we support some sort of async I/O. Maybe something has changed recently, but every time I looked for a good portable async I/O API/library I got burned.

Now, maybe a couple of bgworkers prefetching buffers synchronously would be good enough for WAL prefetching - after all, we only need to prefetch data fast enough for the recovery not to wait. But I doubt it's going to be good enough for bitmap heap scans, for example.

We need a prefetch that allows filling the I/O queues with hundreds of requests, and I don't think sync prefetch from a handful of bgworkers can achieve that.

regards

--
Tomas Vondra                  http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
