On 19.06.2018 14:03, Tomas Vondra wrote:
On 06/19/2018 11:08 AM, Konstantin Knizhnik wrote:
On 18.06.2018 23:47, Andres Freund wrote:
On 2018-06-18 16:44:09 -0400, Robert Haas wrote:
On Sat, Jun 16, 2018 at 3:41 PM, Andres Freund <and...@anarazel.de> wrote:
The posix_fadvise approach is not perfect, no doubt about that. But it
works pretty well for bitmap heap scans, and it's about 13249x better
(rough estimate) than the current solution (no prefetching).
Sure, but investing in an architecture we know might not live long also
has its cost. Especially if it's not that complicated to do better.
My guesses are:
- Using OS prefetching is a very small patch.
- Prefetching into shared buffers is a much bigger patch.
Why? The majority of the work is standing up a bgworker that does
prefetching (i.e. reads WAL, figures out reads not in s_b, does
prefetch). Allowing a configurable number + some synchronization
between them isn't that much more work.
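
To make the "reads WAL, figures out reads not in s_b, does prefetch" step concrete, here is a minimal sketch in Python. It is not PostgreSQL code: BlockRef, blocks_to_prefetch and the `cached` set standing in for a shared-buffers lookup are all hypothetical names chosen for illustration.

```python
from typing import Iterable, Iterator, NamedTuple, Set


class BlockRef(NamedTuple):
    """Hypothetical stand-in for a block reference decoded from a WAL record."""
    relfile: str
    blockno: int


def blocks_to_prefetch(wal_refs: Iterable[BlockRef],
                       cached: Set[BlockRef]) -> Iterator[BlockRef]:
    """Yield each referenced block once, skipping blocks already cached.

    `cached` is an assumption: it models "is this block in s_b?" as a
    simple set membership test.
    """
    seen: Set[BlockRef] = set()
    for ref in wal_refs:
        if ref in cached or ref in seen:
            continue  # already in the buffer pool or already requested
        seen.add(ref)
        yield ref
```

A worker would then hand each yielded block to whatever prefetch mechanism is chosen (OS hint or a read into shared buffers).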
I do not think that prefetching into shared buffers requires much more
effort or makes the patch more invasive...
It even simplifies it somewhat, because there is no need to maintain a
separate cache of prefetched pages...
But it will definitely have much more impact on Postgres performance:
contention for buffer locks, evicting pages accessed by
read-only queries,...
Also, there are two points which make prefetching into shared buffers
more complex:
1. We need to spawn multiple workers to prefetch in parallel and
somehow distribute the work between them.
2. We need to synchronize the recovery process with the prefetchers, to
prevent prefetch from running too far ahead and doing useless work.
The same problem exists for prefetching into the OS cache, but there the
risk of a false prefetch is less critical.
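
Point 2 above can be sketched as a bounded prefetch distance. This is only an illustration under assumptions: PrefetchWindow and max_distance are hypothetical names, and positions are counted in WAL records rather than LSNs to keep the example short.

```python
class PrefetchWindow:
    """Keeps the prefetcher at most `max_distance` records ahead of replay."""

    def __init__(self, max_distance: int) -> None:
        self.max_distance = max_distance  # hypothetical tuning knob
        self.replayed = 0    # records already applied by recovery
        self.prefetched = 0  # records whose blocks were already requested

    def can_prefetch(self) -> bool:
        # Stop once the prefetcher is a full window ahead of recovery.
        return self.prefetched - self.replayed < self.max_distance

    def note_prefetched(self) -> None:
        self.prefetched += 1

    def note_replayed(self) -> None:
        self.replayed += 1
        # If recovery overtakes the prefetcher, skip ahead instead of
        # prefetching blocks that were already replayed.
        if self.prefetched < self.replayed:
            self.prefetched = self.replayed
```

With several workers, the two counters would have to live in shared memory and be updated under a lock or atomically; that part is omitted here.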
I think the main challenge here is that all buffer reads are currently
synchronous (correct me if I'm wrong), while posix_fadvise()
allows us to prefetch the buffers asynchronously.
Yes, this is why we have to spawn several concurrent background workers
to perform the prefetch.
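
For reference, the asynchronous-hint pattern discussed above looks roughly like this when written against Python's os module. It is a sketch, not the PostgreSQL implementation: BLOCK_SIZE and prefetch_then_read are illustrative names, and the hasattr() guard is there because os.posix_fadvise is not available on every platform.

```python
import os

BLOCK_SIZE = 8192  # assumed page size for the example


def prefetch_then_read(path, block_numbers):
    """Hint all blocks to the kernel first, then read them synchronously."""
    fd = os.open(path, os.O_RDONLY)
    try:
        if hasattr(os, "posix_fadvise"):
            for blockno in block_numbers:
                # Advisory only: returns immediately, and the kernel may
                # start reading the range in the background.
                os.posix_fadvise(fd, blockno * BLOCK_SIZE, BLOCK_SIZE,
                                 os.POSIX_FADV_WILLNEED)
        # The reads themselves are still blocking; they are just more
        # likely to be served from the OS page cache.
        return [os.pread(fd, BLOCK_SIZE, blockno * BLOCK_SIZE)
                for blockno in block_numbers]
    finally:
        os.close(fd)
```

The single process can queue many hints before it blocks on the first read, which is the property a synchronous read into shared buffers does not have.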
I don't think simply spawning a couple of bgworkers to prefetch
buffers is going to be equal to async prefetch, unless we support some
sort of async I/O. Maybe something has changed recently, but every
time I looked for good portable async I/O API/library I got burned.
Now, maybe a couple of bgworkers prefetching buffers synchronously
would be good enough for WAL prefetching - after all, we only need to
prefetch data fast enough for the recovery not to wait. But I doubt
it's going to be good enough for bitmap heap scans, for example.
We need a prefetch that allows filling the I/O queues with hundreds of
requests, and I don't think sync prefetch from a handful of bgworkers
can achieve that.
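
The queue-depth argument above can be illustrated with a small experiment. In this sketch threads stand in for bgworkers and do_read is a caller-supplied blocking read; both are assumptions made for the example, as is the helper name peak_in_flight.

```python
import threading
from concurrent.futures import ThreadPoolExecutor


def peak_in_flight(num_workers, num_requests, do_read):
    """Run blocking reads on a fixed pool and report the peak concurrency."""
    lock = threading.Lock()
    in_flight = 0
    peak = 0

    def task(i):
        nonlocal in_flight, peak
        with lock:
            in_flight += 1
            peak = max(peak, in_flight)
        try:
            do_read(i)  # blocks this worker until the read completes
        finally:
            with lock:
                in_flight -= 1

    with ThreadPoolExecutor(max_workers=num_workers) as pool:
        list(pool.map(task, range(num_requests)))
    return peak
```

Whatever the number of requests, the returned peak never exceeds num_workers, because each synchronous worker has at most one read outstanding. Reaching hundreds of queued requests this way would need hundreds of workers.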
regards
--
Konstantin Knizhnik
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company