Jonah H. Harris wrote:
> fadvise is a kludge.

I don't think it's a kludge at all. posix_fadvise() is a pretty nice and clean interface for hinting to the kernel which pages you're going to access in the near future. I can't immediately come up with a cleaner interface to do that.

Compared to async I/O, it's a helluva lot simpler to add a few posix_fadvise() calls to an application than to switch to a completely different paradigm. And because posix_fadvise() is just a hint, the OS can prioritize accordingly, whereas all async I/O requests look the same.

> While it will help, it still makes us completely reliant on the OS.

That's not a bad thing in my opinion. The OS knows the I/O hardware, disk layout, utilization, and so forth, and is in a much better position to do I/O scheduling than a user process. The only advantage a user process has is that it knows better what pages it's going to need, and posix_fadvise() is a good interface to let the user process tell the kernel that.
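
To illustrate how little code that takes, here is a minimal sketch, not anything from the PostgreSQL tree. The names hint_blocks() and read_block() and the 8192-byte BLOCK_SIZE are made up for the example; the only real interfaces used are posix_fadvise() with POSIX_FADV_WILLNEED and pread(). The process passes along the block numbers it knows it will need, and the kernel decides when and how to fetch them.

```c
#ifndef _XOPEN_SOURCE
#define _XOPEN_SOURCE 600   /* expose posix_fadvise() and pread() */
#endif
#include <assert.h>
#include <fcntl.h>
#include <sys/types.h>
#include <unistd.h>

#define BLOCK_SIZE 8192     /* assumed block size, for illustration only */

/*
 * Hypothetical helper: hint the kernel that the listed blocks will be
 * read soon.  Returns 0 on success, or the first error number reported.
 */
int
hint_blocks(int fd, const long *blocks, int nblocks)
{
    assert(blocks != NULL || nblocks == 0);

    for (int i = 0; i < nblocks; i++)
    {
        off_t offset = (off_t) blocks[i] * BLOCK_SIZE;

        /* posix_fadvise() returns the error number; it does not set errno */
        int rc = posix_fadvise(fd, offset, BLOCK_SIZE, POSIX_FADV_WILLNEED);

        if (rc != 0)
            return rc;
    }
    return 0;
}

/*
 * Hypothetical helper: an ordinary synchronous read of one block.  The
 * read path is the same as it would be without any hints.
 */
ssize_t
read_block(int fd, long blockno, char *buf)
{
    return pread(fd, buf, BLOCK_SIZE, (off_t) blockno * BLOCK_SIZE);
}
```

A caller would invoke hint_blocks() for the blocks it is about to visit, carry on with other work, and later call read_block() exactly as before. If the kernel ignores the hint, the reads still succeed; they are just not prefetched.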

> IIRC, we currently have support for rings in the buffer pool, which we could read directly into.

The rings won't help you a bit. They're just a different way to choose victim buffers.

--
  Heikki Linnakangas
  EnterpriseDB   http://www.enterprisedb.com
