On Mon, 2014-01-13 at 14:32 -0600, Jim Nasby wrote:
> On 1/13/14, 2:27 PM, Claudio Freire wrote:
> > On Mon, Jan 13, 2014 at 5:23 PM, Jim Nasby <j...@nasby.net> wrote:
> >> On 1/13/14, 2:19 PM, Claudio Freire wrote:
> >>> On Mon, Jan 13, 2014 at 5:15 PM, Robert Haas <robertmh...@gmail.com>
> >>> wrote:
> >>>>
> >>>> On a related note, there's also the problem of double-buffering. When
> >>>> we read a page into shared_buffers, we leave a copy behind in the OS
> >>>> buffers, and similarly on write-out. It's very unclear what to do
> >>>> about this, since the kernel and PostgreSQL don't have intimate
> >>>> knowledge of what each other are doing, but it would be nice to solve
> >>>> somehow.
> >>>
> >>> There you have a much harder algorithmic problem.
> >>>
> >>> You can basically control duplication with fadvise and WONTNEED. The
> >>> problem here is not the kernel and whether or not it allows postgres
> >>> to be smart about it. The problem is... what kind of smarts
> >>> (algorithm) to use.
> >>
> >> Isn't this a fairly simple matter of when we read a page into shared
> >> buffers tell the kernel do forget that page? And a corollary to that
> >> for when we dump a page out of shared_buffers (here kernel, please put
> >> this back into your cache).
> >
> > That's my point. In terms of kernel-postgres interaction, it's fairly
> > simple.
> >
> > What's not so simple, is figuring out what policy to use. Remember,
> > you cannot tell the kernel to put some page in its page cache without
> > reading it or writing it. So, once you make the kernel forget a page,
> > evicting it from shared buffers becomes quite expensive.
>
> Well, if we were to collaborate with the kernel community on this then
> presumably we can do better than that for eviction... even to the
> extent of "here's some data from this range in this file. It's
> (clean|dirty). Put it in your cache. Just trust me on this."

This should be the madvise() interface (with MADV_WILLNEED and
MADV_DONTNEED). Is there something in that interface that is
insufficient?

James

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
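
For anyone who wants to poke at the two hint interfaces named in this
thread, here is a minimal sketch. It assumes Linux and Python 3.8 or
later (for mmap.madvise); Python is used only for brevity, and the
helper names (drop_from_os_cache, prefetch_into_os_cache,
read_page_dropping_os_copy, demo) are hypothetical, not PostgreSQL
functions. Note that the WILLNEED hints ask the kernel to read the
range from the file itself; neither call hands the kernel a page from
the process's own memory, which is the limitation discussed above.

```python
import mmap
import os
import tempfile

PAGE = mmap.PAGESIZE  # system page size, used here as the "block" size


def drop_from_os_cache(fd, offset, length):
    """Hint that the kernel may discard its cached copy of this file range."""
    os.posix_fadvise(fd, offset, length, os.POSIX_FADV_DONTNEED)


def prefetch_into_os_cache(fd, offset, length):
    """Hint that the range will be needed; the kernel reads it from the file."""
    os.posix_fadvise(fd, offset, length, os.POSIX_FADV_WILLNEED)


def read_page_dropping_os_copy(fd, page_no):
    """Read one page into our own memory, then ask the kernel to forget it."""
    offset = page_no * PAGE
    data = os.pread(fd, PAGE, offset)
    drop_from_os_cache(fd, offset, PAGE)
    return data


def demo():
    """Exercise both interfaces on a scratch file; returns the two reads."""
    with tempfile.TemporaryFile() as f:
        for i in range(4):
            f.write(bytes([i]) * PAGE)  # page i is filled with byte value i
        f.flush()
        os.fsync(f.fileno())  # DONTNEED only discards clean pages
        fd = f.fileno()

        page = read_page_dropping_os_copy(fd, 2)

        # "Putting it back" is only a request to re-read from the file.
        prefetch_into_os_cache(fd, 2 * PAGE, PAGE)

        # The madvise() equivalents operate on a mapping of the file.
        with mmap.mmap(fd, 0, access=mmap.ACCESS_READ) as m:
            m.madvise(mmap.MADV_WILLNEED, 2 * PAGE, PAGE)
            m.madvise(mmap.MADV_DONTNEED, 2 * PAGE, PAGE)
            mapped = m[2 * PAGE:3 * PAGE]  # faults the page back in
        return page, mapped
```

Both calls are advisory, so the kernel is free to ignore them, and
nothing here settles the policy question of when dropping the OS copy
is worth the later re-read.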