On 1/14/14, 6:36 PM, Claudio Freire wrote:
On Tue, Jan 14, 2014 at 9:22 PM, Jim Nasby <j...@nasby.net> wrote:
On 1/14/14, 11:30 AM, Jeff Janes wrote:
I think the "reclaim this page if you need memory but leave it resident if
there is no memory pressure" hint would be more useful for temporary working
files than for what was being discussed above (shared buffers). When I do
work that needs large temporary files, I often see physical write IO spike
but physical read IO does not. I interpret that to mean that the temporary
data is being written to disk to satisfy either dirty_expire_centisecs or
dirty_*bytes, but the data remains in the FS cache and so disk reads are not
needed to satisfy it. So a hint would be good that says "this file will never
be fsynced, so please ignore dirty_*bytes and dirty_expire_centisecs. I will
need it again relatively soon (but not after a reboot), and will read it mostly
sequentially, so please don't evict it without need, but if you do need to,
then it is a good candidate".
I also frequently see this, and it has an even larger impact if pgsql_tmp is
on the same filesystem as WAL. *Theoretically* that shouldn't matter with a
BBU (battery-backed) controller, except that when the kernel suddenly decides
your *temporary* data needs to hit the media, you're screwed.
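
For comparison, here is a rough sketch of the hints a process can give for a scratch file today, using posix_fadvise() through Python's os module. It is only an illustration: the function name write_then_reread is made up, and none of the standard POSIX_FADV_* flags expresses "ignore dirty_*bytes and dirty_expire_centisecs", which is the missing piece being asked for above.

```python
import os
import tempfile


def _advise(fd, advice):
    """Pass an access-pattern hint to the kernel; hints are advisory only."""
    if hasattr(os, "posix_fadvise"):
        try:
            # offset=0, length=0 means "from the start to the end of the file"
            os.posix_fadvise(fd, 0, 0, advice)
        except OSError:
            pass  # a filesystem may reject the hint; nothing depends on it


def write_then_reread(payload: bytes) -> bytes:
    """Write a scratch file without fsync, hint the kernel, read it back."""
    with tempfile.TemporaryFile() as f:
        f.write(payload)
        f.flush()  # hand the data to the kernel; deliberately no fsync
        fd = f.fileno()
        if hasattr(os, "POSIX_FADV_SEQUENTIAL"):
            # "I will read this mostly sequentially"
            _advise(fd, os.POSIX_FADV_SEQUENTIAL)
        f.seek(0)
        data = f.read()
        if hasattr(os, "POSIX_FADV_DONTNEED"):
            # "I am done; these cached pages are good eviction candidates"
            _advise(fd, os.POSIX_FADV_DONTNEED)
        return data
```

The two hints cover the "mostly sequential" and "good candidate for eviction" parts of the wish list, but writeback of the dirty pages is still governed by the system-wide dirty_* settings.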
Though, it also occurs to me... perhaps it would be better for us to simply
map temp objects to memory and let the kernel swap them out if needed...
Oum... bad idea.
Swap logic has very poor taste for I/O patterns.
Well, to be honest, so do we. Practically zero in fact...
In fact, the kernel might even be in a better position than we are since you
can presumably count page faults much more cheaply than we can.
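
To make the "map temp objects to memory" idea and the page-fault remark concrete, here is a small sketch, assuming a Unix system. It fills an anonymous mapping (memory the kernel may swap out under pressure) and reads the per-process fault counters from getrusage(). The function name touch_anonymous_memory is hypothetical. Note that user space only sees these aggregate counters after the fact, while the kernel observes each fault as it happens.

```python
import mmap
import resource

PAGE = mmap.PAGESIZE


def touch_anonymous_memory(n_pages: int):
    """Fill an anonymous mapping; return (checksum, minor, major) faults."""
    before = resource.getrusage(resource.RUSAGE_SELF)
    buf = mmap.mmap(-1, n_pages * PAGE)  # fileno -1: not backed by any file
    try:
        for i in range(n_pages):
            buf[i * PAGE] = 1  # first touch makes the kernel supply the page
        checksum = sum(buf[i * PAGE] for i in range(n_pages))
    finally:
        buf.close()
    after = resource.getrusage(resource.RUSAGE_SELF)
    minor = after.ru_minflt - before.ru_minflt  # served without disk I/O
    major = after.ru_majflt - before.ru_majflt  # required reading from disk
    return checksum, minor, major
```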
BTW, if you guys are looking at ARC you should absolutely read the discussion
about that in our archives (http://lnk.nu/postgresql.org/2zeu/ as a starting
point).
We put considerable effort into it, had it in two minor versions, and then
switched to a clock-sweep algorithm that's similar to what FreeBSD used, at
least in the 4.x days. Definitely not claiming what we've got is the best (in
fact, I think we're hurt by not maintaining a real free list), but the ARC info
there is probably valuable.
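
For anyone not familiar with clock-sweep, here is a toy sketch of the general technique. It is not the PostgreSQL implementation: it ignores pinning, locking and dirty buffers, and the class name, method names and the max_usage default are choices made for this illustration.

```python
class ClockSweepCache:
    """Toy fixed-size buffer pool using clock-sweep replacement."""

    def __init__(self, n_buffers, max_usage=5):
        self.keys = [None] * n_buffers   # page held by each buffer slot
        self.usage = [0] * n_buffers     # per-slot usage counter
        self.index = {}                  # page key -> slot number
        self.hand = 0                    # next slot the sweep will examine
        self.max_usage = max_usage       # cap so hot pages cannot stay forever

    def access(self, key):
        """Touch a page; return True on a hit, False if it had to be loaded."""
        slot = self.index.get(key)
        if slot is not None:
            self.usage[slot] = min(self.usage[slot] + 1, self.max_usage)
            return True
        slot = self._find_victim()
        old = self.keys[slot]
        if old is not None:
            del self.index[old]          # evict the previous occupant
        self.keys[slot] = key
        self.usage[slot] = 1
        self.index[key] = slot
        return False

    def _find_victim(self):
        """Advance the hand, decrementing counters, until one is at zero."""
        while True:
            slot = self.hand
            self.hand = (self.hand + 1) % len(self.keys)
            if self.usage[slot] == 0:
                return slot
            self.usage[slot] -= 1
```

Because an empty slot is only found by sweeping, a structure like this has no separate free list, which is the kind of trade-off mentioned above.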
--
Jim C. Nasby, Data Architect j...@nasby.net
512.569.9461 (cell) http://jim.nasby.net