On Mon, Jan 21, 2013 at 1:52 AM, Jeff Davis <pg...@j-davis.com> wrote:
>> Of course, there is an argument that this patch will
>> simplify the code, but I'm not sure if its enough to justify the
>> additional contention which may or may not show up in the benchmarks
>> we are running, but we know its there.
>
> What additional contention? How do you know it's there?
We know it's there because every additional page that you access has to
be looked up and locked, and the memory that contains it brought into
the CPU cache. If you look up and lock more pages, you WILL have more
contention - both for memory, and for locks - and more runtime cost.
Whether or not you can currently detect that contention and/or runtime
cost experimentally is another matter. Failure to do so could indicate
either that you haven't got the right test case - Heikki seems to have
found one that works - or that it's being masked by other things, but
might not always be.

To raise an example which I believe is similar to this one, consider
Kevin Grittner's work on SSI. He set himself a goal that SSI should
impose a performance regression of no more than 2% - and he met that
goal, at the time the code was committed. But then my read scalability
work during the 9.2 cycle blew the lid off of read performance for
certain workloads, and now SSI is FAR slower on those workloads. His
code always had a scalability problem, but you couldn't measure it
experimentally before I did that work, because there were equally bad
scalability problems elsewhere. Now there aren't, and you can, and I
have. We might well have refused to commit that patch with the locking
scheme it uses today if my scalability work had been done first - or
perhaps we would have decided that the case where the difference is
large is too narrow to be worth worrying about, but I think it would at
least have provoked discussion.

My biggest concern about the visibility map is that it will act as a
focal point for contention. Map half a gigabyte of heap down to 1 page
of visibility map (with one all-visible bit per 8KB heap page, a single
8KB VM page covers about 64K heap pages, or 512MB of heap; see the
sketch at the end of this mail) and it's easy to think that you could
have some situation in which that page gets very, very hot. We know
that cache line stealing is a significant cost of ProcArray
manipulation, which is why the PGXACT patch makes such a difference in
write throughput. Now your argument seems to be that we can't measure
any such effect here, but maybe we're just not measuring the right
thing.

Heikki's example reminded me that I was concerned about the cost of
repeatedly pinning different VM buffers during an index-only scan, if
the relation were very large. That seems easy to justify on the grounds
that you're saving so much I/O and memory traffic that the pins will
seem cheap by comparison ... but they aren't, always. IIRC, an
index-only scan on a fully-cached relation is not necessarily faster
than a regular index scan, even if the heap is entirely all-visible.

I realize these examples aren't directly applicable to this case, so
I'm not sure whether they help to advance the discussion or not. I
offer them only as evidence that simple tests don't always manage to
capture the real costs and benefits of a feature or optimization, and
that predicting the system-wide effects of changes in this area is
hard. For a large benefit I think we would perhaps bite the bullet and
take our chances, but in my (human and imperfect) judgment the benefit
here is not large enough to justify it.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
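For concreteness, here is a tiny standalone C sketch of the
visibility-map arithmetic referenced above. This is illustrative code,
not PostgreSQL source: it assumes the stock 8KB block size and one
all-visible bit per heap page, and it ignores the VM page's own header,
which makes the real coverage slightly smaller.

#include <stdio.h>

#define BLCKSZ 8192                     /* default PostgreSQL page size */

/* One all-visible bit per heap page, so one VM page's worth of bits
 * covers BLCKSZ * 8 heap pages (page header overhead ignored). */
#define HEAPPAGES_PER_VMPAGE (BLCKSZ * 8)

int
main(void)
{
	unsigned long heap_bytes_per_vm_page =
		(unsigned long) HEAPPAGES_PER_VMPAGE * BLCKSZ;

	printf("heap pages covered by one VM page: %d\n",
		   HEAPPAGES_PER_VMPAGE);
	printf("heap covered by one VM page: %lu bytes (%lu MB)\n",
		   heap_bytes_per_vm_page,
		   heap_bytes_per_vm_page / (1024 * 1024));

	/* Which VM page tracks a given heap block?  Every heap block in a
	 * 512MB range maps to the same VM page - the potential hot spot. */
	unsigned long heapBlk = 100000;
	printf("heap block %lu is tracked by VM page %lu\n",
		   heapBlk, heapBlk / HEAPPAGES_PER_VMPAGE);
	return 0;
}

Compiled with any C compiler, this prints 65536 heap pages and 512MB of
heap per VM page, which is where the "half a gigabyte down to 1 page"
figure comes from.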