On Fri, Jun 3, 2016 at 2:09 AM, Andres Freund <and...@anarazel.de> wrote:
> On 2016-06-03 01:57:33 -0400, Noah Misch wrote:
>> > Which means that transactional workloads that are bigger than the OS
>> > memory, or which have a non-uniform distribution leading to some
>> > locality, are likely to be faster. In practice those are *hugely* more
>> > likely than the uniform distribution that pgbench has.
>>
>> That is formally true; non-benchmark workloads rarely issue uniform writes.
>> However, enough non-benchmark workloads have too little locality to benefit
>> from caches. Those will struggle against *_flush_after like uniform writes
>> do, so discounting uniform writes wouldn't simplify this project.
>
> But such workloads rarely will hit the point of constantly re-dirtying
> already dirty pages in kernel memory within 30s.

I don't know why not. It's not exactly uncommon to update the same data
frequently, nor is it uncommon for the hot data set to be larger than
shared_buffers and smaller than the OS cache, even significantly
smaller. Any workload of that type is going to have this problem
regardless of whether the access pattern is uniform. If you have a
highly non-uniform access pattern then you just have this problem on
the small subset of the data that is hot.

I think that asserting that there's something wrong with this test is
just wrong. Many people have done many tests very similar to this one
on Linux systems over many years to assess PostgreSQL performance. It's
a totally legitimate test configuration.

Indeed, I'd argue that this is actually a pretty common real-world
scenario. Most people's hot data fits in memory, because if it doesn't,
their performance sucks so badly that they either redesign something or
buy more memory until it does. Also, most people have more hot data
than shared_buffers. There are some who don't because their data set is
very small, and that's nice when it happens; and there are others who
don't because they carefully crank shared_buffers up high enough that
everything fits. But most don't, either because it causes other
problems, or because they just don't think to tinker with it, or
because they set it up that way initially but then the data grows over
time. There are a LOT of people running with 8GB or less of
shared_buffers and a working set that is in the tens of GB.

Now, what varies IME is how much total RAM there is in the system and
how frequently they write that data, as opposed to reading it. If they
are on a tightly RAM-constrained system, then this situation won't
arise because they won't be under the dirty background limit. And if
they aren't writing that much data then they'll be fine too.
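
To make those conditions concrete, here is a rough sketch of the regime
being described. Everything in it is an illustrative assumption rather
than a measurement: the Workload fields, the kernel_absorbs_rewrites
helper, and the example sizes are hypothetical, and the dirty background
limit is approximated as a fraction of total RAM, whereas Linux applies
vm.dirty_background_ratio to available memory.

```python
from dataclasses import dataclass

GB = 1024 ** 3


@dataclass
class Workload:
    """Hypothetical description of a system; all fields are assumptions."""
    ram_bytes: int
    shared_buffers_bytes: int
    hot_set_bytes: int
    # Rough estimate of dirty data sitting in the kernel page cache.
    dirty_in_page_cache_bytes: int
    # Linux defaults vm.dirty_background_ratio to 10 (percent). Treating
    # it as a fraction of total RAM is a simplification.
    dirty_background_ratio: float = 0.10


def kernel_absorbs_rewrites(w: Workload) -> bool:
    """True if the workload sits in the regime discussed above: the hot
    set overflows shared_buffers, still fits in the OS cache, and the
    dirty data stays under the background writeback threshold."""
    overflows_shared_buffers = w.hot_set_bytes > w.shared_buffers_bytes
    fits_in_os_cache = w.hot_set_bytes < w.ram_bytes - w.shared_buffers_bytes
    background_limit = w.dirty_background_ratio * w.ram_bytes
    under_background_limit = w.dirty_in_page_cache_bytes < background_limit
    return (overflows_shared_buffers
            and fits_in_os_cache
            and under_background_limit)


# Plenty of RAM: 8GB shared_buffers, 30GB hot set, 4GB dirty at a time.
roomy = Workload(64 * GB, 8 * GB, 30 * GB, 4 * GB)

# Tighter RAM: the same 4GB of dirty data now exceeds the background
# limit, so the kernel is already writing it back.
tight = Workload(36 * GB, 8 * GB, 20 * GB, 4 * GB)

print(kernel_absorbs_rewrites(roomy))
print(kernel_absorbs_rewrites(tight))
```

The first configuration is the kind of system where forcing writes out
early can hurt, because the kernel would otherwise have absorbed
repeated updates to the same pages. The second is the RAM-constrained
case, where the situation doesn't arise.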

But even putting all of that together I really don't see why you're
trying to suggest that this is some bizarre set of circumstances that
should only rarely happen in the real world. I think it clearly does
happen, and I doubt it's particularly uncommon. If your testing didn't
discover this scenario, I feel rather strongly that that's an oversight
in your testing rather than a problem with the scenario.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)