On Fri, Sep 29, 2023 at 8:50 PM Peter Geoghegan <p...@bowt.ie> wrote:
> While pgbench makes a fine stress-test, for the most part its workload
> is highly unrealistic. And yet we seem to think that it's just about
> the most important benchmark of all. If we're not willing to get over
> even small regressions in pgbench, I fear we'll never make significant
> progress in this area. It's at least partly a cultural problem IMV.
I think it's true that the fact that pgbench does what pgbench does makes us think more about that workload than about some other, equally plausible workload. It's the test we have, so we end up running it a lot. If we had some other test, we'd run that one. But I don't think I buy that it's "highly unrealistic." It is true that pgbench with default options is limited only by the speed of the database, while real update-heavy workloads are typically limited by something external, like the number of users hitting the web site that the database is backing. It's also true that real workloads tend to involve some level of non-uniform access. (Both of those conditions are easy enough to approximate with pgbench itself; there's a rough sketch at the end of this message.) But I'm not sure that either of those things really matters that much in the end.

The problem I have with the external rate-limiting argument is that it ignores hardware selection, architectural decision-making, and workload growth. Sure, people are unlikely to stand up a database that can do 10,000 transactions per second and hit it with a workload that requires doing 20,000 transactions per second, because they're going to find out in testing that it doesn't work. Then they will buy more hardware, rearchitect the system to reduce the required number of transactions per second, or give up on using PostgreSQL. So when they do put it into production, it's unlikely to be overloaded on day one. But that's just because all of the systems that would have been overloaded on day one never make it to day one. They get killed off or changed before they get there. So it isn't as if higher maximum throughput wouldn't have been beneficial. And over time, systems that didn't start out maxed out can and, in my experience, fairly often do end up maxed out, because once you've got the thing in production, it's hard to change anything, and load often does grow over time.

As for non-uniform access, that is real and does matter, but there are certainly installations with tables where no rows survive long enough to need freezing, either because the table is regularly emptied, or just because the update load is high enough to hit all the rows fairly quickly.

Maybe I'm misunderstanding your point here, in which case all of the above may be irrelevant. But my feeling is that we can't simply ignore cases where all/many rows are short-lived and say, well, those are unrealistic, so let's just freeze everything super-aggressively and that should be fine. I don't think that's OK. We can (and I think we should) treat that situation as a special case rather than as the typical case, but I think it would be a bad idea to dismiss it as a case that we don't need to worry about at all.

--
Robert Haas
EDB: http://www.enterprisedb.com
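
P.S. To make "rate-limited and non-uniform" concrete: the sketch below is just the documented tpcb-like script with the account id drawn from a Zipfian distribution instead of a uniform one, throttled with pgbench's --rate option. The file name, database name, client counts, duration, target rate, and the Zipfian parameter (1.05) are all invented for illustration, not a recommendation.

    -- skewed_tpcb.sql: the standard tpcb-like transaction, but aid is
    -- drawn from a Zipfian distribution so a few hot accounts dominate
    \set aid random_zipfian(1, 100000 * :scale, 1.05)
    \set bid random(1, 1 * :scale)
    \set tid random(1, 10 * :scale)
    \set delta random(-5000, 5000)
    BEGIN;
    UPDATE pgbench_accounts SET abalance = abalance + :delta WHERE aid = :aid;
    SELECT abalance FROM pgbench_accounts WHERE aid = :aid;
    UPDATE pgbench_tellers SET tbalance = tbalance + :delta WHERE tid = :tid;
    UPDATE pgbench_branches SET bbalance = bbalance + :delta WHERE bid = :bid;
    INSERT INTO pgbench_history (tid, bid, aid, delta, mtime)
      VALUES (:tid, :bid, :aid, :delta, CURRENT_TIMESTAMP);
    END;

    # cap the workload at roughly 5000 TPS instead of running flat out
    $ pgbench -c 16 -j 4 -T 900 -R 5000 -f skewed_tpcb.sql bench

With -R the clients sleep as needed to hit the target rate rather than saturating the server, and random_zipfian skews accesses toward a small set of hot account ids at the low end of the range, which is roughly the shape of workload being discussed above.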