On Fri, Sep 29, 2023 at 8:50 PM Peter Geoghegan <p...@bowt.ie> wrote:
> While pgbench makes a fine stress-test, for the most part its workload
> is highly unrealistic. And yet we seem to think that it's just about
> the most important benchmark of all. If we're not willing to get over
> even small regressions in pgbench, I fear we'll never make significant
> progress in this area. It's at least partly a cultural problem IMV.
I think it's true that the fact that pgbench does what pgbench does makes us think more about that workload than about some other, equally plausible workload. It's the test we have, so we end up running it a lot. If we had some other test, we'd run that one. But I don't think I buy that it's "highly unrealistic." It is true that pgbench with default options is limited only by the speed of the database, while real update-heavy workloads are typically limited by something external, like the number of users hitting the web site that the database is backing. It's also true that real workloads tend to involve some level of non-uniform access. (Both of those conditions are easy enough to approximate with pgbench itself; there's a rough sketch at the end of this message.) But I'm not sure that either of those things really matters that much in the end.

The problem I have with the external rate-limiting argument is that it ignores hardware selection, architectural decision-making, and workload growth. Sure, people are unlikely to stand up a database that can do 10,000 transactions per second and hit it with a workload that requires doing 20,000 transactions per second, because they're going to find out in testing that it doesn't work. Then they will buy more hardware, rearchitect the system to reduce the required number of transactions per second, or give up on using PostgreSQL. So when they do put it into production, it's unlikely to be overloaded on day one. But that's just because all of the systems that would have been overloaded on day one never make it to day one. They get killed off or changed before they get there. So it isn't as if higher maximum throughput wouldn't have been beneficial. And over time, systems that didn't start out maxed out can and, in my experience, fairly often do end up maxed out, because once you've got the thing in production, it's hard to change anything, and load often does grow over time.

As for non-uniform access, that is real and does matter, but there are certainly installations with tables where no rows survive long enough to need freezing, either because the table is regularly emptied, or just because the update load is high enough to hit all the rows fairly quickly.

Maybe I'm misunderstanding your point here, in which case all of the above may be irrelevant. But my feeling is that we can't simply ignore cases where all/many rows are short-lived and say, well, those are unrealistic, so let's just freeze everything super-aggressively and that should be fine. I don't think that's OK. We can (and I think we should) treat that situation as a special case rather than as the typical case, but I think it would be a bad idea to dismiss it as a case that we don't need to worry about at all.

--
Robert Haas
EDB: http://www.enterprisedb.com
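
P.S. To make "rate-limited and non-uniform" concrete: the sketch below is just the documented tpcb-like script with the account id drawn from a Zipfian distribution instead of a uniform one, throttled with pgbench's --rate option. The file name, database name, client counts, duration, target rate, and the Zipfian parameter (1.05) are all invented for illustration, not a recommendation.

    -- skewed_tpcb.sql: the standard tpcb-like transaction, but aid is
    -- drawn from a Zipfian distribution so a few hot accounts dominate
    \set aid random_zipfian(1, 100000 * :scale, 1.05)
    \set bid random(1, 1 * :scale)
    \set tid random(1, 10 * :scale)
    \set delta random(-5000, 5000)
    BEGIN;
    UPDATE pgbench_accounts SET abalance = abalance + :delta WHERE aid = :aid;
    SELECT abalance FROM pgbench_accounts WHERE aid = :aid;
    UPDATE pgbench_tellers SET tbalance = tbalance + :delta WHERE tid = :tid;
    UPDATE pgbench_branches SET bbalance = bbalance + :delta WHERE bid = :bid;
    INSERT INTO pgbench_history (tid, bid, aid, delta, mtime)
      VALUES (:tid, :bid, :aid, :delta, CURRENT_TIMESTAMP);
    END;

    # cap the workload at roughly 5000 TPS instead of running flat out
    $ pgbench -c 16 -j 4 -T 900 -R 5000 -f skewed_tpcb.sql bench

With -R the clients sleep as needed to hit the target rate rather than saturating the server, and random_zipfian skews accesses toward a small set of hot account ids at the low end of the range, which is roughly the shape of workload being discussed above.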