Hi,

On 2019-01-15 13:32:36 -0500, Tom Lane wrote:
> Well, we *had* an LRU mechanism for the catcaches way back when.  We got
> rid of it --- see commit 8b9bc234a --- because (a) maintaining the LRU
> info was expensive and (b) performance fell off a cliff in scenarios where
> the cache size limit was exceeded.  You could probably find some more info
> about that by scanning the mail list archives from around the time of that
> commit, but I'm too lazy to do so right now.
>
> That was a dozen years ago, and it's possible that machine performance
> has moved so much since then that the problems are gone or mitigated.
> In particular I'm sure that any limit we would want to impose today will
> be far more than the 5000-entries-across-all-caches limit that was in use
> back then.  But I'm not convinced that a workload that would create 100K
> cache entries in the first place wouldn't have severe problems if you
> tried to constrain it to use only 80K entries.
I think that'd be true if the accesses were truly randomly distributed -
but that's not what I see in the workloads with huge caches.  It's
usually workloads that have tons of functions, partitions, ... and a lot
of them are not that frequently accessed, but because we have no cache
purging mechanism they stay around for a long time.  This is often
exacerbated by using a pooler to keep connections around for longer
(which you have to, to cope with other limits of PG).

> As far as the relcache goes, we've never had a limit on that, but there
> are enough routine causes of relcache flushes --- autovacuum for instance
> --- that I'm not really convinced relcache bloat can be a big problem in
> production.

It definitely is.

> The plancache has never had a limit either, which is a design choice that
> was strongly influenced by our experience with catcaches.

This sounds a lot like having learned lessons from one bad
implementation and applying them far outside of that situation.

Greetings,

Andres Freund
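[Editor's sketch, not part of the original thread.] The LRU mechanism under discussion can be sketched in a few lines of C: a doubly linked list where every lookup moves the touched entry to the front, and a full cache evicts from the tail. This is a toy fixed-capacity cache of integer keys, not the old catcache code; it only illustrates the general technique, and in particular the per-access list maintenance that Tom's point (a) identifies as the overhead.

```c
#include <stddef.h>

/* Toy LRU cache: CAP integer keys in a doubly linked list.
 * A hit moves the entry to the front; a miss on a full cache
 * evicts the tail (least recently used) entry. */
#define CAP 3

typedef struct Entry
{
    int key;
    struct Entry *prev, *next;
} Entry;

static Entry pool[CAP];
static int nused = 0;
static Entry *head = NULL, *tail = NULL;

static void
unlink_entry(Entry *e)
{
    if (e->prev) e->prev->next = e->next; else head = e->next;
    if (e->next) e->next->prev = e->prev; else tail = e->prev;
}

static void
push_front(Entry *e)
{
    e->prev = NULL;
    e->next = head;
    if (head) head->prev = e;
    head = e;
    if (!tail) tail = e;
}

/* Returns 1 on a cache hit, 0 on a miss (inserting the key,
 * evicting the LRU entry if the cache is full). */
int
lru_access(int key)
{
    Entry *e;

    for (e = head; e != NULL; e = e->next)
    {
        if (e->key == key)
        {
            unlink_entry(e);    /* hit: move to front */
            push_front(e);
            return 1;
        }
    }

    if (nused < CAP)
        e = &pool[nused++];     /* still room: take a fresh slot */
    else
    {
        e = tail;               /* full: recycle the LRU entry */
        unlink_entry(e);
    }
    e->key = key;
    push_front(e);
    return 0;
}
```

Note that even a hit pays for the unlink/relink bookkeeping, which is exactly the per-access cost that motivated ripping the mechanism out in 8b9bc234a; on the skewed access patterns described above, though, this is what lets rarely used entries age out instead of accumulating forever.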