On Tue, Aug 4, 2020 at 3:54 AM Konstantin Knizhnik <k.knizh...@postgrespro.ru> wrote:
> So in this thread three solutions are proposed:
> 1. "bullet-proof general shared invalidation"
> 2. recovery-only solution avoiding shared memory and invalidation
> 3. "relaxed" shared memory cache with simplified invalidation
Hi Konstantin,

By the way, point 2 is now committed (c5315f4f).

As for 1 vs 3, I wasn't proposing two different invalidation techniques: in both approaches, I'm calling the cached values "relaxed", meaning that their freshness is controlled by memory barriers elsewhere that the caller has to worry about. I'm just suggesting for idea 3 that it might be a good idea to use relaxed values only in a couple of hot code paths where we do the analysis required to convince ourselves that memory barriers are already in the right places to make it safe. By "bullet-proof" I meant that we could in theory convince ourselves that *all* users of smgrnblocks() can safely use relaxed values, but that's hard. That said, the sketch patch I posted certainly needs more work, and maybe someone has a better idea on how to do it.

> If solving such very important but still specific problem of caching
> relation size requires so much efforts,
> then may be it is time to go further in the direction towards shared
> catalog?

I wouldn't say it requires too much effort, at least for the conservative approach (3). But I also agree with what you're saying, in the long term:

> This shared relation cache can easily store relation size as well.
> In addition it will solve a lot of other problems:
> - noticeable overhead of local relcache warming
> - large memory consumption in case of larger number of relations
>   O(max_connections*n_relations)
> - sophisticated invalidation protocol and related performance issues
> Certainly access to shared cache requires extra synchronization. But DDL
> operations are relatively rare.
> So in most cases we will have only shared locks. May be overhead of
> locking will not be too large?

Yeah, I would be very happy if we got high performance shared sys/rel/plan/...
caches in the future, and separately, having the relation size available in shmem is something that has come up in discussions about other topics too (tree-based buffer mapping, multi-relation data files, ...). I agree with you that our cache memory usage is a big problem, and it will be great to fix that one day. I don't think that should stop us from making small improvements to the existing design in the meantime, though. "The perfect is the enemy of the good." Look at all this trivial stuff:

https://wiki.postgresql.org/wiki/Syscall_Reduction

I don't have high quality data to report yet, but from simple tests I'm seeing orders of magnitude fewer syscalls per pgbench transaction in recovery, when comparing 11 to 14devel, due to various changes. Admittedly, the size probes in regular backends aren't quite as bad as the recovery ones were, because they're already limited by the rate at which you can throw request/response cycles at the DB, but there are still some cases, like planning with high partition counts and slow filesystems, that can benefit measurably from caching.