Hi,

On 2021-05-14 16:53:16 -0400, Tom Lane wrote:
> Andres Freund <and...@anarazel.de> writes:
> > In essence, debug_invalidate_system_caches_always=1 in some important aspects
> > behaves like debug_invalidate_system_caches_always=3, due to the syscache
> > involvement.
>
> Yeah.  I think it's important to test those recursive invalidation
> scenarios, but it could likely be done more selectively.

Agreed. I wonder if the logic could be something like indicating that we don't
invalidate due to pg_class/attribute/am/... (a set of super common system
catalogs) being opened, iff that open is at the "top level". So we'd e.g. not
trigger invalidation for a syscache miss scanning pg_class, unless the miss
happens during a relcache build. But we would continue to trigger
invalidations without further checks if e.g. pg_subscription is opened.

> > What about having a mode where each "nesting" level of SearchCatCacheMiss
> > allows only one interior InvalidateSystemCaches()?
>
> An idea I'd been toying with was to make invals probabilistic, that is
> there would be X% chance of an inval being forced at any particular
> opportunity.  Then you could dial X up or down to make a tradeoff
> between speed and the extent of coverage you get from a single run.
> (Over time, you could expect pretty complete coverage even with X
> not very close to 1, I think.)
>
> This could be extended to what you're thinking about by reducing X
> (according to some rule or other) for each level of cache-flush
> recursion.  The argument to justify that is that recursive cache
> flushes are VERY repetitive, so that even a small probability will
> add up to full coverage of those code paths fairly quickly.

That'd make sense, I've been wondering about something similar. But I'm a bit
worried about that making it harder to reproduce problems reliably?

> I've not worked out the math to justify any specific proposal
> along this line, though.

FWIW, I've prototyped the idea of only invalidating once for each syscache
level, and it does reduce runtime of

CREATE TABLE blarg_{0,1,2,3}(id serial primary key);
SET debug_invalidate_system_caches_always = 1;
SELECT * FROM blarg_0 join blarg_1 USING (id) join blarg_2 using (id) JOIN blarg_3 USING(id);
RESET ALL;

from 7.5s to 4.7s. The benefits are smaller when fewer tables are accessed,
and larger if more (surprising, right :)).

Greetings,

Andres Freund
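
For illustration only, here is a rough standalone sketch of the probabilistic
variant quoted above: a chance X of forcing an inval at each opportunity, with
X reduced for each level of cache-flush recursion. Nothing here is PostgreSQL
code. The struct, the function names, the geometric "decay" rule and the
choice of PRNG are all assumptions made for the example; the thread leaves the
actual rule open ("according to some rule or other"). Seeding the generator
explicitly is one possible way to keep a run repeatable, which may help with
the reproducibility concern.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Hypothetical knobs for the sketch; these are NOT PostgreSQL GUCs or APIs.
 */
typedef struct InvalSim
{
    uint64_t rng_state;  /* PRNG state; must be nonzero. Same seed, same run. */
    double   base_prob;  /* X as a fraction: chance of an inval at depth 0 */
    double   decay;      /* assumed rule: multiply X by this per nesting level */
} InvalSim;

/* xorshift64* step, mapped to a uniform double in [0, 1). */
static double
inval_sim_uniform(InvalSim *sim)
{
    uint64_t x = sim->rng_state;

    x ^= x >> 12;
    x ^= x << 25;
    x ^= x >> 27;
    sim->rng_state = x;

    /* Keep the top 53 bits so the result is exactly representable. */
    return (double) ((x * 2685821657736338717ULL) >> 11) / 9007199254740992.0;
}

/*
 * Decide whether to force an invalidation at this opportunity.
 * "depth" is the current cache-flush recursion level (0 = top level).
 */
static bool
inval_sim_should_invalidate(InvalSim *sim, int depth)
{
    double p = sim->base_prob;

    for (int i = 0; i < depth; i++)
        p *= sim->decay;

    return inval_sim_uniform(sim) < p;
}
```

With base_prob = 1.0 and decay = 1.0 this forces an inval at every
opportunity; lowering either value trades coverage per run for speed. Two
runs started from the same seed make the same sequence of decisions, provided
the opportunities themselves occur in the same order.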