On Fri, Jan 18, 2019 at 05:09:41PM -0800, Andres Freund wrote:
> Hi,
>
> On 2019-01-18 19:57:03 -0500, Robert Haas wrote:
> > On Fri, Jan 18, 2019 at 4:23 PM and...@anarazel.de <and...@anarazel.de>
> > wrote:
> > > My proposal for this was to attach a 'generation' to cache entries. Upon
> > > access cache entries are marked to be of the current
> > > generation. Whenever existing memory isn't sufficient for further cache
> > > entries and, on a less frequent schedule, triggered by a timer, the
> > > cache generation is increased and th new generation's "creation time" is
> > > measured. Then generations that are older than a certain threshold are
> > > purged, and if there are any, the entries of the purged generation are
> > > removed from the caches using a sequential scan through the cache.
> > >
> > > This outline achieves:
> > > - no additional time measurements in hot code paths
> > > - no need for a sequential scan of the entire cache when no generations
> > >   are too old
> > > - both size and time limits can be implemented reasonably cheaply
> > > - overhead when feature disabled should be close to zero
> >
> > Seems generally reasonable. The "whenever existing memory isn't
> > sufficient for further cache entries" part I'm not sure about.
> > Couldn't that trigger very frequently and prevent necessary cache size
> > growth?
>
> I'm thinking it'd just trigger a new generation, with it's associated
> "creation" time (which is cheap to acquire in comparison to creating a
> number of cache entries) . Depending on settings or just code policy we
> can decide up to which generation to prune the cache, using that
> creation time. I'd imagine that we'd have some default cache-pruning
> time in the minutes, and for workloads where relevant one can make
> sizing configurations more aggressive - or something like that.

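To make sure I am reading the generation scheme quoted above correctly, here is a toy Python model of it. This is only a sketch, not the proposed backend code: the class name, method names, and `max_age` parameter are all invented for illustration. It also bakes in one assumption of mine, that a generation's age is counted from the moment its successor was created, so that no entry is purged before it has been idle for the full threshold.

```python
import time


class GenerationCache:
    """Toy model of a generation-stamped cache. All names are hypothetical."""

    def __init__(self, max_age, clock=time.monotonic):
        self.max_age = max_age        # purge threshold, in clock units
        self.clock = clock            # injectable so the model can be tested
        self.current_gen = 0
        # generation -> time at which its successor was created
        # (assumption: this is the "creation time" used for aging)
        self.superseded_at = {}
        self.entries = {}             # key -> [value, generation of last access]

    def put(self, key, value):
        self.entries[key] = [value, self.current_gen]

    def get(self, key):
        entry = self.entries.get(key)
        if entry is None:
            return None
        # Hot path: only an integer store, no time measurement.
        entry[1] = self.current_gen
        return entry[0]

    def advance_generation(self):
        """Run from a timer or when memory is tight; returns entries removed."""
        now = self.clock()            # the only clock read in the scheme
        self.superseded_at[self.current_gen] = now
        self.current_gen += 1

        expired = {gen for gen, stamp in self.superseded_at.items()
                   if now - stamp >= self.max_age}
        if not expired:
            return 0                  # no generation is too old: skip the scan

        # Sequential scan, done only when something can actually be purged.
        before = len(self.entries)
        self.entries = {key: entry for key, entry in self.entries.items()
                        if entry[1] not in expired}
        for gen in expired:
            del self.superseded_at[gen]
        return before - len(self.entries)
```

In this model a memory-pressure trigger would just be an extra call to `advance_generation()`; whether that call purges anything still depends only on the age threshold.
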
OK, so it seems everyone likes the idea of a timer. The open questions
are whether we want multiple epochs, and whether we want some kind of
size trigger.

With only one time epoch, if the timer is 10 minutes, you could expire an
entry after 10-19 minutes, while with a new epoch every minute and a
10-minute expiration, you get 10-11 minute precision. I am not sure the
complexity is worth it.

For a size trigger, should removal be affected by how many expired cache
entries there are? Whether there were 10k expired entries or 50, wouldn't
you want them removed if they have not been accessed in X minutes?

In the worst case, if 10k entries were accessed in a query and never
accessed again, what would the ideal cleanup behavior be? Would it
matter if they were expired in 10 or 19 minutes? Would it matter if there
were only 50 entries?

--
  Bruce Momjian  <br...@momjian.us>        http://momjian.us
  EnterpriseDB                             http://enterprisedb.com

+ As you are, so once was I.  As I am, so you will be. +
+                      Ancient Roman grave inscription +
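
P.S. For anyone who wants to check the epoch arithmetic above, here is a small Python sketch. The model is my own simplification, not anything specified in this thread: sweeps run at multiples of the epoch length, and an entry is evicted by the first sweep at which the epoch it was last touched in has been over for at least the expiration threshold. The function names are invented.

```python
import math


def eviction_time(last_access, epoch, threshold):
    """Time of the first sweep that evicts an entry last touched at last_access.

    Assumed model: sweeps happen at multiples of `epoch`; the entry goes once
    the epoch containing its last access ended at least `threshold` ago.
    """
    epoch_end = (math.floor(last_access / epoch) + 1) * epoch
    # First multiple of `epoch` that is >= epoch_end + threshold.
    return math.ceil((epoch_end + threshold) / epoch) * epoch


def idle_range(epoch, threshold, step=0.25, horizon=60.0):
    """Smallest and largest idle time at eviction over sampled access times."""
    idles = []
    t = 0.0
    while t < horizon:
        idles.append(eviction_time(t, epoch, threshold) - t)
        t += step
    return min(idles), max(idles)


# One 10-minute epoch versus one-minute epochs, both with a 10-minute threshold.
print(idle_range(epoch=10, threshold=10))
print(idle_range(epoch=1, threshold=10))
```

Under these assumptions the idle time at eviction runs from just over the threshold up to the threshold plus one epoch length, which is roughly the 10-19 and 10-11 minute figures above, give or take boundary conventions.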