On 2/5/19 11:05 PM, Alvaro Herrera wrote:
> On 2019-Feb-05, Tomas Vondra wrote:
>
>> I don't think we need to remove the expired entries right away, if there
>> are only very few of them. The cleanup requires walking the hash table,
>> which means significant fixed cost. So if there are only a few expired
>> entries (say, less than 25% of the cache), we can just leave them around
>> and clean them if we happen to stumble on them (although that may not be
>> possible with dynahash, which has no concept of expiration) or before
>> enlarging the hash table.
>
> I think seqscanning the hash table is going to be too slow; Ideriha-san's
> idea of having a dlist with the entries in LRU order (where each entry
> is moved to head of list when it is touched) seemed good: it allows you
> to evict older ones when the time comes, without having to scan the rest
> of the entries. Having a dlist means two more pointers on each cache
> entry AFAIR, so it's not a huge amount of memory.
>
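For illustration, here is a minimal sketch of that dlist/LRU idea: each cache entry carries two extra pointers, a cache hit moves the entry to the list head, and eviction pops the tail in O(1) without scanning the hash table. The names are hypothetical; the real catcache would presumably use the existing dlist machinery rather than this hand-rolled list.

```c
#include <assert.h>
#include <stddef.h>

typedef struct CacheEntry
{
    int key;                    /* stands in for the real cache key */
    struct CacheEntry *prev;    /* LRU list: towards head (more recent) */
    struct CacheEntry *next;    /* LRU list: towards tail (less recent) */
} CacheEntry;

typedef struct
{
    CacheEntry *head;           /* most recently used */
    CacheEntry *tail;           /* least recently used */
} LRUList;

/* unlink an entry currently in the list, fixing up head/tail */
static void
lru_unlink(LRUList *lru, CacheEntry *e)
{
    if (e->prev)
        e->prev->next = e->next;
    else
        lru->head = e->next;
    if (e->next)
        e->next->prev = e->prev;
    else
        lru->tail = e->prev;
    e->prev = e->next = NULL;
}

/* insert a detached entry at the head (most recently used position) */
static void
lru_push_head(LRUList *lru, CacheEntry *e)
{
    e->prev = NULL;
    e->next = lru->head;
    if (lru->head)
        lru->head->prev = e;
    lru->head = e;
    if (lru->tail == NULL)
        lru->tail = e;
}

/* move an entry already in the list to the head; called on cache hit */
static void
lru_touch(LRUList *lru, CacheEntry *e)
{
    if (lru->head == e)
        return;
    lru_unlink(lru, e);
    lru_push_head(lru, e);
}

/* evict the least recently used entry in O(1); NULL if list is empty */
static CacheEntry *
lru_evict(LRUList *lru)
{
    CacheEntry *victim = lru->tail;

    if (victim)
        lru_unlink(lru, victim);
    return victim;
}
```

The two pointers per entry are the memory cost Alvaro mentions; what they buy is eviction of the oldest N entries in O(N), independent of the hash table size, instead of a full seqscan.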
Possibly, although my guess is it will depend on the number of entries to
remove. For a small number of entries, the dlist approach is going to be
faster, but at some point the bulk seqscan gets more efficient. FWIW this
is exactly where a bit of benchmarking would help.

>> So if we want to address this case too (and we probably want to), we may
>> need to discard the old cache memory context somehow (e.g. rebuild the
>> cache in a new one, and copy the non-expired entries). Which is a nice
>> opportunity to do the "full" cleanup, of course.
>
> Yeah, we probably don't want to do this super frequently though.
>

Right.

I've also realized the resizing is built into dynahash and is kinda
incremental - we add (and split) buckets one by one, instead of
immediately rebuilding the whole hash table. So yes, this would need more
care and might need to interact with dynahash in some way.

regards

--
Tomas Vondra                  http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services