sollhui opened a new pull request, #61273:
URL: https://github.com/apache/doris/pull/61273
### Problem
In the LRU-K implementation, when LRUCache::erase() is called, it removes
the key from the main cache hash table but does not remove it from the visits
list (_visits_lru_cache_map / _visits_lru_cache_list):

    void LRUCache::erase(const CacheKey& key, uint32_t hash) {
        ...
        e = _table.remove(key, hash); // removed from main cache
        ...
        // visits list is NOT cleaned up ← missing
    }

This matters when a segment is accessed exactly once (so it enters the visits
list but is not yet promoted to the main cache) and is then erased before its
second access. The typical scenario is compaction: when old rowsets are merged,
SegmentLoader::erase_segments() is called for all segments of the old rowset.
Timeline:

1. Segment S is accessed once → enters the visits list, not in the main cache
2. Compaction merges the rowset containing S
3. erase(S) is called → S is removed from the main cache (a no-op, it wasn't there);
   S's visits list entry remains ← stale
4. The visits list entry for S occupies _visits_lru_cache_usage indefinitely,
   until it is evicted by LRU pressure from newer entries

The visits list capacity is bounded by _capacity (same as main cache, ~1.47
GB for SegmentCache). Stale entries accumulate and reduce the effective
tracking window for legitimate segments waiting to be promoted, slightly
increasing miss rate under compaction-heavy workloads.
### Fix
In LRUCache::erase(), after removing the entry from the main cache, also
check the visits list and remove the entry if present:

    if (_is_lru_k) {
        auto it = _visits_lru_cache_map.find(key.to_string());
        if (it != _visits_lru_cache_map.end()) {
            _visits_lru_cache_usage -= it->second->second;
            _visits_lru_cache_list.erase(it->second);
            _visits_lru_cache_map.erase(it);
        }
    }
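
For reviewers who want to see the effect in isolation, here is a minimal, self-contained sketch of the stale-entry problem and the cleanup. It is a toy model, not the Doris code: `ToyLruK`, its members, and the `clean_visits_on_erase` switch are hypothetical names introduced for illustration, K is assumed to be 2, and eviction under capacity pressure is left out. The only detail carried over from the patch is the layout implied by `it->second->second`: a map from key to a list iterator, where each list node is a (key, size) pair.

```cpp
#include <cstddef>
#include <list>
#include <string>
#include <unordered_map>
#include <unordered_set>
#include <utility>

// Toy LRU-K (K = 2) model: hypothetical, for illustration only.
class ToyLruK {
public:
    // clean_visits_on_erase = false models the old behavior,
    // true models the behavior proposed in this PR.
    explicit ToyLruK(bool clean_visits_on_erase)
            : _clean_visits_on_erase(clean_visits_on_erase) {}

    // First access records the key in the visits list;
    // second access promotes it to the main cache.
    void access(const std::string& key, size_t charge) {
        if (_main.count(key) > 0) {
            return; // already in the main cache
        }
        auto it = _visits_map.find(key);
        if (it == _visits_map.end()) {
            _visits_list.emplace_front(key, charge);
            _visits_map[key] = _visits_list.begin();
            _visits_usage += charge;
            return;
        }
        // Second access: drop the visits entry and promote.
        _visits_usage -= it->second->second;
        _visits_list.erase(it->second);
        _visits_map.erase(it);
        _main.insert(key);
    }

    void erase(const std::string& key) {
        _main.erase(key); // no-op if the key was never promoted
        if (!_clean_visits_on_erase) {
            return; // old behavior: visits entry is left behind
        }
        auto it = _visits_map.find(key);
        if (it != _visits_map.end()) {
            _visits_usage -= it->second->second;
            _visits_list.erase(it->second);
            _visits_map.erase(it);
        }
    }

    size_t visits_usage() const { return _visits_usage; }
    bool tracked(const std::string& key) const { return _visits_map.count(key) > 0; }

private:
    using VisitsList = std::list<std::pair<std::string, size_t>>;

    bool _clean_visits_on_erase;
    std::unordered_set<std::string> _main;
    VisitsList _visits_list;
    std::unordered_map<std::string, VisitsList::iterator> _visits_map;
    size_t _visits_usage = 0;
};
```

With `ToyLruK cache(false)`, calling `cache.access("S", 100)` followed by `cache.erase("S")` leaves `visits_usage()` at 100 and `tracked("S")` true, which mirrors step 4 of the timeline. Constructed with `true`, the same sequence brings the usage back to 0.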