[ https://issues.apache.org/jira/browse/SOLR-15859?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17650990#comment-17650990 ]
Michael Gibney commented on SOLR-15859:
---------------------------------------

I think it might be possible (and preferable?) to implement this as a custom {{SolrCache<K, V>}} implementation that wraps {{solr.CaffeineCache<K, MetadataWrapper<V>>}}. I think [~ben.manes] was alluding to something like this "MetadataWrapper" approach in his [comment above|#comment-17633401]. I've actually done something similar, and it can work quite well. It can be a bit tricky, but I think the "per-entry stats" part would be pretty straightforward done this way, and I really like the idea of implementing this functionality without modifying the hot path of what's currently the default/only cache implementation bundled with Solr. I think the only necessary modification to the existing {{solr.CaffeineCache}} class would be to provide a hook to actually dump the values, e.g., by adding them to a caller-provided map (so as not to expose the internals).

I do think the functionality you're pursuing with this could be useful. One benefit of implementing it as I'm suggesting above is that the functionality would be almost entirely pluggable (as in, plugins), aside from some interface for actually dumping a snapshot of the contents of the cache, which I suspect would indeed need a public method added to {{solr.CaffeineCache}}.

I would definitely recommend avoiding top-level {{synchronized (cache)}}, and I don't think that would be necessary if pursuing the "wrapping" approach.

Maybe start with a more tightly-scoped change that ignores for now the request handler and stats tracking, and instead focuses on figuring out a clean (if perhaps experimental?) method/interface for dumping the contents of {{solr.CaffeineCache}}? I suspect that would be easier to merge with confidence, and would open the door to iterating on different ways of achieving some of the more nuanced functionality.
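
To make the "wrapping" idea concrete, here is a rough, self-contained sketch of roughly what I have in mind. It is not Solr code: a plain {{ConcurrentHashMap}} stands in for the wrapped {{solr.CaffeineCache<K, MetadataWrapper<V>>}}, and the names {{StatsTrackingCache}}, {{EntryStats}}, and {{dumpInto}} are hypothetical, as are the fields of {{MetadataWrapper}}. The point is only that per-entry stats can live in the wrapper, and that a dump can copy into a caller-provided map without a top-level {{synchronized (cache)}}.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.LongAdder;

/** Hypothetical per-entry wrapper: the cached value plus its statistics. */
final class MetadataWrapper<V> {
    final V value;
    final long insertedAtMillis;
    final LongAdder hits = new LongAdder(); // cheap to update under contention
    volatile long lastHitMillis = -1L;      // -1 means "never hit"

    MetadataWrapper(V value, long insertedAtMillis) {
        this.value = value;
        this.insertedAtMillis = insertedAtMillis;
    }
}

/** Hypothetical immutable snapshot of one entry's statistics. */
final class EntryStats {
    final long insertedAtMillis;
    final long lastHitMillis;
    final long numHits;

    EntryStats(long insertedAtMillis, long lastHitMillis, long numHits) {
        this.insertedAtMillis = insertedAtMillis;
        this.lastHitMillis = lastHitMillis;
        this.numHits = numHits;
    }
}

/** Hypothetical wrapping cache; the map is an assumed stand-in for the real delegate. */
final class StatsTrackingCache<K, V> {
    private final ConcurrentHashMap<K, MetadataWrapper<V>> delegate = new ConcurrentHashMap<>();

    /** Stores a fresh wrapper; returns the previous value, or null if there was none. */
    V put(K key, V value) {
        MetadataWrapper<V> old =
            delegate.put(key, new MetadataWrapper<>(value, System.currentTimeMillis()));
        return old == null ? null : old.value;
    }

    /** Looks up a value and records the hit on the wrapper, not on the cache itself. */
    V get(K key) {
        MetadataWrapper<V> w = delegate.get(key);
        if (w == null) {
            return null;
        }
        w.hits.increment();
        w.lastHitMillis = System.currentTimeMillis();
        return w.value;
    }

    /**
     * Copies per-entry stats into a caller-provided map. Iteration over a
     * ConcurrentHashMap is weakly consistent, so no cache-wide lock is taken;
     * the result is a best-effort snapshot, not an atomic one.
     */
    void dumpInto(Map<K, EntryStats> target) {
        for (Map.Entry<K, MetadataWrapper<V>> e : delegate.entrySet()) {
            MetadataWrapper<V> w = e.getValue();
            target.put(e.getKey(),
                new EntryStats(w.insertedAtMillis, w.lastHitMillis, w.hits.sum()));
        }
    }
}

final class StatsTrackingCacheDemo {
    public static void main(String[] args) {
        StatsTrackingCache<String, String> cache = new StatsTrackingCache<>();
        cache.put("language:eng", "docset-1");
        cache.get("language:eng");
        cache.get("language:eng");

        Map<String, EntryStats> dump = new HashMap<>();
        cache.dumpInto(dump);
        dump.forEach((k, s) -> System.out.println(k + " numHits=" + s.numHits));
    }
}
```

In a real implementation the delegate would be the existing cache, and the equivalent of {{dumpInto}} is the one piece that I suspect needs a new public method on {{solr.CaffeineCache}}. Eviction tracking and time-windowed hit counts are deliberately left out of this sketch.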

> Add handler to dump filter cache
> --------------------------------
>
>                 Key: SOLR-15859
>                 URL: https://issues.apache.org/jira/browse/SOLR-15859
>             Project: Solr
>          Issue Type: Improvement
>            Reporter: Andy Lester
>            Assignee: Shawn Heisey
>            Priority: Major
>              Labels: FQ, cache, filtercache, metrics
>         Attachments: cacheinfo-1.patch, cacheinfo-2.patch, cacheinfo.patch, fix_92_startup.patch
>
>
> It would be very helpful to be able to inspect the contents of the filterCache.
> I'd like to be able to query something like {{/admin/caches?type=filter&nentries=1000&sort=numHits+DESC}}
> nentries would be allowed to be -1 to get everything.
> It would be nice to see these data items for each entry. I don't know which are available, but I'm thinking blue sky here:
> * cache key, exactly as stored
> * Timestamp when the entry was inserted
> * Whether the insertion of the entry evicted another entry, and if so which one
> * Timestamp of when this entry was last hit
> * Number of hits on this entry forever
> * Number of hits on this entry over some time period
> * Number of documents matched by the filter
> * Number of bytes of memory used by the filter
> These are the sorts of questions I'd like to be able answer:
> * "I just did a query that I expect will have added a cache entry. Did it?"
> * "Are my queries hitting existing cache entries?"
> * "How big should I set my filterCache size? Should I limit it by number of entries or RAM usage?"
> * "Which of my FQs are getting used the most? These are the ones I want in my firstSearcher queries." (I currently determine this by processing my old solr logs)
> * "Which filters give me the most bang for the buck in terms of RAM usage?"
> * "I have filter X and filter Y, but would it be beneficial if I made a filter X AND Y?"
> * "Which FQs are used more at certain times of the day? (Assuming I take regular snapshots throughout the day)"
> I imagine a response might look like:
> {{{}}
> {{ "responseHeader": {}}
> {{ "status": 0,}}
> {{ "QTime": 961}}
> {{ },}}
> {{ "response": {}}
> {{ "numFound": 12104,}}
> {{ "filterCacheKeys": {}}
> {{ [}}
> {{ "language:eng": {}}
> {{ "inserted": "2021-12-04T07:34:16Z",}}
> {{ "lastHit": "2021-12-04T18:17:43Z",}}
> {{ "numHits": 15065,}}
> {{ "numHitsInPastHour": 2319,}}
> {{ "evictedKey": "agelevel:4 shippable:Y",}}
> {{ "numRecordsMatchedByFilter": 24328753,}}
> {{ "bytesUsed": 3041094}}
> {{ }}}
> {{ ],}}
> {{ [}}
> {{ "is_set:N": {}}
> {{ ...}}
> {{ }}}
> {{ ],}}
> {{ [}}
> {{ "language:spa": {}}
> {{ ...}}
> {{ }}}
> {{ ]}}
> {{ }}}
> {{}}}

--
This message was sent by Atlassian Jira
(v8.20.10#820010)