kotman12 commented on PR #2382: URL: https://github.com/apache/solr/pull/2382#issuecomment-2078142747
> So I was trying to learn how the main configuration bits fit together here and high-level the reverse search idea and my _solr-monitor-naive-dinner-demo_ branch (or #2421 diff) off this pull request's branch is a side effect of that and my understanding so far based on it is that:
>
> * the in-memory state is in the `Presearcher` object in the `ReverseQueryParserPlugin` class object (and in the _solr-monitor-naive-dinner-demo_ i just used a simple `Monitor` object instead of the `Presearcher` object)
> * the state is updated via the `MonitorUpdateRequestProcessor` i.e. saved searches are added as `MonitorQuery` objects to the `Monitor` object (and updating of the `Presearcher` object is a bit different)
> * the state is accessed via the `ReverseSearchComponent` component (currently non-distributed but conceptually distributed would work too?)
>
> Is that basic understanding correct? As a next step I might go learn more about the `Presearcher` itself.

I'll give the PR a look, but when I first looked at this, my main concerns about wiring a Monitor straight into Solr were:

1. Handling commit/rollback: what do you update the tlog with if you are also writing to a "sidecar" Monitor object?
2. Handling persistence. Currently the Monitor has its own tightly sealed index. It can be configured for persistence, but if you want to peek at the segments a Monitor is writing to disk it might not be easy, which matters especially for configurations like tlog+pull. The alternative is to use only the in-memory Monitor configurations, but that has limitations and takes away precious resources from the {cacheId -> deserialized query} cache.
3. Which brings me to my final point: the cache a Monitor object wraps is a simple `ConcurrentHashMap` that is updated under a very coarse-grained lock, which can block reads for a long time. It just doesn't feel like it jibes with Solr's approach to concurrency, which is much more sophisticated (it is a fully fledged db after all).
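To make the concern in point 3 concrete, here is a minimal plain-Java sketch of the general pattern. It is not Lucene's or Solr's actual code: `QueryCacheSketch`, `CoarseLockedCache`, `SnapshotCache`, and the `slowStep` callback (a stand-in for slow query deserialization) are all hypothetical names made up for illustration. It contrasts a cache whose bulk update holds a write lock for its whole duration with one that builds a new map off to the side and publishes it with a single reference swap.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.locks.ReentrantReadWriteLock;
import java.util.function.Consumer;
import java.util.function.Supplier;

final class QueryCacheSketch {

    /** Coarse variant (hypothetical): a bulk update holds the write lock the whole time. */
    static final class CoarseLockedCache {
        private final Map<String, String> queries = new ConcurrentHashMap<>();
        private final ReentrantReadWriteLock lock = new ReentrantReadWriteLock();

        String get(String cacheId) {
            lock.readLock().lock();
            try {
                return queries.get(cacheId);
            } finally {
                lock.readLock().unlock();
            }
        }

        /** Demo-only probe: true if a reader could acquire the read lock right now. */
        boolean canReadNow() {
            if (!lock.readLock().tryLock()) {
                return false;
            }
            lock.readLock().unlock();
            return true;
        }

        void replaceAll(Map<String, String> fresh, Runnable slowStep) {
            lock.writeLock().lock();
            try {
                slowStep.run(); // stand-in for slow query deserialization
                queries.clear();
                queries.putAll(fresh);
            } finally {
                lock.writeLock().unlock();
            }
        }
    }

    /** Snapshot variant (hypothetical): build aside, publish with one volatile write. */
    static final class SnapshotCache {
        private volatile Map<String, String> snapshot = Map.of();

        String get(String cacheId) {
            return snapshot.get(cacheId);
        }

        void replaceAll(Map<String, String> fresh, Runnable slowStep) {
            slowStep.run(); // slow work happens before readers can see anything new
            snapshot = Map.copyOf(fresh);
        }
    }

    private static void await(CountDownLatch latch) {
        try {
            latch.await();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
    }

    /** Runs the update on a second thread and evaluates the probe while the update is paused in its slow step. */
    static <T> T probeDuringUpdate(Consumer<Runnable> update, Supplier<T> probe) {
        CountDownLatch updating = new CountDownLatch(1);
        CountDownLatch release = new CountDownLatch(1);
        CountDownLatch done = new CountDownLatch(1);
        Thread writer = new Thread(() -> {
            update.accept(() -> { updating.countDown(); await(release); });
            done.countDown();
        });
        writer.setDaemon(true);
        writer.start();
        await(updating);          // the update is now in the middle of its slow step
        T result = probe.get();
        release.countDown();      // let the update finish
        await(done);
        return result;
    }

    static boolean coarseReadersBlockedDuringUpdate() {
        CoarseLockedCache cache = new CoarseLockedCache();
        return probeDuringUpdate(slow -> cache.replaceAll(Map.of("q1", "new"), slow),
                                 () -> !cache.canReadNow());
    }

    static String snapshotValueSeenDuringUpdate() {
        SnapshotCache cache = new SnapshotCache();
        cache.replaceAll(Map.of("q1", "old"), () -> { });
        return probeDuringUpdate(slow -> cache.replaceAll(Map.of("q1", "new"), slow),
                                 () -> cache.get("q1"));
    }

    public static void main(String[] args) {
        System.out.println("coarse lock, readers blocked during update: "
                + coarseReadersBlockedDuringUpdate());
        System.out.println("snapshot swap, value seen during update: "
                + snapshotValueSeenDuringUpdate());
    }
}
```

In the coarse variant a reader cannot get in while the update is running; in the snapshot variant readers keep seeing the previous contents until the swap. Whether a snapshot swap is the right fit for Solr is a separate question; the sketch only shows why a long-held lock around cache updates is worrying.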

We could make the Monitor cache more configurable in the upstream Lucene monitor repo, but in my opinion Lucene monitor tries to do too much state management that it's not that good at. The most valuable thing to take advantage of is its sophisticated reverse search methods (query decomposition for faster matching, query tokenization for pre-search, term weighting, optimized document-to-query conversion with term-acceptor, and probably something else I am forgetting).
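
For readers new to the idea, the pre-search step mentioned above can be sketched in a few lines of plain Java. This is a toy under stated assumptions, not Lucene monitor's `Presearcher` implementation: `PresearchSketch` and its methods are hypothetical, saved queries are reduced to "all of these terms must appear", and "longest term" is used as a crude stand-in for real term weighting. Each saved query is indexed under one anchor term, an incoming document's tokens select candidate queries, and only those candidates are fully checked.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.Comparator;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

final class PresearchSketch {
    // queryId -> every term the saved query requires (toy model: pure conjunctions)
    private final Map<String, Set<String>> queryTerms = new HashMap<>();
    // anchor term -> ids of the saved queries indexed under that term
    private final Map<String, Set<String>> byAnchorTerm = new HashMap<>();

    /** Registers a saved query; requiredTerms must be non-empty. */
    void register(String queryId, Set<String> requiredTerms) {
        // Crude stand-in for term weighting: index the query under its longest term.
        String anchor = Collections.max(requiredTerms, Comparator.comparingInt(String::length));
        queryTerms.put(queryId, new HashSet<>(requiredTerms));
        byAnchorTerm.computeIfAbsent(anchor, k -> new HashSet<>()).add(queryId);
    }

    /** Returns the ids of saved queries matching the document tokens, in sorted order. */
    List<String> match(Set<String> docTokens) {
        // Pre-search: only queries whose anchor term occurs in the document are candidates.
        Set<String> candidates = new TreeSet<>();
        for (String token : docTokens) {
            candidates.addAll(byAnchorTerm.getOrDefault(token, Set.of()));
        }
        // Verification: run the full (toy) match only on the candidates.
        List<String> matches = new ArrayList<>();
        for (String id : candidates) {
            if (docTokens.containsAll(queryTerms.get(id))) {
                matches.add(id);
            }
        }
        return matches;
    }

    public static void main(String[] args) {
        PresearchSketch saved = new PresearchSketch();
        saved.register("pizza-deal", Set.of("pizza", "discount"));
        saved.register("sushi", Set.of("sushi"));
        saved.register("pasta-night", Set.of("pasta", "tuesday"));
        System.out.println(saved.match(Set.of("pizza", "discount", "tonight")));
    }
}
```

Because a conjunction can only match a document that contains every one of its terms, including its anchor, the pre-search never drops a real match; it only trims the number of saved queries that need the full check.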