[GitHub] [solr] dsmiley commented on a change in pull request #592: SOLR-14765: Optimize DocList creation by skipping sort for sort-irrelevant cases

GitBox Thu, 31 Mar 2022 04:03:46 -0700


dsmiley commented on a change in pull request #592:
URL: https://github.com/apache/solr/pull/592#discussion_r839472975




##########
File path: solr/core/src/java/org/apache/solr/search/SolrIndexSearcher.java
##########
@@ -935,22 +956,83 @@ private DocSet getAndCacheDocSet(Query query) throws 
IOException {
     return filterCache.computeIfAbsent(query, q -> getDocSetNC(q, null));
   }
 
-  private static Query matchAllDocsQuery = new MatchAllDocsQuery();
-  private volatile BitDocSet liveDocs;
+  private static final MatchAllDocsQuery MATCH_ALL_DOCS_QUERY = new 
MatchAllDocsQuery();
+
+  /**
+   * A naively cached canonical `liveDocs` DocSet. This does not need to be 
volatile. It may be set
+   * multiple times, but should always be set to the same value, as all set 
values should pass
+   * through `liveDocsCache.computeIfAbsent`

Review comment:
       obsolete comment

##########
File path: solr/core/src/java/org/apache/solr/search/SolrIndexSearcher.java
##########
@@ -118,10 +121,23 @@
   private final int queryResultMaxDocsCached;
   private final boolean useFilterForSortedQuery;
 
+  /**
+   * Special-case cache, mainly used as a synchronization point to handle the 
lazy-init of {@link
+   * #liveDocs}.
+   */
+  private final BitDocSet[] liveDocsCache = new BitDocSet[1];

Review comment:
       This can replaced with `Object liveDocsSync = new Object()`; no?
   Furthermore, declare it adjacent to liveDocs; keep things together/organized.

##########
File path: solr/core/src/java/org/apache/solr/search/SolrIndexSearcher.java
##########
@@ -935,22 +956,83 @@ private DocSet getAndCacheDocSet(Query query) throws 
IOException {
     return filterCache.computeIfAbsent(query, q -> getDocSetNC(q, null));
   }
 
-  private static Query matchAllDocsQuery = new MatchAllDocsQuery();
-  private volatile BitDocSet liveDocs;
+  private static final MatchAllDocsQuery MATCH_ALL_DOCS_QUERY = new 
MatchAllDocsQuery();
+
+  /**
+   * A naively cached canonical `liveDocs` DocSet. This does not need to be 
volatile. It may be set
+   * multiple times, but should always be set to the same value, as all set 
values should pass
+   * through `liveDocsCache.computeIfAbsent`
+   */
+  private BitDocSet liveDocs;
+
+  private static final BitDocSet EMPTY = new BitDocSet(new FixedBitSet(0), 0);
+
+  private BitDocSet computeLiveDocs() {
+    switch (leafContexts.size()) {
+      case 0:
+        assert numDocs() == 0;
+        return EMPTY;
+      case 1:
+        final Bits onlySegLiveDocs = 
leafContexts.get(0).reader().getLiveDocs();
+        final FixedBitSet fbs;
+        if (onlySegLiveDocs == null) {
+          // `LeafReader.getLiveDocs()` returns null if no deleted docs -- 
accordingly, set all bits
+          final int onlySegMaxDoc = maxDoc();
+          fbs = new FixedBitSet(onlySegMaxDoc);
+          fbs.set(0, onlySegMaxDoc);
+        } else {
+          fbs = FixedBitSet.copyOf(onlySegLiveDocs);
+        }
+        assert fbs.cardinality() == numDocs();
+        return new BitDocSet(fbs, numDocs());
+      default:
+        final FixedBitSet bs = new FixedBitSet(maxDoc());
+        for (LeafReaderContext ctx : leafContexts) {
+          final LeafReader r = ctx.reader();
+          final Bits segLiveDocs = r.getLiveDocs();
+          final int segDocBase = ctx.docBase;
+          if (segLiveDocs == null) {
+            // `LeafReader.getLiveDocs()` returns null if no deleted docs -- 
accordingly, set all
+            // bits in seg range
+            bs.set(segDocBase, segDocBase + r.maxDoc());
+          } else {
+            DocSetUtil.copyTo(segLiveDocs, 0, r.maxDoc(), bs, segDocBase);
+          }
+        }
+        assert bs.cardinality() == numDocs();
+        return new BitDocSet(bs, numDocs());
+    }
+  }
+
+  private BitDocSet populateLiveDocs(Supplier<BitDocSet> liveDocsSupplier) {
+    final BitDocSet inlineDocs;

Review comment:
       As I suggested in my comment above, this can be simplified with merely 
an Object used to synchronize.  No need for a "inlineDocs" local variable.  No 
need for an array or anything similar-is.  Simplicity!

##########
File path: solr/core/src/java/org/apache/solr/search/SolrIndexSearcher.java
##########
@@ -968,19 +1050,23 @@ public Bits getLiveDocsBits() throws IOException {
   }
 
   /**
+   * If some process external to {@link SolrIndexSearcher} has produced a 
DocSet whose cardinality
+   * matches that of `liveDocs`, this method provides such caller the ability 
to offer its own
+   * DocSet to be cached in the searcher. The caller should then use the 
returned value (which may
+   * or may not be derived from the DocSet instance supplied), allowing more 
efficient memory use.
+   *
    * @lucene.internal
    */
-  public boolean isLiveDocsInstantiated() {
-    return liveDocs != null;
-  }
-
-  /**
-   * @lucene.internal
-   */
-  public void setLiveDocs(DocSet docs) {
+  public BitDocSet offerLiveDocs(Supplier<DocSet> docSetSupplier, int 
suppliedSize) {
+    assert suppliedSize == numDocs();
+    BitDocSet ret = liveDocs;
+    if (ret != null) {
+      liveDocsNaiveCacheHitCount.increment();
+      return ret;
+    }
     // a few places currently expect BitDocSet
-    assert docs.size() == numDocs();
-    this.liveDocs = makeBitDocSet(docs);
+    ret = populateLiveDocs(() -> makeBitDocSet(docSetSupplier.get()));

Review comment:
       pointless; I'd do this as one line -- just return it




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@solr.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscr...@solr.apache.org
For additional commands, e-mail: issues-h...@solr.apache.org

[GitHub] [solr] dsmiley commented on a change in pull request #592: SOLR-14765: Optimize DocList creation by skipping sort for sort-irrelevant cases

Reply via email to