Re: [PR] Speed up exhaustive evaluation. [lucene]

via GitHub Wed, 21 May 2025 01:52:06 -0700


jpountz commented on code in PR #14679:
URL: https://github.com/apache/lucene/pull/14679#discussion_r2099741906



##########
lucene/core/src/java/org/apache/lucene/search/Scorer.java:
##########
@@ -76,4 +77,57 @@ public int advanceShallow(int target) throws IOException {
    * {@link #advanceShallow(int) shallow-advanced} to included and {@code upTo} included.
    */
   public abstract float getMaxScore(int upTo) throws IOException;
+
+  /**
+   * Return a new batch of doc IDs and scores, starting at the current doc ID, and ending before
+   * {@code upTo}. Because it starts on the current doc ID, it is illegal to call this method if the
+   * {@link #docID() current doc ID} is {@code -1}.
+   *
+   * <p>An empty return value indicates that there are no postings left between the current doc ID
+   * and {@code upTo}.
+   *
+   * <p>Implementations should ideally fill the buffer with a number of entries comprised between 8
+   * and a couple hundreds, to keep heap requirements contained, while still being large enough to
+   * enable operations on the buffer to auto-vectorize efficiently.
+   *
+   * <p>The default implementation is provided below:
+   *
+   * <pre class="prettyprint">
+   * int batchSize = 16; // arbitrary
+   * buffer.growNoCopy(batchSize);
+   * int size = 0;
+   * DocIdSetIterator iterator = iterator();
+   * for (int doc = docID(); doc &lt; upTo &amp;&amp; size &lt; batchSize; doc = iterator.nextDoc()) {
+   *   if (liveDocs == null || liveDocs.get(doc)) {
+   *     buffer.docs[size] = doc;
+   *     buffer.scores[size] = score();
+   *     ++size;
+   *   }
+   * }
+   * buffer.size = size;
+   * </pre>
+   *
+   * <p><b>NOTE</b>: The provided {@link DocAndScoreBuffer} should not hold references to internal
+   * data structures.
+   *
+   * <p><b>NOTE</b>: In case this {@link Scorer} exposes a {@link #twoPhaseIterator()
+   * TwoPhaseIterator}, it should be positioned on a matching document before this method is called.
+   *
+   * @lucene.internal
+   */
+  public void nextDocsAndScores(int upTo, Bits liveDocs, DocAndScoreBuffer buffer)
+      throws IOException {
+    int batchSize = 16; // arbitrary
+    buffer.growNoCopy(batchSize);
+    int size = 0;
+    DocIdSetIterator iterator = iterator();

Review Comment:
   Possibly indeed. Let's look into it as a follow-up? I'm not sure if we 
should cache the iterator here or rather fix impls to avoid allocating in 
`#iterator()`.
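
   For illustration only, here is a minimal sketch of the two options being weighed. It is not Lucene code: `ToyIterator`, `AllocatingScorer`, `NonAllocatingScorer`, `CachingCaller` and the two counting helpers are hypothetical stand-ins, and the sketch assumes that some implementation allocates a new object on every `iterator()` call.

```java
public class IteratorCachingSketch {

  /** Hypothetical iterator view; the static counter tracks how many were allocated. */
  static final class ToyIterator {
    static int allocations = 0;

    ToyIterator() {
      allocations++;
    }
  }

  /** Hypothetical scorer whose iterator() allocates a new view on every call. */
  static class AllocatingScorer {
    ToyIterator iterator() {
      return new ToyIterator();
    }
  }

  /** Option 2: the implementation builds its iterator once and always returns it. */
  static final class NonAllocatingScorer extends AllocatingScorer {
    private final ToyIterator shared = new ToyIterator();

    @Override
    ToyIterator iterator() {
      return shared;
    }
  }

  /** Option 1: the caller fetches the iterator lazily and reuses it across batches. */
  static final class CachingCaller {
    private final AllocatingScorer scorer;
    private ToyIterator cached;

    CachingCaller(AllocatingScorer scorer) {
      this.scorer = scorer;
    }

    void nextBatch() {
      if (cached == null) {
        cached = scorer.iterator();
      }
      // A real batch method would now iterate with `cached`.
    }
  }

  /** Calls iterator() once per batch and returns how many iterators were allocated. */
  static int allocationsWithoutCaching(AllocatingScorer scorer, int batches) {
    int before = ToyIterator.allocations;
    for (int i = 0; i < batches; i++) {
      scorer.iterator();
    }
    return ToyIterator.allocations - before;
  }

  /** Runs the same number of batches through a caller that caches the iterator. */
  static int allocationsWithCallerCache(AllocatingScorer scorer, int batches) {
    int before = ToyIterator.allocations;
    CachingCaller caller = new CachingCaller(scorer);
    for (int i = 0; i < batches; i++) {
      caller.nextBatch();
    }
    return ToyIterator.allocations - before;
  }

  public static void main(String[] args) {
    // One allocation per batch when nothing is cached: prints 100.
    System.out.println(allocationsWithoutCaching(new AllocatingScorer(), 100));
    // Caller-side cache: prints 1.
    System.out.println(allocationsWithCallerCache(new AllocatingScorer(), 100));
    // Fixed implementation: prints 0 (its single allocation happened at construction).
    System.out.println(allocationsWithoutCaching(new NonAllocatingScorer(), 100));
  }
}
```

   In this toy model both options remove the per-batch allocation. The caller-side cache adds a field to the calling code, while fixing the implementation benefits every caller of `iterator()`.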



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]
