javanna commented on code in PR #13542:
URL: https://github.com/apache/lucene/pull/13542#discussion_r1697317194
##########
lucene/core/src/java/org/apache/lucene/search/IndexSearcher.java:
##########
@@ -328,42 +336,65 @@ protected LeafSlice[] slices(List<LeafReaderContext>
leaves) {
/** Static method to segregate LeafReaderContexts amongst multiple slices */
public static LeafSlice[] slices(
List<LeafReaderContext> leaves, int maxDocsPerSlice, int
maxSegmentsPerSlice) {
+
+ // TODO this is a temporary hack to force testing against multiple leaf
reader context slices.
+ // It must be reverted before merging.
+ maxDocsPerSlice = 1;
+ maxSegmentsPerSlice = 1;
+ // end hack
+
// Make a copy so we can sort:
List<LeafReaderContext> sortedLeaves = new ArrayList<>(leaves);
// Sort by maxDoc, descending:
- Collections.sort(
- sortedLeaves, Collections.reverseOrder(Comparator.comparingInt(l ->
l.reader().maxDoc())));
+ sortedLeaves.sort(Collections.reverseOrder(Comparator.comparingInt(l ->
l.reader().maxDoc())));
- final List<List<LeafReaderContext>> groupedLeaves = new ArrayList<>();
- long docSum = 0;
- List<LeafReaderContext> group = null;
+ final List<List<LeafReaderContextPartition>> groupedLeafPartitions = new
ArrayList<>();
+ int currentSliceNumDocs = 0;
+ List<LeafReaderContextPartition> group = null;
for (LeafReaderContext ctx : sortedLeaves) {
if (ctx.reader().maxDoc() > maxDocsPerSlice) {
assert group == null;
- groupedLeaves.add(Collections.singletonList(ctx));
+ // if the segment does not fit in a single slice, we split it in
multiple partitions of
Review Comment:
Going back to this, especially on @stefanvodita 's comment above, it may
make sense to introduce a different way to control the number of slices,
instead of using the existing maxDocsPerSlice argument. It is kind of nice that
we can apply the existing value and partition segments accordingly, but that
can also be surprising and generate way too many partitions perhaps. Good thing
to get further feedback on.
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]