[
https://issues.apache.org/jira/browse/LUCENE-8757?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16835481#comment-16835481
]
Simon Willnauer commented on LUCENE-8757:
-----------------------------------------
Thanks for the additional iteration. Now that we have simplified this, can we
remove the sorting? I don't necessarily see how the sort makes things simpler.
If we see a segment > threshold we can just add it as a group? I thought you did
that already, hence my comment about the assertion. WDYT?
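To make the "no sort" idea concrete, here is a minimal sketch of one way the
grouping could look. It is an illustration under assumptions, not what the
patch does: it works on plain per-segment doc counts rather than
LeafReaderContext, and the names SliceSketch and groupSegments are
hypothetical. A segment above the threshold becomes its own group right away;
smaller segments are accumulated in encounter order until the running doc
count exceeds the threshold.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper for illustration only; not part of Lucene.
final class SliceSketch {

  // Groups segment indices into slices in a single pass, without sorting.
  // docCounts[i] stands in for the doc count of segment i (an assumption;
  // the real code would read this from the leaf reader).
  static List<List<Integer>> groupSegments(int[] docCounts, int maxDocsPerSlice) {
    List<List<Integer>> groups = new ArrayList<>();
    List<Integer> current = new ArrayList<>();
    long currentDocs = 0;

    for (int i = 0; i < docCounts.length; i++) {
      if (docCounts[i] > maxDocsPerSlice) {
        // A segment above the threshold gets a group of its own.
        List<Integer> single = new ArrayList<>();
        single.add(i);
        groups.add(single);
        continue;
      }
      // Small segments share a group until the running total passes the threshold.
      current.add(i);
      currentDocs += docCounts[i];
      if (currentDocs > maxDocsPerSlice) {
        groups.add(current);
        current = new ArrayList<>();
        currentDocs = 0;
      }
    }
    // Flush whatever small segments are left over.
    if (!current.isEmpty()) {
      groups.add(current);
    }
    return groups;
  }
}
```

With doc counts {5, 20, 3, 4, 30, 2} and a threshold of 10, segments 1 and 4
each end up alone, while the small segments are packed together in the order
they were seen. The trade-off of skipping the sort is that the packing of small
segments depends on leaf order.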
I also want to suggest beefing up testing a bit with a randomized version of
this, along these lines:
{code}
diff --git a/lucene/test-framework/src/java/org/apache/lucene/util/LuceneTestCase.java b/lucene/test-framework/src/java/org/apache/lucene/util/LuceneTestCase.java
index 7c63a817adb..76ccca64ee7 100644
--- a/lucene/test-framework/src/java/org/apache/lucene/util/LuceneTestCase.java
+++ b/lucene/test-framework/src/java/org/apache/lucene/util/LuceneTestCase.java
@@ -1933,6 +1933,14 @@ public abstract class LuceneTestCase extends Assert {
           ret = random.nextBoolean()
               ? new AssertingIndexSearcher(random, r, ex)
               : new AssertingIndexSearcher(random, r.getContext(), ex);
+        } else if (random.nextBoolean()) {
+          int maxDocPerSlice = 1 + random.nextInt(100000);
+          ret = new IndexSearcher(r, ex) {
+            @Override
+            protected LeafSlice[] slices(List<LeafReaderContext> leaves) {
+              return slices(leaves, maxDocPerSlice);
+            }
+          };
         } else {
           ret = random.nextBoolean()
               ? new IndexSearcher(r, ex)
{code}
> Better Segment To Thread Mapping Algorithm
> ------------------------------------------
>
> Key: LUCENE-8757
> URL: https://issues.apache.org/jira/browse/LUCENE-8757
> Project: Lucene - Core
> Issue Type: Improvement
> Reporter: Atri Sharma
> Priority: Major
> Attachments: LUCENE-8757.patch, LUCENE-8757.patch, LUCENE-8757.patch
>
>
> The current segments to threads allocation algorithm always allocates one
> thread per segment. This is detrimental to performance in case of skew in
> segment sizes since small segments also get their dedicated thread. This can
> lead to performance degradation due to context switching overheads.
>
> A better algorithm which is cognizant of size skew would have better
> performance for realistic scenarios.