[ https://issues.apache.org/jira/browse/LUCENE-8757?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16844848#comment-16844848 ]
Atri Sharma commented on LUCENE-8757:
-------------------------------------
[~jpountz] Essentially, the idea is to keep the previous leaf's maxDoc
outside the per-leaf collector and move it into AssertingCollector's own
state, right?
If I understood you correctly, the attached patch should fix this. I verified
that the test added in the previous iteration specifically for out-of-order
docIDs catches this issue, but I agree that AssertingCollector should have the
right assertions in place.
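To make the idea concrete, here is a minimal sketch of a check that keeps state across leaves. It is not the attached patch and does not use Lucene's classes: Leaf and LeafOrderChecker are hypothetical stand-ins, and docBase/maxDoc are assumed to mean the leaf's first global doc ID and its document count.

```java
// Hypothetical stand-in for a leaf (segment). docBase is assumed to be the
// first global doc ID of the leaf, maxDoc the number of documents in it.
final class Leaf {
    final int docBase;
    final int maxDoc;

    Leaf(int docBase, int maxDoc) {
        this.docBase = docBase;
        this.maxDoc = maxDoc;
    }
}

// Hypothetical checker that lives at the top-level collector, so its state
// survives from one leaf to the next (a per-leaf check cannot see this).
final class LeafOrderChecker {
    // End of the global doc ID range covered by the last leaf seen.
    private int previousLeafMaxDoc = 0;

    // Call once per leaf, in the order the leaves are visited.
    void onLeaf(Leaf leaf) {
        if (leaf.docBase < previousLeafMaxDoc) {
            throw new IllegalStateException(
                "leaf with docBase=" + leaf.docBase
                    + " visited after doc IDs up to " + previousLeafMaxDoc);
        }
        previousLeafMaxDoc = leaf.docBase + leaf.maxDoc;
    }
}
```

A check held only inside each per-leaf collector verifies ordering within one leaf; the field above is what lets the check also notice leaves being visited out of doc ID order.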
{quote}Looking at the AssertingCollector again, it has a check that doc IDs are
collected in doc ID order, so I wonder why this assertion didn't trip with the
earlier version of your patch that sorted leaves by decreasing maxDoc. Maybe we
just got lucky?
{quote}
Do you think similar assertions/checks would make sense in IndexSearcher too?
If AssertingCollector missed this issue, maybe we should make IndexSearcher's
validation of its input arguments more robust as well. WDYT?
> Better Segment To Thread Mapping Algorithm
> ------------------------------------------
>
> Key: LUCENE-8757
> URL: https://issues.apache.org/jira/browse/LUCENE-8757
> Project: Lucene - Core
> Issue Type: Improvement
> Reporter: Atri Sharma
> Assignee: Simon Willnauer
> Priority: Major
> Attachments: LUCENE-8757.patch, LUCENE-8757.patch, LUCENE-8757.patch,
> LUCENE-8757.patch, LUCENE-8757.patch, LUCENE-8757.patch, LUCENE-8757.patch,
> LUCENE-8757.patch, LUCENE-8757.patch
>
>
> The current segment-to-thread allocation algorithm always allocates one
> thread per segment. This is detrimental to performance when segment sizes
> are skewed, since small segments also get their own dedicated thread, which
> can degrade performance due to context-switching overhead.
>
> A better algorithm that is cognizant of size skew would perform better in
> realistic scenarios.
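
As an illustration of the description above, here is one possible size-aware grouping, sketched under stated assumptions. It is not necessarily what the attached patches do: SliceSketch, its method name, and both thresholds are hypothetical, and segments are represented only by their document counts.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

final class SliceSketch {
    // Hypothetical thresholds, chosen only for illustration.
    static final int MAX_DOCS_PER_SLICE = 250_000;
    static final int MAX_SEGMENTS_PER_SLICE = 5;

    // Groups segment sizes (document counts) into slices, one slice per thread.
    static List<List<Integer>> group(List<Integer> segmentSizes) {
        // Visit the largest segments first.
        List<Integer> sorted = new ArrayList<>(segmentSizes);
        sorted.sort(Collections.reverseOrder());

        List<List<Integer>> slices = new ArrayList<>();
        List<Integer> current = null;
        long docsInCurrent = 0;

        for (Integer size : sorted) {
            if (size > MAX_DOCS_PER_SLICE) {
                // A large segment keeps a slice (and thread) to itself.
                List<Integer> single = new ArrayList<>();
                single.add(size);
                slices.add(single);
                continue;
            }
            if (current == null) {
                current = new ArrayList<>();
                docsInCurrent = 0;
            }
            // Small segments share a slice until it is full.
            current.add(size);
            docsInCurrent += size;
            if (docsInCurrent > MAX_DOCS_PER_SLICE
                    || current.size() >= MAX_SEGMENTS_PER_SLICE) {
                slices.add(current);
                current = null;
            }
        }
        if (current != null) {
            slices.add(current);
        }
        return slices;
    }
}
```

With one large segment and several small ones, the small segments end up sharing a slice instead of each occupying a thread, which is the skew case the description is concerned with.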