[ https://issues.apache.org/jira/browse/LUCENE-8757?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16838363#comment-16838363 ]
Adrien Grand commented on LUCENE-8757:
--------------------------------------
Yes. Top-docs collectors are expected to tie-break by doc ID when documents
compare equal. Things like TopDocs#merge compare doc IDs explicitly for that
purpose, but Collector#collect implementations simply rely on documents being
collected in increasing doc ID order, which lets them ignore documents that
compare equal to the current k-th best hit. So we need to sort segments within
a slice by docBase in order to get the same top hits regardless of how slices
have been constructed.
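
To make the tie-breaking point concrete, here is a minimal, self-contained
sketch. It does not use the Lucene API: Segment, Hit, collectTopK and
sortedByDocBase are hypothetical stand-ins, and the collector only mimics the
"ignore documents that compare equal to the k-th best hit" behaviour described
above. It assumes k is positive.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;
import java.util.PriorityQueue;

public class SliceTieBreakDemo {

    /** Hypothetical stand-in for a segment: its docBase plus one score per document. */
    static final class Segment {
        final int docBase;
        final float[] scores;

        Segment(int docBase, float... scores) {
            this.docBase = docBase;
            this.scores = scores;
        }
    }

    /** A collected hit, identified by its global doc ID. */
    static final class Hit {
        final int doc;
        final float score;

        Hit(int doc, float score) {
            this.doc = doc;
            this.score = score;
        }
    }

    // Orders the worst hit first: lowest score, then highest doc ID.
    static final Comparator<Hit> WORST_FIRST = (a, b) -> {
        int c = Float.compare(a.score, b.score);
        return c != 0 ? c : Integer.compare(b.doc, a.doc);
    };

    /** Collects the top k doc IDs, visiting segments in the order given. */
    static List<Integer> collectTopK(List<Segment> slice, int k) {
        PriorityQueue<Hit> queue = new PriorityQueue<>(WORST_FIRST);
        for (Segment segment : slice) {
            for (int i = 0; i < segment.scores.length; i++) {
                float score = segment.scores[i];
                // Once the queue is full, a document that merely ties the k-th best
                // hit is skipped. That only favours the lower doc ID if documents
                // arrive in increasing doc ID order.
                if (queue.size() == k && score <= queue.peek().score) {
                    continue;
                }
                queue.add(new Hit(segment.docBase + i, score));
                if (queue.size() > k) {
                    queue.poll(); // evict the current worst hit
                }
            }
        }
        List<Hit> hits = new ArrayList<>(queue);
        hits.sort(WORST_FIRST.reversed()); // best hit first
        List<Integer> docs = new ArrayList<>();
        for (Hit hit : hits) {
            docs.add(hit.doc);
        }
        return docs;
    }

    /** Returns a copy of the slice with its segments ordered by docBase. */
    static List<Segment> sortedByDocBase(List<Segment> slice) {
        List<Segment> sorted = new ArrayList<>(slice);
        sorted.sort(Comparator.comparingInt(s -> s.docBase));
        return sorted;
    }

    public static void main(String[] args) {
        Segment a = new Segment(0, 1.0f, 2.0f); // global docs 0 and 1
        Segment b = new Segment(2, 2.0f, 1.0f); // global docs 2 and 3
        List<Segment> unsorted = Arrays.asList(b, a);

        // Docs 1 and 2 tie on score 2.0; the expected winner is the lower doc ID.
        System.out.println(collectTopK(unsorted, 1));                  // prints [2]
        System.out.println(collectTopK(sortedByDocBase(unsorted), 1)); // prints [1]
    }
}
```

With the segments visited out of docBase order, the tie is resolved in favour
of doc 2 simply because it was seen first. Sorting the slice by docBase
restores the expected winner, doc 1.
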
> Better Segment To Thread Mapping Algorithm
> ------------------------------------------
>
> Key: LUCENE-8757
> URL: https://issues.apache.org/jira/browse/LUCENE-8757
> Project: Lucene - Core
> Issue Type: Improvement
> Reporter: Atri Sharma
> Assignee: Simon Willnauer
> Priority: Major
> Attachments: LUCENE-8757.patch, LUCENE-8757.patch, LUCENE-8757.patch,
> LUCENE-8757.patch
>
>
> The current segment-to-thread allocation algorithm always allocates one
> thread per segment. This is detrimental to performance when segment sizes
> are skewed, since small segments also get their own dedicated thread. This
> can lead to performance degradation due to context-switching overhead.
>
> A better algorithm that is cognizant of size skew would perform better in
> realistic scenarios.
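
As an illustration of what a size-aware allocation could look like, the
following is a rough sketch only, not the algorithm from the attached patches.
The Segment class, the slices method and both thresholds (maxDocsPerSlice,
maxSegmentsPerSlice) are assumptions made for the example. Large segments get
a slice of their own, small segments are grouped together, and each slice is
finally sorted by docBase so that tie-breaking by doc ID stays consistent.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

public class SizeAwareSlicer {

    /** Hypothetical stand-in for a segment: its docBase and number of documents. */
    static final class Segment {
        final int docBase;
        final int maxDoc;

        Segment(int docBase, int maxDoc) {
            this.docBase = docBase;
            this.maxDoc = maxDoc;
        }
    }

    /**
     * Groups segments into slices. Both thresholds are illustrative parameters,
     * not values taken from the issue.
     */
    static List<List<Segment>> slices(List<Segment> segments,
                                      int maxDocsPerSlice,
                                      int maxSegmentsPerSlice) {
        // Visit the largest segments first.
        List<Segment> bySize = new ArrayList<>(segments);
        bySize.sort((x, y) -> Integer.compare(y.maxDoc, x.maxDoc));

        List<List<Segment>> slices = new ArrayList<>();
        List<Segment> group = null;
        int docSum = 0;
        for (Segment segment : bySize) {
            if (segment.maxDoc > maxDocsPerSlice) {
                // A large segment is worth a dedicated slice (and thread).
                List<Segment> own = new ArrayList<>();
                own.add(segment);
                slices.add(own);
                continue;
            }
            if (group == null) {
                group = new ArrayList<>();
                docSum = 0;
            }
            group.add(segment);
            docSum += segment.maxDoc;
            // Close the group once it holds enough segments or documents.
            if (group.size() >= maxSegmentsPerSlice || docSum > maxDocsPerSlice) {
                slices.add(group);
                group = null;
            }
        }
        if (group != null) {
            slices.add(group);
        }
        // Keep segments within each slice in docBase order.
        for (List<Segment> slice : slices) {
            slice.sort(Comparator.comparingInt(s -> s.docBase));
        }
        return slices;
    }
}
```

For example, with one 1000-document segment and four segments of 5 to 20
documents, calling slices(segments, 100, 3) yields three slices instead of
five: the large segment alone, a group of three small segments, and the
remaining small segment.
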