[
https://issues.apache.org/jira/browse/LUCENE-8757?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16832382#comment-16832382
]
Atri Sharma commented on LUCENE-8757:
-------------------------------------
[~simonw] Attached is an updated patch.
My two cents: partitioning segments so that document counts stay balanced
across slices is a more complex operation than what the slices API does today
(and in this patch). Fair partitioning is a known hard problem (integer
partitioning, for example).
We should also consider how much bootstrap-time latency a more complex
algorithm would add. Given that a user has the option of overriding
IndexSearcher to supply their own slicing, I feel our default algorithm
should do well on the common use case, but not more than that.
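To make the trade-off concrete, here is a rough sketch of the kind of cheap,
size-aware grouping I have in mind. It is an illustration only, not the
attached patch: the class name, the method, and both thresholds are
hypothetical, and segments are modeled as plain document counts rather than
Lucene leaf contexts. It sorts once and makes a single pass, so it makes no
attempt at an optimal (fair) partition.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

// Hypothetical sketch: segments are represented only by their doc counts.
final class SliceSketch {
    // Assumed thresholds, chosen purely for illustration.
    static final int MAX_DOCS_PER_SLICE = 250_000;
    static final int MAX_SEGMENTS_PER_SLICE = 5;

    /** Groups segment doc counts into slices; one thread would search each slice. */
    static List<List<Integer>> slices(List<Integer> segmentDocCounts) {
        // Largest segments first, so big ones are handled before small ones are packed.
        List<Integer> sorted = new ArrayList<>(segmentDocCounts);
        sorted.sort(Comparator.reverseOrder());

        List<List<Integer>> result = new ArrayList<>();
        List<Integer> current = new ArrayList<>();
        long docsInCurrent = 0;

        for (int docCount : sorted) {
            if (docCount > MAX_DOCS_PER_SLICE) {
                // A large segment gets a slice (and thread) of its own.
                result.add(List.of(docCount));
                continue;
            }
            // Small segments share a slice until either limit is reached.
            current.add(docCount);
            docsInCurrent += docCount;
            if (docsInCurrent > MAX_DOCS_PER_SLICE
                    || current.size() >= MAX_SEGMENTS_PER_SLICE) {
                result.add(current);
                current = new ArrayList<>();
                docsInCurrent = 0;
            }
        }
        if (!current.isEmpty()) {
            result.add(current); // leftover small segments
        }
        return result;
    }

    public static void main(String[] args) {
        // One large segment and six tiny ones.
        System.out.println(slices(List.of(300_000, 10, 20, 30, 40, 50, 60)));
        // prints [[300000], [60, 50, 40, 30, 20], [10]]
    }
}
```

The cost is one sort plus one linear pass over the segments, so the added
bootstrap time should be small, at the price of slices that are only roughly
balanced.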
Happy to discuss the alternatives.
> Better Segment To Thread Mapping Algorithm
> ------------------------------------------
>
> Key: LUCENE-8757
> URL: https://issues.apache.org/jira/browse/LUCENE-8757
> Project: Lucene - Core
> Issue Type: Improvement
> Reporter: Atri Sharma
> Priority: Major
> Attachments: LUCENE-8757.patch
>
>
> The current segment-to-thread allocation algorithm always allocates one
> thread per segment. This is detrimental to performance when segment sizes
> are skewed, since small segments also get their own dedicated thread. This
> can lead to performance degradation due to context-switching overhead.
>
> A better algorithm that is cognizant of size skew would perform better in
> realistic scenarios.