[
https://issues.apache.org/jira/browse/LUCENE-8757?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16834767#comment-16834767
]
Simon Willnauer commented on LUCENE-8757:
-----------------------------------------
[~atris] I think the assertion in this part doesn't hold:
{code}
+ for (LeafReaderContext ctx : sortedLeaves) {
+ if (ctx.reader().maxDoc() > maxDocsPerSlice) {
+ assert group == null;
+ List singleSegmentSlice = new ArrayList();
{code}
If the previous segment was smallish, then _group_ is non-null at this point.
I think you should test these cases; maybe add a random test that randomizes
the order of the segments?
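To illustrate the case above, here is a minimal, self-contained sketch of such a randomized check. It is not the patch itself: segments are reduced to their maxDoc values, and the class name SliceSketch, the method slices, and the rule that a group is closed once its document sum exceeds maxDocsPerSlice are all assumptions made for the example.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;
import java.util.Random;

class SliceSketch {

  // Hypothetical stand-in for the grouping loop: each Integer is one
  // segment's maxDoc. The group-closing rule below is an assumption.
  static List<List<Integer>> slices(List<Integer> maxDocs, int maxDocsPerSlice) {
    List<List<Integer>> groupedLeaves = new ArrayList<>();
    List<Integer> group = null;
    int docSum = 0;
    for (int maxDoc : maxDocs) {
      if (maxDoc > maxDocsPerSlice) {
        // Large segment: it gets a slice of its own. Nothing is asserted
        // about `group`, which may be non-null if a small segment came first.
        groupedLeaves.add(Collections.singletonList(maxDoc));
      } else {
        if (group == null) {
          group = new ArrayList<>();
          groupedLeaves.add(group);
          docSum = 0;
        }
        group.add(maxDoc);
        docSum += maxDoc;
        if (docSum > maxDocsPerSlice) {
          group = null; // close the current group (assumed rule)
        }
      }
    }
    return groupedLeaves;
  }

  public static void main(String[] args) {
    Random random = new Random(42);
    int maxDocsPerSlice = 1000;
    for (int iter = 0; iter < 100; iter++) {
      List<Integer> maxDocs = new ArrayList<>();
      int numSegments = 1 + random.nextInt(20);
      for (int i = 0; i < numSegments; i++) {
        maxDocs.add(1 + random.nextInt(2000));
      }
      Collections.shuffle(maxDocs, random); // randomize segment order
      int seen = 0;
      for (List<Integer> slice : slices(maxDocs, maxDocsPerSlice)) {
        seen += slice.size();
        for (int maxDoc : slice) {
          // A large segment must never share its slice.
          if (maxDoc > maxDocsPerSlice && slice.size() != 1) {
            throw new AssertionError("large segment grouped: " + slice);
          }
        }
      }
      // Every segment must land in exactly one slice.
      if (seen != maxDocs.size()) {
        throw new AssertionError("segments lost or duplicated");
      }
    }
    System.out.println(slices(Arrays.asList(10, 5000, 20), maxDocsPerSlice));
  }
}
```

With the input order 10, 5000, 20, the large segment arrives while a group is still open, which is exactly the situation where an `assert group == null` would trip.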
This:
{code}
+ List singleSegmentSlice = new ArrayList();
+
+ singleSegmentSlice.add(ctx);
+ groupedLeaves.add(singleSegmentSlice);
{code}
can and should be replaced by:
{code}
groupedLeaves.add(Collections.singletonList(ctx));
{code}
Otherwise it looks good.
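As a small sketch of the singletonList suggestion: the two forms produce equal lists, but the list returned by Collections.singletonList is immutable, so it is only a drop-in replacement if nothing adds to the slice later. The class and method names here are hypothetical, and a String stands in for LeafReaderContext.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

class SingletonSliceSketch {

  // The three-line form: a mutable list holding one element.
  static List<String> verbose(String ctx) {
    List<String> singleSegmentSlice = new ArrayList<>();
    singleSegmentSlice.add(ctx);
    return singleSegmentSlice;
  }

  // The one-line form: an immutable list holding one element.
  static List<String> concise(String ctx) {
    return Collections.singletonList(ctx);
  }

  // Returns true if adding to the list is rejected.
  static boolean rejectsAdd(List<String> list) {
    try {
      list.add("another");
      return false;
    } catch (UnsupportedOperationException e) {
      return true;
    }
  }

  public static void main(String[] args) {
    System.out.println(verbose("seg0").equals(concise("seg0")));
    System.out.println(rejectsAdd(concise("seg0")));
  }
}
```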
> Better Segment To Thread Mapping Algorithm
> ------------------------------------------
>
> Key: LUCENE-8757
> URL: https://issues.apache.org/jira/browse/LUCENE-8757
> Project: Lucene - Core
> Issue Type: Improvement
> Reporter: Atri Sharma
> Priority: Major
> Attachments: LUCENE-8757.patch, LUCENE-8757.patch
>
>
> The current segments to threads allocation algorithm always allocates one
> thread per segment. This is detrimental to performance in case of skew in
> segment sizes since small segments also get their dedicated thread. This can
> lead to performance degradation due to context switching overheads.
>
> A better algorithm which is cognizant of size skew would have better
> performance for realistic scenarios.