[ 
https://issues.apache.org/jira/browse/LUCENE-8757?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16834767#comment-16834767
 ] 

Simon Willnauer commented on LUCENE-8757:
-----------------------------------------

[~atris] I think the assertion in this part doesn't hold:

{code}
+    for (LeafReaderContext ctx : sortedLeaves) {
+      if (ctx.reader().maxDoc() > maxDocsPerSlice) {
+        assert group == null;
+        List singleSegmentSlice = new ArrayList();
{code}

If the previous segment was smallish, then _group_ is non-null at this point, 
isn't it? I think you should test these cases, maybe add a random test that 
randomizes the order of the segments?
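
To make the concern concrete, here is a rough, self-contained sketch of the 
kind of grouping loop in question. It is not the patch itself: segments are 
represented only by their maxDoc as plain integers, and the class name, the 
Result holder, and the rule for closing a group (document sum exceeds 
maxDocsPerSlice) are assumptions made for illustration. It records whether a 
group was still open when a large segment arrived, which is the situation 
where "assert group == null" would trip, and adds a seeded shuffle check of 
the sort a randomized test could perform.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Random;

/** Hypothetical stand-in for the grouping loop; a segment is just its maxDoc here. */
final class SliceGroupingSketch {

  /** The computed slices, plus a flag for the case the assertion would reject. */
  static final class Result {
    final List<List<Integer>> slices = new ArrayList<>();
    boolean groupOpenAtLargeSegment = false;
  }

  /** Groups segments in the order given; large segments get a slice of their own. */
  static Result group(List<Integer> maxDocs, int maxDocsPerSlice) {
    Result result = new Result();
    List<Integer> group = null; // currently open slice of small segments, if any
    long docSum = 0;
    for (int maxDoc : maxDocs) {
      if (maxDoc > maxDocsPerSlice) {
        if (group != null) {
          // A small segment came first and left a group open:
          // "assert group == null" would fail at this point.
          result.groupOpenAtLargeSegment = true;
        }
        result.slices.add(Collections.singletonList(maxDoc));
      } else {
        if (group == null) {
          group = new ArrayList<>();
          result.slices.add(group);
          docSum = 0;
        }
        group.add(maxDoc);
        docSum += maxDoc;
        if (docSum > maxDocsPerSlice) {
          group = null; // assumed closing rule: slice is full, start a new one next time
        }
      }
    }
    return result;
  }

  /** Randomized check: in any order, every segment lands in exactly one slice. */
  static boolean coversAllSegments(List<Integer> maxDocs, int maxDocsPerSlice,
                                   long seed, int iterations) {
    Random random = new Random(seed);
    List<Integer> expected = new ArrayList<>(maxDocs);
    Collections.sort(expected);
    List<Integer> shuffled = new ArrayList<>(maxDocs);
    for (int i = 0; i < iterations; i++) {
      Collections.shuffle(shuffled, random);
      List<Integer> seen = new ArrayList<>();
      for (List<Integer> slice : group(shuffled, maxDocsPerSlice).slices) {
        seen.addAll(slice);
      }
      Collections.sort(seen);
      if (!seen.equals(expected)) {
        return false;
      }
    }
    return true;
  }
}
```

With sizes (10, 1000) and maxDocsPerSlice = 100, the small segment opens a 
group before the large one is visited, so the flag is set. With (1000, 10) it 
is not. Whether the real code can ever see the first ordering depends on how 
sortedLeaves is sorted, which is what a randomized test would pin down.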

This:
{code}
+        List singleSegmentSlice = new ArrayList();
+
+        singleSegmentSlice.add(ctx);
+        groupedLeaves.add(singleSegmentSlice);
{code}
can and should be replaced by:

{code}
groupedLeaves.add(Collections.singletonList(ctx));
{code}
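
As a small illustration of that replacement, the sketch below puts the two 
forms side by side. The class and method names are hypothetical, and a generic 
element type stands in for LeafReaderContext. One difference worth knowing: 
the list returned by Collections.singletonList is immutable, so the swap 
assumes nothing adds to a single-segment slice later.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

/** Hypothetical helper comparing the two ways of building a one-element slice. */
final class SingletonSliceSketch {

  /** Verbose form: allocate an ArrayList and add the single element. */
  static <T> List<T> verbose(T ctx) {
    List<T> singleSegmentSlice = new ArrayList<>();
    singleSegmentSlice.add(ctx);
    return singleSegmentSlice;
  }

  /** Compact form: an immutable one-element list. */
  static <T> List<T> compact(T ctx) {
    return Collections.singletonList(ctx);
  }

  /** Returns true if the slice accepts another element, false if it is immutable. */
  static <T> boolean isMutable(List<T> slice, T extra) {
    try {
      slice.add(extra);
      return true;
    } catch (UnsupportedOperationException e) {
      return false;
    }
  }
}
```

Both forms compare equal as lists; only the ArrayList version accepts further 
add calls.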


Otherwise it looks good.

> Better Segment To Thread Mapping Algorithm
> ------------------------------------------
>
>                 Key: LUCENE-8757
>                 URL: https://issues.apache.org/jira/browse/LUCENE-8757
>             Project: Lucene - Core
>          Issue Type: Improvement
>            Reporter: Atri Sharma
>            Priority: Major
>         Attachments: LUCENE-8757.patch, LUCENE-8757.patch
>
>
> The current segments to threads allocation algorithm always allocates one 
> thread per segment. This is detrimental to performance in case of skew in 
> segment sizes since small segments also get their dedicated thread. This can 
> lead to performance degradation due to context switching overheads.
>  
> A better algorithm which is cognizant of size skew would have better 
> performance for realistic scenarios.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
