[ 
https://issues.apache.org/jira/browse/SOLR-17310?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17854761#comment-17854761
 ] 

Christine Poerschke commented on SOLR-17310:
--------------------------------------------

Following on from my 
[https://github.com/apache/solr/pull/2477/files#r1638276023] comment:
{quote}... Leaf sorting is "between segment sorting" and we also have index 
sorting i.e. "within segment sorting" – I wonder if there might be enough 
commonality to generalise. ...
{quote}
Perhaps something like
{code:java}
public abstract class IndexSorters {
  public abstract Comparator<LeafReader> getLeafSorter(); // via SOLR-17310 here
  public abstract Sort getIndexSort(); // via SOLR-13681 i.e. not here
}
{code}
looking something like this in configuration (with both elements optional)
{code:xml}
<indexSorters class="org.apache.solr.index.DefaultIndexSorters">
  <str name="betweenSegmentSort">timestamp_i_dvo desc</str>
  <str name="withinSegmentSort">timestamp_i_dvo desc</str>
</indexSorters>
{code}
e.g. similar to
[https://solr.apache.org/guide/solr/latest/configuration-guide/index-segments-merging.html#mergepolicyfactory]
and something like
{code:java}
public class DefaultIndexSorters extends IndexSorters {

  // betweenSegmentSort, withinSegmentSort and schema are assumed to be fields
  // populated from the <indexSorters> configuration

  @Override
  public Comparator<LeafReader> getLeafSorter() {
    if (betweenSegmentSort != null) {
      final Sort sort = SortSpecParsing.parseSortSpec(betweenSegmentSort, schema).getSort();
      // check that sort contains only one field and that it's of suitable type
      // construct and return comparator similar to
      // https://github.com/apache/lucene/blob/releases/lucene/9.10.0/lucene/core/src/test/org/apache/lucene/index/TestIndexWriterReader.java#L1217-L1237
    }
    return null;
  }

  @Override
  public Sort getIndexSort() {
    if (withinSegmentSort != null) {
      return SortSpecParsing.parseSortSpec(withinSegmentSort, schema).getSort();
    } else {
      return null;
    }
  }

}
{code}
as a default implementation.
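
For illustration only, the "check that sort contains only one field" and "construct and return comparator" steps could be sketched roughly as below. This is plain Java with no Lucene or Solr dependency: {{SegmentStub}} and {{LeafSorterSketch}} are hypothetical stand-ins (a real implementation would compare {{LeafReader}} instances and parse via {{SortSpecParsing}}), and the per-field maximum map is an assumption about what the comparator would read from each segment.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;
import java.util.Map;

/** Hypothetical stand-in for a segment; a real implementation would use LeafReader. */
final class SegmentStub {
  final String name;
  final Map<String, Long> maxValues; // assumed: per-field maximum value in this segment

  SegmentStub(String name, Map<String, Long> maxValues) {
    this.name = name;
    this.maxValues = maxValues;
  }
}

final class LeafSorterSketch {

  /** Builds a comparator from a single-clause sort spec such as "timestamp_i_dvo desc". */
  static Comparator<SegmentStub> fromSortSpec(String sortSpec) {
    // check that the sort contains only one field
    if (sortSpec.contains(",")) {
      throw new IllegalArgumentException("expected exactly one sort clause: " + sortSpec);
    }
    String[] parts = sortSpec.trim().split("\\s+");
    if (parts.length != 2) {
      throw new IllegalArgumentException("expected '<field> <asc|desc>': " + sortSpec);
    }
    final String field = parts[0];
    final boolean descending;
    if ("desc".equalsIgnoreCase(parts[1])) {
      descending = true;
    } else if ("asc".equalsIgnoreCase(parts[1])) {
      descending = false;
    } else {
      throw new IllegalArgumentException("unknown sort direction: " + parts[1]);
    }
    // segments without the field are treated as having the smallest possible value
    Comparator<SegmentStub> byMax =
        Comparator.comparingLong(s -> s.maxValues.getOrDefault(field, Long.MIN_VALUE));
    return descending ? byMax.reversed() : byMax;
  }

  /** Returns segment names in the order they would be searched. */
  static List<String> searchOrder(List<SegmentStub> segments, String sortSpec) {
    List<SegmentStub> sorted = new ArrayList<>(segments);
    sorted.sort(fromSortSpec(sortSpec));
    List<String> names = new ArrayList<>();
    for (SegmentStub s : sorted) {
      names.add(s.name);
    }
    return names;
  }
}
```

With {{timestamp_i_dvo desc}} the segment holding the newest timestamp would be searched first; with {{asc}} the order is reversed.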

Or perhaps something other than
{{<str name="betweenSegmentSort">timestamp_i_dvo desc</str>}}
would be a more generally meaningful default implementation?

Whatever the default implementation, the
{{<indexSorters class="org.apache.solr.index.DefaultIndexSorters">}}
class attribute would allow for custom sorters too.

> Configurable LeafSorter to customize segment search order
> ---------------------------------------------------------
>
>                 Key: SOLR-17310
>                 URL: https://issues.apache.org/jira/browse/SOLR-17310
>             Project: Solr
>          Issue Type: New Feature
>          Components: search
>            Reporter: wei wang
>            Priority: Minor
>          Time Spent: 1.5h
>  Remaining Estimate: 0h
>
> Lucene's IndexWriterConfig provides the option to sort leaf readers when a 
> custom LeafSorter is provided.   
> [https://github.com/apache/lucene/blob/main/lucene/core/src/java/org/apache/lucene/index/IndexWriterConfig.java#L494]
>  
> The functionality is currently not directly exposed in Solr. One use case is 
> early termination, where we would like to search the more recently updated 
> segments first. The SegmentTimeLeafSorter sorts the LeafReaders by timestamp 
> so that recent NRT segments can be traversed first. The feature is 
> enabled by adding the *segmentSort* config in solrconfig.xml. Without the 
> config, no sorting is applied by default.
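
To make the early-termination use case in the description concrete, here is a toy sketch of why segment order matters when collecting the k most recent timestamps. It is not Lucene's collector logic: {{EarlyTerminationSketch}} is a hypothetical name, each segment is modelled as a plain array of timestamps, and the per-segment maximum is computed by scanning, whereas a real index would presumably read it from segment metadata.

```java
import java.util.List;
import java.util.PriorityQueue;

final class EarlyTerminationSketch {

  /** Returns how many segments had to be scanned to collect the k largest timestamps. */
  static int segmentsVisited(List<long[]> segments, int k) {
    // min-heap holding the k largest timestamps seen so far
    PriorityQueue<Long> topK = new PriorityQueue<>();
    int visited = 0;
    for (long[] segment : segments) {
      long segmentMax = Long.MIN_VALUE;
      for (long ts : segment) {
        segmentMax = Math.max(segmentMax, ts);
      }
      // once the heap is full, a segment whose maximum cannot beat the
      // current k-th best value contributes nothing and can be skipped
      if (topK.size() == k && segmentMax <= topK.peek()) {
        continue;
      }
      visited++;
      for (long ts : segment) {
        if (topK.size() < k) {
          topK.add(ts);
        } else if (ts > topK.peek()) {
          topK.poll();
          topK.add(ts);
        }
      }
    }
    return visited;
  }
}
```

In this toy model, with three non-overlapping segments and k = 2, visiting the newest segment first scans one segment, while visiting the oldest first scans all three.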



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscr...@solr.apache.org
For additional commands, e-mail: issues-h...@solr.apache.org
