Not quite sure what I'd call it. For an unchanged index (i.e. no merging) the order is completely stable. But when merges happen, as you saw, the final order of tied documents can change because the internal Lucene doc IDs get reassigned.
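To make the tie-breaking concrete, here is a rough sketch in plain Java. It is not Lucene code: the Hit class, the doc IDs, and the "key" field are all made up for illustration. It only mimics the idea that equal scores fall back to the internal doc ID, so renumbering the docs (as a merge does) reorders the ties, while a tie-break on an application-level unique key does not care about renumbering.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;

public class TieBreakDemo {
    // Hypothetical stand-in for a search hit. None of this is Lucene API.
    static final class Hit {
        final int docId;    // internal ID, may be reassigned by a merge
        final float score;  // relevance score
        final String key;   // application-level unique key, never reassigned

        Hit(int docId, float score, String key) {
            this.docId = docId;
            this.score = score;
            this.key = key;
        }
    }

    // Descending score; ties broken by ascending internal doc ID.
    static List<String> rankByDocId(List<Hit> hits) {
        Comparator<Hit> byScoreDesc = (a, b) -> Float.compare(b.score, a.score);
        return keysInOrder(hits, byScoreDesc.thenComparingInt(h -> h.docId));
    }

    // Descending score; ties broken by the application-level key instead.
    static List<String> rankByKey(List<Hit> hits) {
        Comparator<Hit> byScoreDesc = (a, b) -> Float.compare(b.score, a.score);
        return keysInOrder(hits, byScoreDesc.thenComparing(h -> h.key));
    }

    private static List<String> keysInOrder(List<Hit> hits, Comparator<Hit> order) {
        List<Hit> sorted = new ArrayList<>(hits);
        sorted.sort(order);
        List<String> keys = new ArrayList<>();
        for (Hit h : sorted) {
            keys.add(h.key);
        }
        return keys;
    }

    // Made-up doc IDs before a merge: "a" and "b" tie on score.
    static List<Hit> beforeMerge() {
        return Arrays.asList(new Hit(5, 1.0f, "a"), new Hit(2, 1.0f, "b"), new Hit(7, 2.0f, "c"));
    }

    // Same docs and scores, but with made-up renumbered doc IDs after a merge.
    static List<Hit> afterMerge() {
        return Arrays.asList(new Hit(0, 1.0f, "a"), new Hit(3, 1.0f, "b"), new Hit(1, 2.0f, "c"));
    }

    public static void main(String[] args) {
        System.out.println(rankByDocId(beforeMerge())); // [c, b, a]
        System.out.println(rankByDocId(afterMerge()));  // [c, a, b]
        System.out.println(rankByKey(beforeMerge()));   // [c, a, b]
        System.out.println(rankByKey(afterMerge()));    // [c, a, b]
    }
}
```

In Lucene itself the rough equivalent of rankByKey would be to pass a second SortField on some unique field after SortField.FIELD_SCORE to the Sort(SortField...) constructor, assuming you have such a field indexed in a sortable way.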
Furthermore, the _scores_ can change due to merging, because the merge operation can change the tf/idf numbers. Consider the pathological situation of two docs in my corpus of 100 docs: doc1.title = "my dog has fleas" and doc2.title = "fleas..." where "fleas" is repeated 100 times. No other doc has "fleas" in the title. Now doc2 is deleted (but not merged away) and I search for title:(fleas OR [word appears in 10 other docs' titles]). doc1 will probably be at the bottom of the list, because the deleted doc2 still contributes to the term statistics until it is merged away. Now I forceMerge and doc1 will appear at the top, all other things being equal.

So not sure how I'd characterize all that...

FWIW,
Erick

On Sun, Nov 16, 2014 at 8:56 AM, Ahmet Arslan (JIRA) <[email protected]> wrote:
>
>     [ https://issues.apache.org/jira/browse/LUCENE-6057?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14213961#comment-14213961 ]
>
> Ahmet Arslan commented on LUCENE-6057:
> --------------------------------------
>
> Thanks Martin, for clarifying this! I has some cases where multiple documents
> get assigned to same score. Order of the documents was changing from
> core/index to core/index. So I thought sorting algorithm lucene use, is not
> stable (like heap sort, selection sort etc.). But it looks like thats not
> the case? When default sort is used (sort by score/relevancy) internal lucene
> ids are used to break tie. And those ids cange during segment merge etc. Is
> it wrong to say that lucene uses a non-stable sort?
>
>> Clarify the Sort(SortField...) constructor)
>> -------------------------------------------
>>
>>                 Key: LUCENE-6057
>>                 URL: https://issues.apache.org/jira/browse/LUCENE-6057
>>             Project: Lucene - Core
>>          Issue Type: Improvement
>>          Components: core/search
>>    Affects Versions: 4.10.2, Trunk
>>            Reporter: Martin Braun
>>            Assignee: Michael McCandless
>>            Priority: Minor
>>              Labels: Clarification, Documentation, New_Users, Sort
>>             Fix For: 4.10.2, 5.0, Trunk
>>
>>         Attachments: LUCENE-6057.patch
>>
>>
>> I don't really know which version this affects, but I clarified the
>> documentation of the Sort(SortField...) constructor to ease the
>> understanding for new users.
>> Pull Request:
>> https://github.com/apache/lucene-solr/pull/20
>
>
>
> --
> This message was sent by Atlassian JIRA
> (v6.3.4#6332)
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: [email protected]
> For additional commands, e-mail: [email protected]
