I am afraid the DocMap still maintains doc-id mappings till merge and I am trying to avoid it...
I think Lucene itself has a MergeIterator in the o.a.l.util package. A MergePolicy can wrap a simple MergeIterator for iterating docs across different AtomicReaders in correct sort order for a given field/term.

That should be fine, right?

--
Ravi

On Tue, Jun 17, 2014 at 1:24 PM, Shai Erera <ser...@gmail.com> wrote:

> loadSortTerm is your method, right? In the current Sorter.sort
> implementation, I see this code:
>
>     boolean sorted = true;
>     for (int i = 1; i < maxDoc; ++i) {
>       if (comparator.compare(i-1, i) > 0) {
>         sorted = false;
>         break;
>       }
>     }
>     if (sorted) {
>       return null;
>     }
>
> Perhaps you can write similar code?
>
> Also note that the sorting interface has changed, I think in 4.8, and now
> you don't really need to implement a Sorter, but rather pass a SortField,
> if that works for you.
>
> Shai
>
>
> On Tue, Jun 17, 2014 at 9:41 AM, Ravikumar Govindarajan <
> ravikumar.govindara...@gmail.com> wrote:
>
> > Shai,
> >
> > This is the code snippet I use inside my class...
> >
> > public class MySorter extends Sorter {
> >
> >   @Override
> >   public DocMap sort(AtomicReader reader) throws IOException {
> >
> >     final Map<Integer, BytesRef> docVsId = loadSortTerm(reader);
> >
> >     final Sorter.DocComparator comparator = new Sorter.DocComparator() {
> >
> >       @Override
> >       public int compare(int docID1, int docID2) {
> >         BytesRef v1 = docVsId.get(docID1);
> >         BytesRef v2 = docVsId.get(docID2);
> >         return v1.compareTo(v2);
> >       }
> >     };
> >
> >     return sort(reader.maxDoc(), comparator);
> >   }
> > }
> >
> > My problem is, the "AtomicReader" passed to the Sorter.sort method is
> > actually a SlowCompositeReader, composed of a list of AtomicReaders each
> > of which is already sorted.
> >
> > I find this "loadSortTerm(compositeReader)" to be a bit heavy where it
> > tries to load all the doc-to-term mappings eagerly...
> >
> > Are there some alternatives for this?
> >
> > --
> > Ravi
> >
> >
> > On Tue, Jun 17, 2014 at 10:58 AM, Shai Erera <ser...@gmail.com> wrote:
> >
> > > I'm not sure that I follow ... where do you see DocMap being loaded up
> > > front? Specifically, Sorter.sort may return null if the readers are
> > > already sorted ... I think we already optimized for the case where the
> > > readers are sorted.
> > >
> > > Shai
> > >
> > >
> > > On Tue, Jun 17, 2014 at 4:04 AM, Ravikumar Govindarajan <
> > > ravikumar.govindara...@gmail.com> wrote:
> > >
> > > > I am planning to use SortingMergePolicy where all the
> > > > merge-participating segments are already sorted... I understand that
> > > > I need to define a DocMap with old-new doc-id mappings.
> > > >
> > > > Is it possible to optimize the eager loading of DocMap and make it
> > > > kind of lazy load on-demand?
> > > >
> > > > Ex: Pass List<AtomicReader> to the caller and ask for next new-old
> > > > doc mapping..
> > > >
> > > > Since my segments are already sorted, I could save on memory a
> > > > little bit this way, instead of loading the full DocMap upfront
> > > >
> > > > --
> > > > Ravi
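
For reference, a minimal sketch of the lazy, merge-style iteration proposed at the top of this message. It is plain Java and deliberately uses no Lucene classes: each String[] is a hypothetical stand-in for one already-sorted AtomicReader (array index = old docID, value = sort term), and LazySortedMerge and Cursor are illustrative names, not Lucene APIs. It only holds one cursor per segment in memory, so the old-to-new mapping is produced on demand rather than loaded upfront.

```java
import java.util.Iterator;
import java.util.List;
import java.util.NoSuchElementException;
import java.util.PriorityQueue;

/**
 * Lazily merges segments whose documents are already sorted by key.
 * Assumption: each String[] stands in for one sorted segment, indexed
 * by old docID. This is an illustration, not Lucene's implementation.
 */
final class LazySortedMerge implements Iterator<int[]> {

    /** Position inside one segment: the segment index and the next docID to emit. */
    private static final class Cursor {
        final int segment;
        int doc;
        Cursor(int segment) { this.segment = segment; }
    }

    private final List<String[]> segments;
    private final PriorityQueue<Cursor> heap;

    LazySortedMerge(List<String[]> segments) {
        this.segments = segments;
        // Order cursors by the key they currently point at;
        // ties are broken by segment index so the merge is stable.
        this.heap = new PriorityQueue<>((a, b) -> {
            int c = key(a).compareTo(key(b));
            return c != 0 ? c : Integer.compare(a.segment, b.segment);
        });
        for (int i = 0; i < segments.size(); i++) {
            if (segments.get(i).length > 0) {
                heap.add(new Cursor(i));
            }
        }
    }

    private String key(Cursor c) {
        return segments.get(c.segment)[c.doc];
    }

    @Override
    public boolean hasNext() {
        return !heap.isEmpty();
    }

    /**
     * Returns {segment, oldDocID} for the next document in merged order.
     * The new docID is implied by how many documents were emitted before it.
     */
    @Override
    public int[] next() {
        Cursor top = heap.poll();
        if (top == null) {
            throw new NoSuchElementException();
        }
        int[] result = {top.segment, top.doc};
        top.doc++;
        if (top.doc < segments.get(top.segment).length) {
            heap.add(top); // re-insert the cursor at its next key
        }
        return result;
    }
}
```

The heap never grows beyond the number of segments, so memory is proportional to the segment count rather than to maxDoc. Whether a MergePolicy can actually consume such an iterator in place of a full DocMap is exactly the open question of this thread.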