Re: Merging ordered segments without re-sorting.

2013-10-24 Thread Adrien Grand
Hi,
On Thu, Oct 24, 2013 at 12:20 AM, Arvind Kalyan wrote:
> I will benchmark the available approach itself then, in that case. Will
> revert back if the performance is unacceptable.
For the record, last time I checked, indexing was 2x slower on average on a 10M document collection (see https://

Re: Merging ordered segments without re-sorting.

2013-10-23 Thread Arvind Kalyan
On Wed, Oct 23, 2013 at 2:45 PM, Adrien Grand wrote:
> Hi,
>
> On Wed, Oct 23, 2013 at 10:19 PM, Arvind Kalyan wrote:
> > Sorting is not an option for our case so we will most likely implement a
> > variant that merges the segments in one pass. Using TimSort is great but in
> > our case the 2

Re: Merging ordered segments without re-sorting.

2013-10-23 Thread Adrien Grand
Hi,
On Wed, Oct 23, 2013 at 10:19 PM, Arvind Kalyan wrote:
> Sorting is not an option for our case so we will most likely implement a
> variant that merges the segments in one pass. Using TimSort is great but in
> our case the 2 segments will be highly interspersed and would not benefit
> from th

Re: Merging ordered segments without re-sorting.

2013-10-23 Thread Arvind Kalyan
Thanks again. Sorting is not an option for our case so we will most likely implement a variant that merges the segments in one pass. Using TimSort is great but in our case the 2 segments will be highly interspersed and would not benefit from the galloping in TimSort. In addition, if anyone else
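
To make the one-pass idea above concrete, here is a minimal sketch in plain Java of a two-way merge over two already-sorted runs of sort keys. It is not Lucene code and not the poster's implementation: the class name SortedSegmentMerge, the method mergeOrder, and the use of long sort keys are all assumptions made for illustration. Documents of the second segment are numbered after those of the first, as if the two segments had been concatenated.

```java
import java.util.Arrays;

// Hypothetical helper, not part of Lucene: computes the merged document
// order for two segments whose sort keys are each already in ascending order.
final class SortedSegmentMerge {

    // Returns, for each position in the merged order, the source document:
    // values 0..a.length-1 refer to segment A, and values
    // a.length..a.length+b.length-1 refer to segment B.
    public static int[] mergeOrder(long[] a, long[] b) {
        int[] order = new int[a.length + b.length];
        int i = 0, j = 0, k = 0;
        while (i < a.length && j < b.length) {
            // "<=" keeps the merge stable: on ties, segment A goes first.
            if (a[i] <= b[j]) {
                order[k++] = i++;
            } else {
                order[k++] = a.length + j++;
            }
        }
        // At most one of the two runs still has documents left.
        while (i < a.length) {
            order[k++] = i++;
        }
        while (j < b.length) {
            order[k++] = a.length + j++;
        }
        return order;
    }

    public static void main(String[] args) {
        long[] segA = {1, 4, 7, 10};
        long[] segB = {2, 3, 8, 9};
        // prints [0, 4, 5, 1, 2, 6, 7, 3]
        System.out.println(Arrays.toString(mergeOrder(segA, segB)));
    }
}
```

Each key is compared at most once per output position, so the cost is linear in the total number of documents even when the two segments are highly interspersed, which is the case where galloping gives no benefit.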

Re: Merging ordered segments without re-sorting.

2013-10-23 Thread Shai Erera
SortingAtomicReader uses the TimSort algorithm, which performs well when the two segments are already sorted. Anyway, that's the way to do it, even if it looks like it does more work than it should.
Shai
On Wed, Oct 23, 2013 at 10:46 PM, Arvind Kalyan wrote:
> Thanks, my understanding is that

Re: Merging ordered segments without re-sorting.

2013-10-23 Thread Arvind Kalyan
Thanks, my understanding is that SortingMergePolicy performs sorting after wrapping the 2 segments, correct? As I mentioned in my original email I would like to avoid the re-sorting and exploit the fact that the input segments are already sorted.
On Wed, Oct 23, 2013 at 11:02 AM, Shai Erera wr

Re: Merging ordered segments without re-sorting.

2013-10-23 Thread Shai Erera
Hi
You can use SortingMergePolicy and SortingAtomicReader to achieve that. You can read more about index sorting here: http://shaierera.blogspot.com/2013/04/index-sorting-with-lucene.html
Shai
On Wed, Oct 23, 2013 at 8:13 PM, Arvind Kalyan wrote:
> Hi there, I'm looking for pointers, suggesti

Merging ordered segments without re-sorting.

2013-10-23 Thread Arvind Kalyan
Hi there, I'm looking for pointers, suggestions on how to approach this in Lucene 4.5. Say I am creating an index using a sequence of addDocument() calls and end up with segments that each contain documents in a specified ordering. It is guaranteed that there won't be updates/deletes/reads etc hap