Hi,
On Thu, Oct 24, 2013 at 12:20 AM, Arvind Kalyan wrote:
> I will benchmark the available approach itself then, in that case. Will
> report back if the performance is unacceptable.
For the record, last time I checked, indexing was 2x slower on average
on a 10M document collection (see
https://
On Wed, Oct 23, 2013 at 2:45 PM, Adrien Grand wrote:
> Hi,
>
> On Wed, Oct 23, 2013 at 10:19 PM, Arvind Kalyan wrote:
> > Sorting is not an option for our case so we will most likely implement a
> > variant that merges the segments in one pass. Using TimSort is great but in
> > our case the 2
Hi,
On Wed, Oct 23, 2013 at 10:19 PM, Arvind Kalyan wrote:
> Sorting is not an option for our case so we will most likely implement a
> variant that merges the segments in one pass. Using TimSort is great but in
> our case the 2 segments will be highly interspersed and would not benefit
> from th
Thanks again.
Sorting is not an option for our case so we will most likely implement a
variant that merges the segments in one pass. Using TimSort is great but in
our case the 2 segments will be highly interspersed and would not benefit
from the galloping in TimSort.
In addition, if anyone else
SortingAtomicReader uses the TimSort algorithm, which performs well when
the two segments are already sorted.
Anyway, that's the way to do it, even if it looks like it does more work
than it should.
Shai
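
A rough way to see how much work TimSort does on two pre-sorted runs is to
count comparisons. The sketch below is an illustration only: it uses the JDK's
Arrays.sort (TimSort for object arrays) as a stand-in, not Lucene's own sorter,
and the class and method names are hypothetical.

```java
import java.util.Arrays;

final class TimSortComparisons {
    /** Builds an ascending run: start, start + step, ... (n values). */
    static Integer[] range(int start, int step, int n) {
        Integer[] run = new Integer[n];
        for (int i = 0; i < n; i++) {
            run[i] = start + i * step;
        }
        return run;
    }

    /** Sorts the concatenation of two runs and returns the comparisons used. */
    static long countComparisons(Integer[] runA, Integer[] runB) {
        Integer[] all = new Integer[runA.length + runB.length];
        System.arraycopy(runA, 0, all, 0, runA.length);
        System.arraycopy(runB, 0, all, runA.length, runB.length);
        final long[] count = {0};
        // Arrays.sort on an object array with a comparator uses the JDK's TimSort.
        Arrays.sort(all, (x, y) -> {
            count[0]++;
            return Integer.compare(x, y);
        });
        return count[0];
    }

    public static void main(String[] args) {
        int n = 1000;
        // Every key in the first run is smaller than every key in the second.
        long disjoint = countComparisons(range(0, 1, n), range(n, 1, n));
        // Keys alternate between the runs (evens vs. odds), so galloping cannot help.
        long interleaved = countComparisons(range(0, 2, n), range(1, 2, n));
        System.out.println("disjoint runs:    " + disjoint + " comparisons");
        System.out.println("interleaved runs: " + interleaved + " comparisons");
    }
}
```

Both cases should stay well below a full n log n sort, but the interleaved
case needs more comparisons than the disjoint one.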
On Wed, Oct 23, 2013 at 10:46 PM, Arvind Kalyan wrote:
> Thanks, my understanding is that
Thanks, my understanding is that SortingMergePolicy performs sorting after
wrapping the 2 segments, correct?
As I mentioned in my original email, I would like to avoid the re-sorting
and exploit the fact that the input segments are already sorted.
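
To make the idea concrete, here is a minimal sketch of a one-pass merge over
two inputs that are already sorted. It is not Lucene code: it works on plain
arrays of sort keys, and the class name, method name, and the encoding of the
result are assumptions chosen for illustration.

```java
import java.util.Arrays;

final class OnePassMerge {
    /**
     * Merges two ascending key arrays in a single pass. For each slot of the
     * merged order the result names the source document: values >= 0 are doc
     * IDs in segment A, negative values encode doc IDs in segment B as
     * -(docId + 1).
     */
    static int[] mergeOrder(long[] keysA, long[] keysB) {
        int[] order = new int[keysA.length + keysB.length];
        int i = 0, j = 0, out = 0;
        while (i < keysA.length && j < keysB.length) {
            // <= keeps the merge stable: on ties, segment A goes first.
            if (keysA[i] <= keysB[j]) {
                order[out++] = i++;
            } else {
                order[out++] = -(j++ + 1);
            }
        }
        // At most one of the inputs still has documents left; copy them over.
        while (i < keysA.length) {
            order[out++] = i++;
        }
        while (j < keysB.length) {
            order[out++] = -(j++ + 1);
        }
        return order;
    }

    public static void main(String[] args) {
        long[] a = {1, 4, 9};
        long[] b = {2, 3, 10};
        System.out.println(Arrays.toString(mergeOrder(a, b)));
    }
}
```

Each key is looked at once, so the cost is linear in the total number of
documents regardless of how interspersed the two inputs are.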
On Wed, Oct 23, 2013 at 11:02 AM, Shai Erera wr
Hi
You can use SortingMergePolicy and SortingAtomicReader to achieve that. You
can read more about index sorting here:
http://shaierera.blogspot.com/2013/04/index-sorting-with-lucene.html
Shai
On Wed, Oct 23, 2013 at 8:13 PM, Arvind Kalyan wrote:
> Hi there, I'm looking for pointers, suggesti
Hi there, I'm looking for pointers and suggestions on how to approach this in
Lucene 4.5.
Say I am creating an index using a sequence of addDocument() calls and end
up with segments that each contain documents in a specified ordering. It is
guaranteed that there won't be updates/deletes/reads etc hap