Re: a faster way to addDocument and get the ID just added?

Trejkaz Wed, 30 Mar 2011 22:35:07 -0700

On Wed, Mar 30, 2011 at 8:21 PM, Simon Willnauer
<simon.willna...@googlemail.com> wrote:
> Before trunk (and I think
> its in 3.1 also) merge only merged continuous segments so the actual
> per-segment ID might change but the global document ID doesn't if you
> only add documents. But this should not be considered a feature. In
> upcoming version this does not work anymore since merges can now be
> non-continuous.


This myth was busted some time ago:
https://issues.apache.org/jira/browse/LUCENE-2506?#comment-12935973

Summary: selecting segments to merge is decided by MergePolicy, and a
MergePolicy which does not upset ordering will be remain in existence.

> Anyway, I strongly discourage to rely on lucene document IDs you
> should not do this at all. Can't you use your own ID mechanism?

This has pretty much already been covered in my reply to the previous
person that suggested that solution, not to mention in the initial
email which started the thread.

Summary: the overheads are simply not acceptable.

So far the only remotely helpful suggestion I have heard anywhere is
to keep two gigantic int[] arrays in memory, mapping the IDs in each
direction.  This would work if we had an infinite amount of memory to
play with, but unfortunately we don't.  1 billion item indexes are
expected to work, and we can't just tell everyone to buy 8 GB more RAM
when we update to the next version of our app.  If we were a
server-side app, *maybe* we could...

TX

---------------------------------------------------------------------
To unsubscribe, e-mail: java-user-unsubscr...@lucene.apache.org
For additional commands, e-mail: java-user-h...@lucene.apache.org

Re: a faster way to addDocument and get the ID just added?

Reply via email to