Re: How to not overwrite a Document if it 'already exists'?

Antony Bowesman Tue, 05 May 2009 18:34:00 -0700

Thanks for that info.  These indexes will be large, in the tens of millions.
The id field is unique and is 29 bytes.  I guess that's still a lot of data to
trawl through to get to the term.


> Have you tested how long it takes to look up docs from your id?

Not in indexes that size in a live environment, as I don't have the hardware to make those sorts of tests :( although I know that, in general, lookup is fast.

> Couldn't you just give the base & full docs different ids?  Then you
> can independently choose which one to update?

I considered that, but the normal case will not need to worry about this scenario.

There is only ever one instance of a mail Doc, whether it is a root mail or part of a forward chain, and a root mail can of course become part of a forward chain at some point. So it should be optimal to just fetch the one Document for the mail Id, without first trying the true Id and then some pseudo Id if it isn't found.

Unfortunately, I'm having to solve this problem in my Lucene app, as the tool that's generating this data is unable to know what has or has not been handled previously.

I'm implementing it using the IndexReader approach for now and will try to get some performance data, so thanks for your comments, Mike.
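
For anyone following the thread, here is a minimal sketch of the "look up the id first, only add if it is missing" logic. It is self-contained and uses an in-memory map as a stand-in for the index, so the names MailIndex, InMemoryIndex and addIfAbsent are hypothetical and not part of Lucene. In a real Lucene app the containsId check would presumably be something like reader.docFreq(new Term("id", mailId)) > 0 on an IndexReader; that wiring is an assumption, and docFreq may still count deleted documents until segments are merged.

```java
import java.util.HashMap;
import java.util.Map;

class AddIfAbsentSketch {

    /** Hypothetical stand-in for the index: only the two operations the check needs. */
    interface MailIndex {
        // Assumption: with Lucene this would be a term lookup on the unique id field.
        boolean containsId(String mailId);

        void add(String mailId, String body);
    }

    /** In-memory implementation so the sketch runs without Lucene on the classpath. */
    static final class InMemoryIndex implements MailIndex {
        private final Map<String, String> docs = new HashMap<>();

        public boolean containsId(String mailId) {
            return docs.containsKey(mailId);
        }

        public void add(String mailId, String body) {
            docs.put(mailId, body);
        }

        String get(String mailId) {
            return docs.get(mailId);
        }
    }

    /**
     * Adds the document only if no document with the same id is already indexed.
     * Returns true if it was added, false if an existing document was left untouched.
     */
    static boolean addIfAbsent(MailIndex index, String mailId, String body) {
        if (index.containsId(mailId)) {
            return false; // already exists: do not overwrite
        }
        index.add(mailId, body);
        return true;
    }

    public static void main(String[] args) {
        InMemoryIndex index = new InMemoryIndex();
        System.out.println(addIfAbsent(index, "mail-1", "full document")); // true
        System.out.println(addIfAbsent(index, "mail-1", "base document")); // false
        System.out.println(index.get("mail-1"));                           // full document
    }
}
```

The check and the add are two separate steps, so this only holds if a single thread (or some external lock) owns the writes for a given id.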


Antony








