I suspect app-controlled docID will be a challenge, but I haven't thought it through much.
One possible solution might be to use joins, either at index time or at query time: make one document that holds the big text field that never changes, and a separate document that holds all the little fields that change frequently, joined by a common field. Then you can freely update the little fields without reindexing the big field.

Mike McCandless

http://blog.mikemccandless.com

On Thu, Oct 25, 2012 at 6:10 AM, Ravikumar Govindarajan <ravikumar.govindara...@gmail.com> wrote:

> We have the need to re-index some fields in our application frequently.
>
> Our typical document consists of
>
> a) Many single-valued {long/int} re-indexable fields
> b) Few large-valued {text/string} static fields
>
> We have to re-index an entire document if a single smallish field changes
> and it is turning out to be a problem for us. I have gone through the
> https://issues.apache.org/jira/browse/LUCENE-3837 proposal where it tries
> to work-around this limitation using a secondary mapping of new-old docids.
>
> As I understand, lucene strictly maintains internal doc-id order so that
> many queries that depend on it, will work correctly. Segment merges will
> also maintain order as well as reclaim deleted doc-ids
>
> There should be many applications like us, which manage index shards
> limiting a given shard based on doc-id limits or size. So reclaiming
> deleted doc-ids is mostly a non-issue for us.
>
> That leaves us with changing doc-ids. How about leaving open the doc-ids
> themselves to the applications, at-least as an option to the needy? Taking
> such an approach might inter-leave doc-ids across segments, but within a
> segment, the docIds are always in increasing order. There are possibilities
> of ghost-deletes, duplicate docIds etc..., but all should be solvable, I
> believe.
>
> Fronting these doc-ids during search from all segment readers and returning
> the correct value from one of them should be easy. Will it incur a heavy
> penalty during search? Another advantage gained, is the triviality of
> cross-joining indexes when docIDs are fixed.
>
> There must be many other places where an app supplied docId might make
> lucene behave funny. Need some help in identifying those areas at least for
> understanding this problem correctly, if not solving it all together.
>
> --
> Ravi
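
To make the join suggestion in the reply a bit more concrete, here is a minimal sketch in plain Java. It is only a model of the idea, not Lucene API: the class SplitDocStore, its methods, and the field name "rank" are all hypothetical, and two in-memory maps stand in for the "big" and "small" documents that share a join key. It is meant to show that updating a small field never rewrites the large text, and that a search on a small field can still get back to the large text through the common key.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Plain-Java model of the split-document idea; hypothetical names, not Lucene API. */
final class SplitDocStore {
    // "Big" documents: join key -> large static text, written once.
    private final Map<String, String> bigDocs = new HashMap<>();
    // "Small" documents: join key -> frequently changing numeric fields.
    private final Map<String, Map<String, Long>> smallDocs = new HashMap<>();
    // Counts how many times a large text value was written.
    private int bigWrites = 0;

    /** Adds both halves of a logical document under the same join key. */
    void addDocument(String key, String bigText, Map<String, Long> fields) {
        bigDocs.put(key, bigText);
        bigWrites++;
        smallDocs.put(key, new HashMap<>(fields));
    }

    /** Changes one small field; the big text is not touched. */
    void updateSmallField(String key, String field, long value) {
        Map<String, Long> fields = smallDocs.get(key);
        if (fields == null) {
            throw new IllegalArgumentException("unknown key: " + key);
        }
        fields.put(field, value);
    }

    /** Query-time join: match on a small field, then fetch the big text by key. */
    List<String> searchBigTextBySmallField(String field, long value) {
        List<String> hits = new ArrayList<>();
        for (Map.Entry<String, Map<String, Long>> e : smallDocs.entrySet()) {
            Long v = e.getValue().get(field);
            if (v != null && v == value) {
                hits.add(bigDocs.get(e.getKey()));
            }
        }
        return hits;
    }

    int bigWrites() {
        return bigWrites;
    }

    public static void main(String[] args) {
        SplitDocStore store = new SplitDocStore();
        store.addDocument("doc-1", "a very large, static body of text",
                Collections.singletonMap("rank", 1L));
        store.updateSmallField("doc-1", "rank", 7L);
        System.out.println(store.searchBigTextBySmallField("rank", 7L));
        System.out.println("big writes: " + store.bigWrites());
    }
}
```

In a real index the two maps would be two kinds of documents carrying the same join field, and the lookup by key would be done by whichever join mechanism is chosen; the sketch only illustrates why the large field no longer has to be reindexed when a small one changes.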