Hi, I've been thinking about how to update a single field of a document without touching its other fields. This is an old problem and I was considering a solution along the lines of Andrzej Bialecki's post to the dev list back in '07:
<quote http://markmail.org/message/tbkgmnilhvrt6bii > I have the following scenario: I want to use ParallelReader to maintain parts of the index that are changing quickly, and where changes are limited to specific fields only. Let's say I have a "main" index (many fields, slowly changing, large updates), and an "aux" index (fast changing, usually single doc and single field updates). I'd like to "replace" documents in the "aux" index - that is, delete one doc and add another - but in a way that doesn't change the internal document numbers, so that I can keep the mapping required by ParallelReader intact. I think this is possible to achieve by using a FilterIndexReader, which keeps a map of updated documents, and re-maps old doc ids to the new ones on the fly. >From time to time I'd like to optimize the "aux" index to get rid of deleted docs. At this time I need to figure out how to preserve the old->new mapping during the optimization. So, here's the question: is this scenario feasible? If so, then in the trunk/ version of Lucene, is there any way to figure out (predictably) how internal document numbers are reassigned after calling optimize() ? </quote> Reading that trail, I wish the original poster gave up on his idea ( http://markmail.org/message/tbkgmnilhvrt6bii#query:+page:1+mid:kn77zpiu43kd2ufn+state:results ) <quote> Thanks for the input - for now I gave up on this, after discovering that I would have no way to ensure in TermDocs.skipTo() that document id-s are monotonically increasing (which seems to be a part of the contract). </quote> I imagine if Andrzej's proposed FilterIndexReader maintains 2 sorted (ordered) maps, one from internal document-ids to "view" document-ids, and another mapping from "view" document-ids to internal document-ids, then things like skipTo() can be implemented reasonably efficiently. Only the mapped ids are maintained in these structures. (Also note that a mapped "view" document-id represents an internally deleted document with that id.) And if we can find a way to merge the segments of this "aux" index along whenever the segments of its associated "main" index are merged or optimized (such that the [internal] doc-ids in the merged aux index end up getting sync'ed with those of the trunk), then there shouldn't be all that many doc-ids to map anyway (if we merge frequently enough). So to go back Andrzej's question: is there any way to figure out (predictably) how internal document numbers [in the main index] are reassigned after calling optimize() ? How does LUCENE-847, as Doug Cutting suggests in that trail, help? Sorry if that was long winded, had to start somewhere ;) -Babak --------------------------------------------------------------------- To unsubscribe, e-mail: java-user-unsubscr...@lucene.apache.org For additional commands, e-mail: java-user-h...@lucene.apache.org