I am writing several temporary batches of documents to separate indices and
periodically merging those batches into a set of master indices. I'm using
IndexWriter#addIndexesNoOptimize(), but the problem is that the master
may already contain an entry for a given document, so I end up with a duplicate.
Duplicates are prevented in the temporary index because, when adding Documents,
I call IndexWriter#deleteDocuments(Term) with my UID before I add the Document.
I have two choices:
a) merge the indexes, then clean up any duplicates in the master (or vice
versa). IndexWriter.deleteDocuments(Term[]) would probably suit here, given
the UIDs of all the incoming documents.
b) iterate through the Documents in the temporary index and add them to the
master one at a time.
(b) sounds worse: it seems an IndexWriter's Analyzer cannot be null, and I guess
there's a penalty in reassembling each Document from the reader.
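
To make the "or vice versa" in (a) concrete, here is a minimal sketch of the two
possible orderings. It models the indices with plain Java lists rather than
calling Lucene, so Doc, uidsOf, deleteByUid and both merge methods are
hypothetical stand-ins, not Lucene API. It assumes that a delete by UID removes
every document carrying that UID, including ones that have just been merged in.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

class MergeOrderSketch {

    /** Hypothetical stand-in for an indexed document: a UID plus some content. */
    static final class Doc {
        final String uid;
        final String body;
        Doc(String uid, String body) { this.uid = uid; this.body = body; }
    }

    /** Collects the UIDs present in an index (here: the temporary batch). */
    static Set<String> uidsOf(List<Doc> index) {
        Set<String> uids = new HashSet<>();
        for (Doc d : index) {
            uids.add(d.uid);
        }
        return uids;
    }

    /** Models a delete by UID terms: drops every doc whose UID is in the set. */
    static void deleteByUid(List<Doc> index, Set<String> uids) {
        index.removeIf(d -> uids.contains(d.uid));
    }

    /** Delete first: purge the incoming UIDs from the master, then append the batch. */
    static List<Doc> deleteThenMerge(List<Doc> master, List<Doc> temp) {
        List<Doc> result = new ArrayList<>(master);
        deleteByUid(result, uidsOf(temp));
        result.addAll(temp); // models merging the temporary index into the master
        return result;
    }

    /** Merge first: append the batch, then delete by UID. */
    static List<Doc> mergeThenDelete(List<Doc> master, List<Doc> temp) {
        List<Doc> result = new ArrayList<>(master);
        result.addAll(temp);
        deleteByUid(result, uidsOf(temp)); // removes the old AND the new copies
        return result;
    }

    public static void main(String[] args) {
        List<Doc> master = new ArrayList<>();
        master.add(new Doc("u1", "old"));
        master.add(new Doc("u2", "untouched"));

        List<Doc> temp = new ArrayList<>();
        temp.add(new Doc("u1", "new"));
        temp.add(new Doc("u3", "added"));

        System.out.println(deleteThenMerge(master, temp).size()); // 3
        System.out.println(mergeThenDelete(master, temp).size()); // 1
    }
}
```

Under that assumption, only the delete-then-merge order keeps the new copies, so
for (a) the delete of the incoming UIDs would have to be applied to the master
before the merge.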
Any views?
Antony