I am indexing several temporary batches into separate indices and will periodically merge those batches into a set of master indices. I'm using IndexWriter#addIndexesNoOptimize(), but the problem is that the master may already contain an entry for a given document, so I end up with a duplicate.

Duplicates are prevented in the temporary index because, before adding each Document, I call IndexWriter#deleteDocuments(Term) with its UID.

I have two choices:

a) Merge the indexes, then clean up any duplicates in the master (or vice versa). IndexWriter.deleteDocuments(Term[]) would probably suit here, given the UIDs of all the incoming documents.

b) Iterate through the Documents in the temporary index and add them to the master.

(b) sounds worse, as it seems an IndexWriter's Analyzer cannot be null, and I guess there's a penalty in reassembling each Document from the reader.
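
To make the ordering in (a) concrete, here is a minimal sketch that uses plain Java collections as a stand-in for the indexes. It is not Lucene code: Doc, mergeBatch and the class name are made up for illustration. The removeIf step plays the role of deleteDocuments(Term[]) on the master, and addAll plays the role of the merge. The sketch deletes first and merges second, on the assumption that deleting by UID after the merge would also remove the freshly merged copies, since a delete by Term matches every document carrying that UID.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

public class MergeWithoutDuplicates {

    /** Hypothetical stand-in for a stored Document: a UID plus some content. */
    static final class Doc {
        final String uid;
        final String body;

        Doc(String uid, String body) {
            this.uid = uid;
            this.body = body;
        }
    }

    /**
     * Option (a), delete-then-merge: remove from the master every UID that
     * the batch is about to supply, then append the whole batch.
     */
    static void mergeBatch(List<Doc> master, List<Doc> batch) {
        // Collect the UIDs of all incoming documents.
        Set<String> incoming = new HashSet<>();
        for (Doc d : batch) {
            incoming.add(d.uid);
        }
        // Step 1: drop stale copies from the master (models the delete by UID).
        master.removeIf(d -> incoming.contains(d.uid));
        // Step 2: append the batch (models merging the temporary index).
        master.addAll(batch);
    }

    public static void main(String[] args) {
        List<Doc> master = new ArrayList<>();
        master.add(new Doc("1", "old one"));
        master.add(new Doc("2", "old two"));

        List<Doc> batch = new ArrayList<>();
        batch.add(new Doc("2", "new two"));
        batch.add(new Doc("3", "new three"));

        mergeBatch(master, batch);

        // Each UID now appears exactly once; UID 2 carries the batch's content.
        for (Doc d : master) {
            System.out.println(d.uid + " -> " + d.body);
        }
    }
}
```

In the real thing, the set of incoming UIDs would have to be read from the temporary index (or recorded while building it) before the delete on the master is issued.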

Any views?
Antony






