I think you can get MUCH better efficiency by using TermEnum/TermDocs. But I think you need to index your primary key (UN_TOKENIZED) for that to work. I'm not certain of this, and it'd be worth trying, but I've always assumed that TermEnums only work on indexed fields, so I'd be surprised if TermEnum worked with un-indexed data.....
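Anyway, your loop looks more like this...

    TermEnum terms = reader.terms(new Term("primarykey", ""));
    TermDocs tDocs = reader.termDocs();
    do {
        Term t = terms.term();
        // stop at the end of the enumeration or once past the primarykey field
        if (t == null || !"primarykey".equals(t.field())) break;
        if (docsToUpdate.contains(t.text())) {
            tDocs.seek(t);
            if (tDocs.next()) {
                // tDocs.doc() is the id of the existing document; newDoc is a
                // placeholder for the replacement Document you build for this key
                writer.updateDocument(t, newDoc);
            }
        }
    } while (terms.next());

NOTE: reader.terms(Term) returns an enumeration already positioned on its first term, which is why this is a do/while rather than while (terms.next()), which would skip that term. I've still been fast and loose with the other edge conditions, so caveat emptor....

This loop also assumes that there is one and only one document in your index with a given primary key. Otherwise, you have to do some more work with the TermDocs class to process each document that has that primary key...

This is similar to creating Lucene filters, which is very fast....

Hope this helps
Erick

On 2/21/07, no spam <[EMAIL PROTECTED]> wrote: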
I have an index where I'm storing the primary key of my database record as an unindexed field. Nightly I want to update my search index with any database changes / additions. I don't really see an efficient way to update these records besides doing something like this, which I'm worried will thrash the index. Is this approach good/bad/ugly?

Thanks,
Mark

    IndexReader reader;
    ArrayList docsToUpdate;

    for (int i = 0; i < reader.maxDoc(); i++) {
        Document doc = reader.document(i);
        if (doc != null) {
            String primaryKey = doc.get("id");
            if (docsToUpdate.contains(primaryKey)) {
                // set fields
                writer.updateDocument(doc);
            }
        }
    }

    // for all docs not found in index
    for (DBObject o : docsToUpdate) {
        if (o.syncedWithIndex() == false) {
            // create new doc
            Document doc = ....; // this is a new doc
            writer.addDocument(doc);
        }
    }
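
For reference, the term walk suggested in the reply at the top of this thread can be modelled in plain Java, without Lucene on the classpath. This is only a rough sketch under stated assumptions: the sorted map stands in for the term dictionary of an indexed, un-tokenized primarykey field, and every name in it (TermWalkSketch, SyncPlan, planSync, postings, changedKeys) is hypothetical, not Lucene API. It shows one pass over the terms yielding both the document ids to rewrite and the keys that still need a new document, including the case where one key maps to more than one document.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.SortedMap;
import java.util.TreeMap;
import java.util.TreeSet;

/** Plain-Java model of the term walk; none of these names are Lucene API. */
final class TermWalkSketch {

    /** Outcome of one nightly pass over the primary-key terms. */
    static final class SyncPlan {
        /** Ids of existing documents whose key appears in the change set. */
        final List<Integer> docIdsToUpdate = new ArrayList<>();
        /** Changed keys with no document yet; these need a plain add. */
        final Set<String> keysToAdd;

        SyncPlan(Set<String> changedKeys) {
            this.keysToAdd = new TreeSet<>(changedKeys);
        }
    }

    /**
     * Walks the terms in sorted order, the way a term enumeration would.
     *
     * @param postings    assumed stand-in for the term dictionary:
     *                    primary-key text -> ids of documents containing it
     * @param changedKeys primary keys the database reports as changed or new
     */
    static SyncPlan planSync(SortedMap<String, List<Integer>> postings,
                             Set<String> changedKeys) {
        SyncPlan plan = new SyncPlan(changedKeys);
        for (Map.Entry<String, List<Integer>> term : postings.entrySet()) {
            // Hash lookup per term; no stored document is ever loaded.
            if (changedKeys.contains(term.getKey())) {
                // Collect every document carrying this key, not just the first.
                plan.docIdsToUpdate.addAll(term.getValue());
                plan.keysToAdd.remove(term.getKey());
            }
        }
        return plan;
    }

    public static void main(String[] args) {
        SortedMap<String, List<Integer>> postings = new TreeMap<>();
        postings.put("1001", Arrays.asList(0));
        postings.put("1002", Arrays.asList(1));
        postings.put("1003", Arrays.asList(2, 5)); // same key on two documents

        Set<String> changed = new HashSet<>(Arrays.asList("1003", "1002", "2000"));

        SyncPlan plan = planSync(postings, changed);
        System.out.println("update docs: " + plan.docIdsToUpdate); // update docs: [1, 2, 5]
        System.out.println("add keys: " + plan.keysToAdd);         // add keys: [2000]
    }
}
```

Keeping the changed keys in a hash set rather than a list makes each per-term check a constant-time lookup, so the cost of the pass is driven by the number of terms rather than by loading every stored document.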