Re: updating index

no spam Sat, 24 Feb 2007 19:29:27 -0800

I didn't fully understand your last post, or why I would want to call
IndexReader.terms() and then IndexReader.termDocs().  Won't something like
this work?


       for (Business biz : updates)
       {
           Term t = new Term("id", biz.getId()+"");
           TermDocs tDocs = reader.termDocs(t);

           while (tDocs.next())
           {
               Document doc = reader.document(tDocs.doc());
           }
       }

But tDocs never contains any docs.   Is this because I've indexed my pk like
this:

doc.add(new Field("id", biz.getId(), Field.Store.YES, Field.Index.NO));

instead of

doc.add(new Field("id", biz.getId(), Field.Store.YES,
Field.Index.UN_TOKENIZED));
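
To make the question concrete, here is a toy model in plain Java of the difference I suspect matters. It is not Lucene code: ToyIndex and all of its methods are made-up names, and it only assumes that a term lookup consults the indexed terms and never the stored values. Under that assumption, a field that is stored but not indexed can be read back from a document but can never be found by term.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Toy model only: this is NOT Lucene, just an illustration of the idea.
class ToyIndex {
    // Stored field values per document, retrievable by document number.
    private final List<Map<String, String>> stored = new ArrayList<>();
    // Postings: "field:value" -> document numbers. Only indexed fields land here.
    private final Map<String, List<Integer>> postings = new HashMap<>();

    // Adds a one-field document and returns its document number.
    int addDocument(String field, String value, boolean indexed) {
        int docId = stored.size();
        Map<String, String> doc = new HashMap<>();
        doc.put(field, value);
        stored.add(doc);
        if (indexed) {
            postings.computeIfAbsent(field + ":" + value, k -> new ArrayList<>()).add(docId);
        }
        return docId;
    }

    // Analogue of a term lookup: consults the postings only, never the stored values.
    List<Integer> termDocs(String field, String value) {
        return postings.getOrDefault(field + ":" + value, new ArrayList<>());
    }

    // Analogue of reading a stored field from a known document number.
    String storedValue(int docId, String field) {
        return stored.get(docId).get(field);
    }

    public static void main(String[] args) {
        ToyIndex idx = new ToyIndex();
        int a = idx.addDocument("id", "42", false); // stored, not indexed
        idx.addDocument("id", "43", true);          // stored and indexed as a single term
        System.out.println(idx.termDocs("id", "42").size()); // lookup by term
        System.out.println(idx.termDocs("id", "43").size());
        System.out.println(idx.storedValue(a, "id"));        // still readable once found
    }
}
```

If Lucene behaves like this toy, the first lookup coming back empty would match what I see with Field.Index.NO.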

Mark

On 2/21/07, Erick Erickson <[EMAIL PROTECTED]> wrote:


I think you can get MUCH better efficiency by using TermEnum/TermDocs. But I
think you need to index (UN_TOKENIZED) your primary key, although now I'm
not sure. I'd be surprised if TermEnum worked with un-indexed data. Still,
it'd be worth trying, but I've always assumed that TermEnums only worked on
indexed fields.

Anyway, your loop looks more like this...

TermEnum terms = reader.terms(new Term("primarykey", ""));
TermDocs tDocs = reader.termDocs();

while (terms.next()) {
   if (docsToUpdate.contains(terms.term().text())) {
       tDocs.seek(terms.term());
       writer.updateDocument(tDocs.doc());
   }
}

NOTE: I've been fast and loose with edge conditions, like ensuring that
while (terms.next()) doesn't skip the first term, so caveat emptor.... This
loop also assumes that there is one and only one document in your index with
the primary key. Otherwise, you have to do some more work with the TermDocs
class to process each document that has your primary key...
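
To illustrate the first-term caveat, here is a small stand-alone sketch in plain Java. It is not Lucene code: ToyTermCursor is a hypothetical class that only assumes the enumeration starts out already positioned on its first term, which is the case where calling next() before reading anything drops that term.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Toy stand-in for a term enumeration that starts out positioned ON its
// first entry (hypothetical class, not part of Lucene).
class ToyTermCursor {
    private final List<String> terms;
    private int pos = 0; // already pointing at the first term

    ToyTermCursor(List<String> sortedTerms) {
        this.terms = sortedTerms;
    }

    // Current term, or null once the cursor has run off the end.
    String term() {
        return pos < terms.size() ? terms.get(pos) : null;
    }

    // Advances the cursor; returns false when no terms remain.
    boolean next() {
        pos++;
        return pos < terms.size();
    }

    // Risky pattern: calling next() first silently drops the first term.
    static List<String> visitWithWhile(ToyTermCursor c) {
        List<String> seen = new ArrayList<>();
        while (c.next()) {
            seen.add(c.term());
        }
        return seen;
    }

    // Safer pattern: handle the current term, then advance.
    static List<String> visitWithDoWhile(ToyTermCursor c) {
        List<String> seen = new ArrayList<>();
        if (c.term() == null) {
            return seen; // empty enumeration, nothing to visit
        }
        do {
            seen.add(c.term());
        } while (c.next());
        return seen;
    }

    public static void main(String[] args) {
        List<String> keys = Arrays.asList("1", "2", "3");
        System.out.println(visitWithWhile(new ToyTermCursor(keys)));
        System.out.println(visitWithDoWhile(new ToyTermCursor(keys)));
    }
}
```

Whether the real loop needs the do/while form depends on how the enumeration you get back is positioned, so check that before copying either pattern.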

This is similar to creating Lucene filters, which is very fast....

Hope this helps
Erick
