Yes, I'm pretty sure you have to index the field (UN_TOKENIZED) to be able
to fetch it with TermDocs/TermEnum. The loop I posted works like this:
for each term in the index for the field
    if this is one I want to update
        use a TermDocs to get to that document and operate on it.
But this is actually pretty silly. Your loop uses a better approach, except
you're not using TermDocs correctly. Try
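// reader is your open IndexReader; termDocs() returns an unpositioned TermDocs
TermDocs tDocs = reader.termDocs();
for (Business biz : updates)
{
    Term t = new Term("id", biz.getId());
    tDocs.seek(t); // point the TermDocs at this term
    while (tDocs.next())
    {
        Document doc = reader.document(tDocs.doc());
        // operate on doc here
    }
}
tDocs.close();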
But TermDocs/TermEnum is looking at terms in the index. If you haven't
indexed the term, you won't find it, so your Field.Index.NO is really
hurting you here.
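To make that concrete, here is a toy sketch in plain Java. It is NOT Lucene code: ToyIndex and its methods are made up for illustration, and it only mimics the split between stored values and the term dictionary. A field that is stored but not indexed comes back when you load the document, but a term lookup never finds it.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.TreeSet;

// Toy model only (hypothetical names, not the Lucene API).
class ToyIndex {
    // Stored values per document: roughly what loading a document gives back.
    private final List<Map<String, String>> stored = new ArrayList<>();
    // Term dictionary: "field:value" -> doc numbers. Only indexed fields land here.
    private final Map<String, TreeSet<Integer>> postings = new HashMap<>();

    // Adds a one-field document and returns its doc number.
    int addDocument(String field, String value, boolean store, boolean index) {
        int docId = stored.size();
        Map<String, String> doc = new HashMap<>();
        if (store) {
            doc.put(field, value);
        }
        stored.add(doc);
        if (index) {
            postings.computeIfAbsent(field + ":" + value, k -> new TreeSet<>()).add(docId);
        }
        return docId;
    }

    // Rough analogue of walking a TermDocs for one term.
    List<Integer> termDocs(String field, String value) {
        List<Integer> result = new ArrayList<>();
        TreeSet<Integer> hits = postings.get(field + ":" + value);
        if (hits != null) {
            result.addAll(hits);
        }
        return result;
    }

    // Rough analogue of reading a stored field from a loaded document.
    String storedValue(int docId, String field) {
        return stored.get(docId).get(field);
    }

    public static void main(String[] args) {
        ToyIndex idx = new ToyIndex();
        int a = idx.addDocument("id", "42", true, false); // like Store.YES, Index.NO
        int b = idx.addDocument("id", "43", true, true);  // like Store.YES, Index.UN_TOKENIZED
        System.out.println(idx.termDocs("id", "42").isEmpty()); // term lookup misses doc a
        System.out.println(idx.termDocs("id", "43").contains(b)); // term lookup finds doc b
        System.out.println(idx.storedValue(a, "id")); // the stored value is still there
    }
}
```

The first document is the Field.Index.NO case: its id is retrievable once you have the document, but there is no term to seek to.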
Best
Erick
On 2/24/07, no spam <[EMAIL PROTECTED]> wrote:
I didn't fully understand your last post and why I would want to do
IndexReader.terms() then IndexReader.termDocs(). Won't something like this
work?
for (Business biz : updates)
{
    Term t = new Term("id", biz.getId()+"");
    TermDocs tDocs = reader.termDocs(t);
    while (tDocs.next())
    {
        Document doc = reader.document(tDocs.doc());
    }
}
But tDocs never contains any docs. Is this because I've indexed my pk like
this:
doc.add(new Field("id", biz.getId(), Field.Store.YES, Field.Index.NO));
instead of
doc.add(new Field("id", biz.getId(), Field.Store.YES, Field.Index.UN_TOKENIZED));
Mark
On 2/21/07, Erick Erickson <[EMAIL PROTECTED]> wrote:
>
> I think you can get MUCH better efficiency by using TermEnum/TermDocs. But I
> think you need to index (UN_TOKENIZED) your primary key, although now I'm
> not sure. I'd be surprised if TermEnum worked with un-indexed data. Still,
> it'd be worth trying, but I've always assumed that TermEnums only worked on
> indexed fields.
>
> Anyway, your loop looks more like this...
>
> TermEnum terms = reader.terms(new Term("primarykey", ""));
> TermDocs tDocs = reader.termDocs();
>
> while (terms.next()) {
>     if (docsToUpdate.contains(terms.term().text())) {
>         tDocs.seek(terms.term());
>         if (tDocs.next()) {
>             // pseudocode: update the document numbered tDocs.doc()
>             writer.updateDocument(tDocs.doc());
>         }
>     }
> }
>
> NOTE: I've been fast and loose with edge conditions, like ensuring that
> while (terms.next()) doesn't skip the first term, so caveat emptor. This
> loop also assumes that there is one and only one document in your index with
> the primary key. Otherwise, you have to do some more work with the TermDocs
> class to process each document that has your primary key.
>
> This is similar to creating Lucene filters, which is very fast....
>
> Hope this helps
> Erick