Re: Obtaining the (indexed) terms in a field in a particular document

Erick Erickson Tue, 20 Mar 2007 11:08:32 -0800

Sorry, but you need the Lucene document ID, which you can get
either from a Hits or HitCollector, or by using TermDocs/TermEnum
on your unique id (my_id in your example).
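
A rough sketch of the TermDocs route, assuming the Lucene 2.x
IndexReader API and assuming my_id was indexed untokenized so the
term matches the whole identifier. The class and method names
(DocIdLookup, findDocId) are made up for illustration:

```java
import java.io.IOException;

import org.apache.lucene.index.IndexReader;
import org.apache.lucene.index.Term;
import org.apache.lucene.index.TermDocs;

public class DocIdLookup {

    /**
     * Returns the Lucene document number of the first document whose
     * "my_id" field contains exactly uniqueId, or -1 if there is none.
     * Assumes "my_id" was indexed as a single untokenized term.
     */
    public static int findDocId(IndexReader reader, String uniqueId)
            throws IOException {
        // Enumerate the documents that contain the term my_id:uniqueId.
        TermDocs termDocs = reader.termDocs(new Term("my_id", uniqueId));
        try {
            // A unique id should match at most one document.
            return termDocs.next() ? termDocs.doc() : -1;
        } finally {
            termDocs.close();
        }
    }
}
```

With the ID in hand, reader.document(docId).get("text") returns the
stored value, where "text" stands in for whatever your text field is
actually called.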


Erick

On 3/20/07, Erick Erickson <[EMAIL PROTECTED]> wrote:


You can do a document.get(field), *assuming* you have stored the data
(Field.Store.YES) at index time, although you may not get
stop words.

On 3/20/07, Donna L Gresh <[EMAIL PROTECTED]> wrote:
>
> My apologies if this is a simple question--
>
> How can I get all the (stemmed and stop words removed, etc.) terms in a
> particular field of a particular document?
>
> Suppose my documents each consist of two fields, one with the name "my_id"
> and a unique identifier, and the other being some text string consisting
> of a number of words.
> I'd like to get all the terms in the text string given the unique identifier.
>
> (My basic reason is to do a sort of document similarity between the text
> string and some other text string, doing a boolean query with
> a number of SHOULD clauses, if this makes sense; I'm open to
> suggestions of better ways to do this.)
>
> Donna L. Gresh
>
