Search Hit frequency and location

Sean O'Connor Thu, 16 Jun 2005 09:04:04 -0700

Hello,
    I am trying to find the right approach for determining the
frequency (and, with slightly lower priority, the location) of search
hits in a document. I am working through the online documentation and
the helpful "Lucene in Action" book. There are several examples and
explanations that seem close, but they are not quite what I am looking
for. Can anyone point me in the right direction?


    I have a set of queries, say 10000 different 'things' I want to
find. They range from single-word matches (e.g. Lucene) to prefix
queries (e.g. index*) to phrases (e.g. "Lucene in Action"; the stopword
is irrelevant, so effectively "Lucene Action"?). I will also be delving
into more advanced topics such as proximity, fuzzy and Snowball
queries. For the moment, though, I will stick with the first three I
mention, which I believe translate to TermQuery, PrefixQuery, and
PhraseQuery.

    How do I find how many hits occur in a document? I've seen the FAQ
entry:

    Is there a way to retrieve the original term positions during the
    search?
    Yes, see the Javadoc for IndexReader.termPositions().

    I'm probably missing the obvious here, but I assume this refers to
the analyzed terms (i.e. individual words, possibly transmogrified by
the analyzer). I further assume that this does not directly relate to
the results of a search for "Lucene in Action". Where do I find
information about the search hits? Have I skimmed over this part of the
API?
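
    To make concrete what I mean by frequency and location, here is a
minimal sketch in plain Java. It does not use the Lucene API at all:
HitCounter, analyze, termHits, prefixHits and phraseHits are names made
up for illustration, the stopword list is an assumption, and the
"analyzer" is just lowercasing plus splitting on non-alphanumerics.
Positions are counted after stopword removal, which is also an
assumption.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Locale;
import java.util.Set;

// Illustration only: a toy stand-in for what I would like Lucene to report.
final class HitCounter {
    // Hypothetical stopword list, not taken from any Lucene analyzer.
    private static final Set<String> STOPWORDS =
            new HashSet<>(Arrays.asList("a", "an", "in", "of", "the"));

    /** Lowercases, splits on non-alphanumerics, and drops stopwords. */
    static List<String> analyze(String text) {
        List<String> tokens = new ArrayList<>();
        for (String t : text.toLowerCase(Locale.ROOT).split("[^a-z0-9]+")) {
            if (!t.isEmpty() && !STOPWORDS.contains(t)) {
                tokens.add(t);
            }
        }
        return tokens;
    }

    /** Token positions equal to the term (the single-word case). */
    static List<Integer> termHits(List<String> doc, String term) {
        List<Integer> positions = new ArrayList<>();
        for (int i = 0; i < doc.size(); i++) {
            if (doc.get(i).equals(term)) {
                positions.add(i);
            }
        }
        return positions;
    }

    /** Token positions starting with the prefix (the index* case). */
    static List<Integer> prefixHits(List<String> doc, String prefix) {
        List<Integer> positions = new ArrayList<>();
        for (int i = 0; i < doc.size(); i++) {
            if (doc.get(i).startsWith(prefix)) {
                positions.add(i);
            }
        }
        return positions;
    }

    /** Start positions where the analyzed phrase occurs as adjacent tokens. */
    static List<Integer> phraseHits(List<String> doc, String phrase) {
        List<String> words = analyze(phrase);
        List<Integer> positions = new ArrayList<>();
        if (words.isEmpty()) {
            return positions;
        }
        for (int i = 0; i + words.size() <= doc.size(); i++) {
            if (doc.subList(i, i + words.size()).equals(words)) {
                positions.add(i);
            }
        }
        return positions;
    }

    public static void main(String[] args) {
        List<String> doc = analyze(
                "Lucene in Action covers Lucene indexing; the index is built by an IndexWriter.");
        // The size of each list is the frequency; the entries are the locations.
        System.out.println("term   lucene: " + termHits(doc, "lucene"));
        System.out.println("prefix index*: " + prefixHits(doc, "index"));
        System.out.println("phrase       : " + phraseHits(doc, "Lucene in Action"));
    }
}
```

    For each of my three query types, I would like to get the
equivalent of these position lists per document out of Lucene; the
size of a list is the hit frequency and its entries are the locations.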
Thanks in advance,

Sean
