Location of code which determines a Hit for PhraseQuery

Sean O'Connor Wed, 07 Sep 2005 20:16:11 -0700

Hi,

I am trying to work through the Hit collection process for a PhraseQuery (using an exact phrase). For an example search, say I'm looking for:

"lucene action"  (quotes indicating exact phrase)


in a one-doc, one-field index consisting of:
wow, lucene rocks, lucene action items are cool, very action packed


So my questions are:
- What does (Default)Similarity.idf() do?
   Just create a base frequency for a term in the query?

i.e. it creates something of a raw score, based on term (not phrase) frequency in a doc


- Does (Phrase)Weight do the work of finding a phrase match?
   I don't think so, but I get confused by the weight/scorer functionality

- Does ExactPhraseScorer do the work of finding a phrase match?
   I think so, but I seem to keep missing where

i.e. where does the first instance of "lucene" (lucene rocks) get skipped but the second one (lucene action) become a hit?


- What does SegmentTermPositions do?

I think this is critical to the PhrasePositions process, which PhraseScorer uses. I think it holds pointers to the positions in the index file streams (ability to read the indexes?)

- Where does the code identify a 'proper' hit for something like an exact phrase?
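
To make the questions concrete, here is how I currently picture the exact-phrase check, as a standalone Java sketch. It does not use any Lucene classes: ExactPhraseSketch, tokenize, positionsByTerm and phraseStarts are names I made up, and the tokenizer is only a crude stand-in for a real analyzer. The idea (my assumption about what the scorer does, not something I have confirmed in the source) is to subtract each term's offset within the phrase from its positions, and keep only the shifted positions that every term shares.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Locale;
import java.util.Map;
import java.util.SortedSet;
import java.util.TreeSet;

public class ExactPhraseSketch {

    // Crude stand-in for an analyzer: lowercase, split on non-alphanumerics.
    // A token's index in the returned list is its position in the field.
    static List<String> tokenize(String text) {
        List<String> tokens = new ArrayList<>();
        for (String t : text.toLowerCase(Locale.ROOT).split("[^a-z0-9]+")) {
            if (!t.isEmpty()) {
                tokens.add(t);
            }
        }
        return tokens;
    }

    // Toy "index" for a single field of a single doc: term -> positions.
    static Map<String, List<Integer>> positionsByTerm(List<String> tokens) {
        Map<String, List<Integer>> index = new HashMap<>();
        for (int pos = 0; pos < tokens.size(); pos++) {
            index.computeIfAbsent(tokens.get(pos), k -> new ArrayList<>()).add(pos);
        }
        return index;
    }

    // Shift each term's positions by its offset in the phrase, then intersect.
    // Whatever survives is a position where the whole phrase starts.
    static SortedSet<Integer> phraseStarts(Map<String, List<Integer>> index,
                                           List<String> phrase) {
        SortedSet<Integer> starts = new TreeSet<>();
        for (int offset = 0; offset < phrase.size(); offset++) {
            SortedSet<Integer> shifted = new TreeSet<>();
            List<Integer> positions =
                index.getOrDefault(phrase.get(offset), Collections.emptyList());
            for (int pos : positions) {
                shifted.add(pos - offset);
            }
            if (offset == 0) {
                starts.addAll(shifted);
            } else {
                starts.retainAll(shifted);
            }
        }
        return starts;
    }

    public static void main(String[] args) {
        List<String> doc = tokenize(
            "wow, lucene rocks, lucene action items are cool, very action packed");
        List<String> phrase = tokenize("lucene action");
        System.out.println(phraseStarts(positionsByTerm(doc), phrase)); // prints [3]
    }
}
```

With the sample text, "lucene" sits at positions 1 and 3 and "action" at 4 and 9. Shifting "action" by its offset of 1 gives 3 and 8, so only position 3 survives the intersection. In this model that is why "lucene rocks" is skipped and "lucene action" becomes the hit; what I can't find is where the real code does the equivalent step.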

Apologies in advance for the poor sample text above, and the repetition in question matter. Hopefully I am getting closer to getting my head wrapped around the query/hit process (and then work on extending the hits to include offset position).

Thanks

Sean