Getting matched words for PhraseQuery or SpanNearQuery

Jaco Tue, 28 Apr 2009 01:19:28 -0700

Hello,

I am pretty new to the Lucene API, and there's something I can't figure out
from the docs or the mailing list archives. I hope somebody can point me in
the right direction. Here's my case: for text analysis purposes I am running
PhraseQueries and SpanNearQueries. Using the highlighter, I can extract text
snippets with the matching words marked.


What I am really looking for is a way to extract information on each match of
the query, if possible including position information in the text. For
example, if the text I am searching in is [a b c a d e f a b], and my query is
[a b], then I want to know where the words [a b] were matched together in the
text by the PhraseQuery/SpanNearQuery ([a b] will get me two occurrences in
the document's text).
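
To make the desired result concrete, here is a minimal sketch in plain Java.
It does not use the Lucene API at all: the class name PhraseMatchSketch and
the method findMatches are made up for illustration, and it only handles an
exact phrase (no slop) over whitespace-separated tokens. It just shows the
kind of per-match position information I would like to get back.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Illustration only: hypothetical helper, not part of Lucene.
class PhraseMatchSketch {

    // Returns one {start, end} pair per exact occurrence of the phrase,
    // where start is inclusive and end is exclusive (token positions).
    static List<int[]> findMatches(List<String> tokens, List<String> phrase) {
        List<int[]> matches = new ArrayList<>();
        if (phrase.isEmpty()) {
            return matches;
        }
        for (int start = 0; start + phrase.size() <= tokens.size(); start++) {
            int end = start + phrase.size();
            if (tokens.subList(start, end).equals(phrase)) {
                matches.add(new int[] {start, end});
            }
        }
        return matches;
    }

    public static void main(String[] args) {
        List<String> text = Arrays.asList("a b c a d e f a b".split(" "));
        List<String> phrase = Arrays.asList("a", "b");
        for (int[] m : findMatches(text, phrase)) {
            // Each line is one 'hit': the tokens that matched together.
            System.out.println("hit at tokens " + m[0] + ".." + (m[1] - 1)
                    + ": " + text.subList(m[0], m[1]));
        }
    }
}
```

For the example text this reports two hits, one covering token positions 0..1
and one covering 7..8. What I am asking is how to get the equivalent
information out of Lucene for a real PhraseQuery or SpanNearQuery.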

As far as I can tell, the highlighter is capable of marking the individual
words that caused the hit, but it can't show me which words together form a
single 'hit' for the query. Is there a way to do this with the Lucene API?
Any help would be appreciated!

Thanks in advance, bye,

Jaco.

PS: This is a follow-up to this thread on the Solr user mailing list:
http://markmail.org/thread/cokya3rsmzsjocdh
