You mean, for each doc in the topN you want to be able to find out
which terms caused it to match?

This is a frequently requested feature (I think there was another thread
just recently).

But unfortunately there's not really a good/simple way today (I
think?).  Someone should at least start a wiki page where we gather
the possible approaches along with their limitations (term vectors,
explain, highlighter).

Yet, Lucene clearly knows this information during scoring.  E.g., scoring
of an OR'd set of terms visits each TermQuery that matched the doc,
summing up its score contribution.  The problem is we make no effort to
save which sub-queries matched, per doc.

I think a good approach would be to add a standard method to Scorers,
to return to you some details about the match.  The API would be a lot
like explain(), but far more efficient since you would call it during
scoring when a document has just matched.  It would also be very
similar to what's being discussed in LUCENE-1522 (adding a method to
the scorer API to return all positions of all term matches).
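
To make the idea concrete, here is a rough sketch in plain Java.  It is
a toy model only and uses no Lucene classes: ToyTermScorer, MatchDetails
and scoreOr are hypothetical names invented for illustration, and the
postings are just in-memory maps from doc id to score contribution.  It
shows the one thing the real scorers don't do today: while summing the
contributions of the OR'd terms for a doc, it also remembers which
sub-scorers matched.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

// Toy model only: none of these types are Lucene classes.
public class MatchDetailsSketch {

    /** Hypothetical stand-in for a term sub-scorer: docId -> score contribution. */
    static final class ToyTermScorer {
        final String term;
        final Map<Integer, Float> postings;

        ToyTermScorer(String term, Map<Integer, Float> postings) {
            this.term = term;
            this.postings = postings;
        }
    }

    /** Hypothetical result type: the summed score plus the terms that contributed. */
    static final class MatchDetails {
        final float score;
        final List<String> matchedTerms;

        MatchDetails(float score, List<String> matchedTerms) {
            this.score = score;
            this.matchedTerms = matchedTerms;
        }
    }

    /**
     * Scores one doc against an OR'd set of terms and records every
     * sub-scorer that matched.  Returns null if no term matched the doc.
     */
    static MatchDetails scoreOr(List<ToyTermScorer> subScorers, int docId) {
        float sum = 0f;
        List<String> matched = new ArrayList<>();
        for (ToyTermScorer sub : subScorers) {
            Float contribution = sub.postings.get(docId);
            if (contribution != null) {
                sum += contribution;    // what scoring already does
                matched.add(sub.term);  // the extra bookkeeping
            }
        }
        return matched.isEmpty() ? null : new MatchDetails(sum, matched);
    }

    public static void main(String[] args) {
        List<ToyTermScorer> query = List.of(
            new ToyTermScorer("lucene", Map.of(1, 0.5f, 2, 0.25f)),
            new ToyTermScorer("scorer", Map.of(2, 0.5f)),
            new ToyTermScorer("wiki", Map.of(3, 1.0f)));
        for (int doc = 1; doc <= 3; doc++) {
            MatchDetails d = scoreOr(query, doc);
            System.out.println("doc " + doc + " score=" + d.score
                + " matched=" + d.matchedTerms);
        }
    }
}
```

In a real implementation the bookkeeping would presumably be opt-in, so
that searches which don't need the match details pay nothing for it.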

This would make a wonderful contribution if someone has itch + time.

Mike

Wouter Heijke wrote:

I want to know for each term in a query if it matched the result or not.
What is the best way to implement this?
Highlighter seems to be able to do the trick, except that I don't need to
'highlight' any text. After knowing if terms in the query matched I want
to do something else based on this.

Wouter


---------------------------------------------------------------------
To unsubscribe, e-mail: java-user-unsubscr...@lucene.apache.org
For additional commands, e-mail: java-user-h...@lucene.apache.org



