You might want to take a look at the TokenPhraseSuggester in LUCENE-626. It is more or less a FuzzySpanQuery, built from a matrix of tokens, but places one search for each possible query out of the matrix (with some optional parameters to minimze the query) to find a score and the hits for each possible match.

In my test cases this can end up beeing thousands or tens of thousands of queries to find the most plausable spelling. So rather than applying these queries to the main index, one wants to do it against a small a priori index.

http://issues.apache.org/jira/browse/LUCENE-626

TokenPhraseSuggester is quite well documented in code and there is some about it in the package level java docs too:
http://ginandtonique.org/~kalle/javadocs/didyoumean/org/apache/lucene/search/didyoumean/package-summary.html#package_description



7 dec 2007 kl. 21.58 skrev Doron Cohen:

smokey <[EMAIL PROTECTED]> wrote on 04/12/2007 16:54:32:

Thanks for the information on o.a.l.search.spans.

I was thinking of parsing the phrase query string into a
sequence of terms,
then constructing a phrase query object using add(Term term,
int position)
method in org.apache.lucene.search.PhraseQuery class. Then I can inject
similar words (suggested by SpellChecker) at appropriate
positions for each
term as I construct the final phrase query object.

Do you agree that this should work too?

I never tried this but I'm sure it will not work.

The phrase query scorer requires all the terms to
appear - either at the 'right' place or with slop
for sloppy phrases. Therefore if you inject two
terms in the same position the scorer will require
to find both of them in the same position in order to
match a document. This would be an AND logic, while
what you need is an OR logic.

The phrase positions are handy for something else:
supporting searching when stopwords where filtered
(either at indexing or at search or both).

Regards,
Doron


---------------------------------------------------------------------
To unsubscribe, e-mail: [EMAIL PROTECTED]
For additional commands, e-mail: [EMAIL PROTECTED]



---------------------------------------------------------------------
To unsubscribe, e-mail: [EMAIL PROTECTED]
For additional commands, e-mail: [EMAIL PROTECTED]

Reply via email to