You mean the total number of edits between those strings must be <= 2? If so, you must index the entire "Lucene Apache Group" as a single token, and likewise run a FuzzyQuery with the entire query string (e.g. "Luceni Apachi Group") as a single term.
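To make the difference between a total edit budget and a per-term edit budget concrete, here is a minimal plain-Java sketch. It does not use Lucene at all: the class EditBudgetSketch and its methods are hypothetical names for illustration, it uses classic Levenshtein distance (FuzzyQuery also counts transpositions by default, which does not matter for these examples), and the per-term check assumes both strings have the same number of whitespace-separated terms, compared position by position.

```java
// Illustrative only: not Lucene code. All names here are hypothetical.
class EditBudgetSketch {

    // Classic Levenshtein distance (insert, delete, substitute), two-row DP.
    static int levenshtein(String a, String b) {
        int[] prev = new int[b.length() + 1];
        int[] curr = new int[b.length() + 1];
        for (int j = 0; j <= b.length(); j++) {
            prev[j] = j;
        }
        for (int i = 1; i <= a.length(); i++) {
            curr[0] = i;
            for (int j = 1; j <= b.length(); j++) {
                int cost = a.charAt(i - 1) == b.charAt(j - 1) ? 0 : 1;
                curr[j] = Math.min(Math.min(curr[j - 1] + 1, prev[j] + 1),
                                   prev[j - 1] + cost);
            }
            int[] tmp = prev;
            prev = curr;
            curr = tmp;
        }
        return prev[b.length()];
    }

    // Whole string treated as one token: maxEdits is a budget for the total.
    static boolean matchesAsSingleToken(String query, String doc, int maxEdits) {
        return levenshtein(query, doc) <= maxEdits;
    }

    // Tokenized: every term gets its own budget of maxEdits.
    // Simplifying assumption: same term count, terms aligned by position.
    static boolean matchesPerTerm(String query, String doc, int maxEdits) {
        String[] q = query.split(" ");
        String[] d = doc.split(" ");
        if (q.length != d.length) {
            return false;
        }
        for (int i = 0; i < q.length; i++) {
            if (levenshtein(q[i], d[i]) > maxEdits) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        String query = "Lucene Apache Group";
        String[] docs = {
            "Luceni Apachi Group", // total distance 2
            "Lucen Apache Group",  // total distance 1
            "Luce Apache Group",   // total distance 2
            "Lucen Apach Grou"     // total distance 3, but only 1 per term
        };
        for (String doc : docs) {
            System.out.println(doc
                + " | total=" + levenshtein(query, doc)
                + " | singleToken=" + matchesAsSingleToken(query, doc, 2)
                + " | perTerm=" + matchesPerTerm(query, doc, 2));
        }
    }
}
```

With a budget of 2, the single-token check accepts the first three strings and rejects "Lucen Apach Grou", while the per-term check accepts all four, because each of its terms is only one edit away.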
If instead you tokenize and use a BooleanQuery to combine the per-term FuzzyQuerys, then that allows <= 2 edits for each term, which can add up to more than 2 edits in total.

Performance is likely fine here; FuzzyQuery has been much faster since http://blog.mikemccandless.com/2011/03/lucenes-fuzzyquery-is-100-times-faster.html ... have you tested it?

Mike McCandless

http://blog.mikemccandless.com

On Fri, Oct 21, 2016 at 2:45 PM, Michael Wilkowski <m...@silenteight.com> wrote:

> Hi,
> I need to implement a function that performs fuzzy search on multiple terms
> in the way that a summarized distance 2 from ALL terms is allowed. For
> example query:
>
> Lucene Apache Group
>
> with maximum distance 2 should match:
>
> Luceni Apachi Group
> Lucen Apache Group
> Luce Apache Group
>
> but not:
>
> Lucen Apach Grou
>
> I know that I can achieve it using multiple FuzzyQueries nested with
> BooleanQueries, but in case of more terms (>5) and distance of 2 there
> could be many many combinations and I am afraid of performance.
>
> Perhaps there is a better solution that someone may recommend?
>
> Regards,
> Michael