PhraseQuery doc

2021-12-13 Thread Claude Lepere
https://lucene.apache.org/core/8_0_0/core/org/apache/lucene/search/PhraseQuery.html#getSlop--, writes the following: For instance, when searching for "quick fox", it is expected that the difference between the positions of fox and quick is 1. So "a quick brown fox" would be at an edit distance of

Re: A question on PhraseQuery and slop

2021-12-13 Thread Michael Sokolov
I wonder if the analysis chain could be involved. If those stop words ("is") are removed without leaving a hole somehow, then that could explain it?

On Mon, Dec 13, 2021 at 9:35 AM Michael McCandless wrote:
> Hello Claude,
>
> Hmm, that is interesting that you see slop=2 matching query "quick fox"

Lucene 9.0.0 inconsistent index options

2021-12-13 Thread Ian Lea
Hi, we have a long-standing index with some mandatory fields and some optional fields that has been through multiple Lucene upgrades without a full rebuild. On testing an upgrade from version 8.11.0 to 9.0.0, when we open an IndexWriter we hit the exception Exception in thread "main"

Re: A question on PhraseQuery and slop

2021-12-13 Thread Michael McCandless
Hello Claude, Hmm, that is interesting that you see slop=2 matching query "quick fox" against document "the fox is quick". Edit distance (Levenshtein) is a bit tricky because it might include a transposition (just swapping the two words) as edit distance 1 OR 2. So maybe Lucene's PhraseQuery is
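
The slop question in this thread can be explored with a small position-based model. The sketch below is not Lucene's implementation. It assumes that each query term's document position is shifted back by the term's offset in the query, and that the required slop is the spread of those shifted positions. The helper names `analyze` and `min_slop` and the `keep_holes` flag are hypothetical. `keep_holes=False` models the case raised above, where stop words are removed without leaving a position gap.

```python
from itertools import product


def analyze(text, stop_words=(), keep_holes=True):
    """Split text into (term, position) pairs.

    Stop words are dropped. With keep_holes=True a dropped word still
    consumes a position (a "hole"); with keep_holes=False it does not.
    """
    tokens = []
    position = 0
    for word in text.lower().split():
        if word in stop_words:
            if keep_holes:
                position += 1
            continue
        tokens.append((word, position))
        position += 1
    return tokens


def min_slop(query_terms, tokens):
    """Smallest slop at which the phrase matches in this simplified model.

    Returns None if some query term does not occur in the document.
    Repeated query terms are not treated specially here.
    """
    shifted_per_term = []
    for offset, term in enumerate(query_terms):
        # Shift each occurrence back by the term's offset in the query.
        shifted = [pos - offset for t, pos in tokens if t == term]
        if not shifted:
            return None
        shifted_per_term.append(shifted)
    # Try every combination of occurrences and keep the tightest spread.
    return min(max(c) - min(c) for c in product(*shifted_per_term))


query = ["quick", "fox"]
stops = {"the", "is"}

print(min_slop(query, analyze("a quick brown fox")))
print(min_slop(query, analyze("the fox is quick", stops, keep_holes=True)))
print(min_slop(query, analyze("the fox is quick", stops, keep_holes=False)))
```

Under these assumptions, "a quick brown fox" needs a slop of 1, as in the quoted javadoc example. "the fox is quick" needs a slop of 3 when stop-word holes are kept, but only 2 when they are not, which fits the suggestion that the analysis chain could explain a match at slop=2.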