https://lucene.apache.org/core/8_0_0/core/org/apache/lucene/search/PhraseQuery.html#getSlop--,
writes the following:
For instance, when searching for "quick fox", it is expected that the
difference between the positions of fox and quick is 1. So "a quick brown
fox" would be at an edit distance of
I wonder if the analysis chain could be involved. If stop words such as
"is" are removed without leaving a hole in the token positions, that
could explain it.
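
To make the idea concrete, here is a rough sketch in plain Java (no Lucene
dependency) of how the required slop for a two-term phrase changes depending
on whether removed stop words keep their position. It is a simplified model
based on the javadoc description above, not Lucene's actual matching code.
The class and method names (SlopSketch, position, requiredSlop) and the
single stop word "is" are assumptions made for illustration only.

```java
import java.util.Set;

// Simplified model of sloppy phrase matching for a two-term phrase.
// SlopSketch, position and requiredSlop are hypothetical names, not Lucene APIs.
class SlopSketch {

    // Returns the position of `term` after stop-word removal, or -1 if absent.
    // keepHoles=true:  a removed stop word still consumes a position (a "hole").
    // keepHoles=false: later tokens shift left, as if the stop word never existed.
    static int position(String[] words, Set<String> stopWords,
                        boolean keepHoles, String term) {
        int pos = -1;
        for (String w : words) {
            boolean stop = stopWords.contains(w);
            if (!stop || keepHoles) {
                pos++;
            }
            if (!stop && w.equals(term)) {
                return pos;
            }
        }
        return -1;
    }

    // The query "first second" expects `second` exactly one position after
    // `first`; the required slop is modeled as the deviation from that.
    static int requiredSlop(String[] words, Set<String> stopWords,
                            boolean keepHoles, String first, String second) {
        int p1 = position(words, stopWords, keepHoles, first);
        int p2 = position(words, stopWords, keepHoles, second);
        if (p1 < 0 || p2 < 0) {
            throw new IllegalArgumentException("term not found in document");
        }
        return Math.abs((p2 - p1) - 1);
    }

    public static void main(String[] args) {
        String[] doc = "the fox is quick".split(" ");
        Set<String> stop = Set.of("is");
        System.out.println("holes kept:    " + requiredSlop(doc, stop, true, "quick", "fox"));
        System.out.println("holes dropped: " + requiredSlop(doc, stop, false, "quick", "fox"));
    }
}
```

Under this model, "quick fox" against "the fox is quick" needs a slop of 3
when the hole for "is" is kept and only 2 when it is dropped, which would be
consistent with the slop=2 match that was reported.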
On Mon, Dec 13, 2021 at 9:35 AM Michael McCandless
wrote:
>
> Hello Claude,
>
> Hmm, that is interesting that you see slop=2 matching query "quick fox"
Hi
We have a long-standing index with some mandatory fields and some optional
fields that has been through multiple Lucene upgrades without a full
rebuild. While testing an upgrade from version 8.11.0 to 9.0.0, we hit the
following exception when opening an IndexWriter:
Exception in thread "main"
Hello Claude,
Hmm, that is interesting that you see slop=2 matching query "quick fox"
against document "the fox is quick".
Edit distance (Levenshtein) is a bit tricky because a transposition (just
swapping the two words) might count as an edit distance of 1 or 2.
So maybe Lucene's PhraseQuery is