Re: A question on PhraseQuery and slop

2021-12-13 Thread Michael Sokolov
I wonder if the analysis chain could be involved. If stop words such as "is" are removed without leaving a hole, that could explain it?
On Mon, Dec 13, 2021 at 9:35 AM Michael McCandless wrote:
> Hello Claude,
>
> Hmm, that is interesting that you see slop=2 matching query "quick fox"
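To make the stop-word hypothesis above concrete, here is a rough sketch in plain Java. It is not Lucene code and does not call any Lucene API. It assumes a simplified two-term model in which the slop needed is the absolute difference between the expected position offset in the query (second term one position after the first) and the actual offset in the document. The class `SlopSketch` and its methods `positions` and `slopNeeded` are hypothetical names made up for this illustration.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Set;

// Hypothetical illustration only: this models token positions, not Lucene's matcher.
final class SlopSketch {

    // Assigns a position to every kept token. With keepHoles=true a removed
    // stop word still advances the position counter (it "leaves a hole");
    // with keepHoles=false the following tokens close the gap.
    // Assumption: each kept token occurs once (a repeat would overwrite).
    static Map<String, Integer> positions(String text, Set<String> stopWords, boolean keepHoles) {
        Map<String, Integer> pos = new HashMap<>();
        int next = 0;
        for (String token : text.toLowerCase().split("\\s+")) {
            if (stopWords.contains(token)) {
                if (keepHoles) {
                    next++;
                }
                continue;
            }
            pos.put(token, next++);
        }
        return pos;
    }

    // Simplified model for a two-term phrase "first second": the query expects
    // second to sit exactly one position after first.
    static int slopNeeded(Map<String, Integer> pos, String first, String second) {
        int actualOffset = pos.get(second) - pos.get(first);
        return Math.abs(1 - actualOffset);
    }

    public static void main(String[] args) {
        Set<String> stop = Set.of("is");
        String doc = "the fox is quick";

        // Hole preserved: fox=1, quick=3, so the offset is -2.
        System.out.println(slopNeeded(positions(doc, stop, true), "quick", "fox"));  // prints 3

        // Hole dropped: fox=1, quick=2, so the offset is -1.
        System.out.println(slopNeeded(positions(doc, stop, false), "quick", "fox")); // prints 2
    }
}
```

Under these assumptions, dropping "is" without a hole would let slop=2 match, which is consistent with what was observed. Whether the analyzer in question actually behaves this way would need to be checked against its configuration.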

Re: A question on PhraseQuery and slop

2021-12-13 Thread Michael McCandless
Hello Claude,
Hmm, it is interesting that you see slop=2 matching the query "quick fox" against the document "the fox is quick". Edit distance (Levenshtein) is a bit tricky because a transposition (just swapping the two words) might count as edit distance 1 or 2. So maybe Lucene's PhraseQuery is

A question on PhraseQuery and slop

2021-12-10 Thread Claude Lepere
Hello. The explanation at https://lucene.apache.org/core/8_0_0/core/org/apache/lucene/search/PhraseQuery.html#getSlop states that the edit distance between "quick fox" and "the fox is quick" would be a