There is a typo in my email. The term list should be like this:

- "bill" -> df = 1.879.324, ttf = 14.145.950
- "note" -> df = 8.479.826, ttf = 151.249.542
- "sale" -> df = 7.557.685, ttf = 120.948.163
- "of" -> df = 21.244.060, ttf = 6.879.196.700

On Mon, Mar 25, 2024 at 8:56 AM Sjoerd Smeets <ssme...@gmail.com> wrote:

> Hi,
>
> We are experiencing quite a performance decrease when searching for
> phrases that have terms with a high ttf value.
>
> E.g. searching for "note of sale" is around 3 times slower (~10 sec) than
> the "bill of sale" (~3 sec). This behaviour is consistent and can be
> reproduced also when we use other terms that have a high ttf. We are
> querying the unstemmed index.
>
> Terms (numDocs: 26220184):
>
> - "bill" -> df = 1.879.324, ttf = 14.145.950
> - "note" -> df = 8.479.826, ttf = 151.249.542
> - "sale" -> df = 7.557.685, ttf = 120.948.163
> - "bill" -> df = 21.244.060, ttf = 6.879.196.700
>
> Is this the expected behaviour or is there something that can be
> tuned, like a cache setting?
>
> Thanks,
> Sjoerd