FST codec for *infix* queries. No luck so far.

2022-04-22 Thread Mikhail Khludnev
Hello, Devs! I tried to introduce a custom index to speed up *infix* queries. Note: I'm interested in cases where EdgeNGram is not an option. For example, if the term 'foobar' is stored in a block at position 200, and 'bar' at 100. I try to put the following suffixes in FST: foobar->[200] oobar->[2
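
The preview above is cut off, so the following is only a rough sketch of the general idea it seems to describe: index every suffix of every term, so that an infix query becomes a prefix scan over the suffixes. It is not the author's codec. A sorted Python list stands in for the FST, the names `build_suffix_index` and `infix_lookup` are hypothetical, and mapping each suffix to the block position of its full term is an assumption about how the truncated example continues.

```python
from bisect import bisect_left
from collections import defaultdict


def build_suffix_index(term_positions):
    """Map every suffix of every term to the block positions of the terms containing it.

    term_positions: dict of term -> block position (e.g. {"foobar": 200}).
    Returns (sorted suffix list, suffix -> sorted positions).
    """
    suffix_to_positions = defaultdict(set)
    for term, position in term_positions.items():
        for start in range(len(term)):
            suffix_to_positions[term[start:]].add(position)
    # The sorted key list is a stand-in for the FST's ordered key space.
    suffixes = sorted(suffix_to_positions)
    postings = {s: sorted(p) for s, p in suffix_to_positions.items()}
    return suffixes, postings


def infix_lookup(index, infix):
    """Return block positions of all terms that contain `infix`.

    Any occurrence of `infix` inside a term is a prefix of one of that
    term's suffixes, so a prefix range scan over the sorted suffixes suffices.
    """
    suffixes, postings = index
    found = set()
    i = bisect_left(suffixes, infix)
    while i < len(suffixes) and suffixes[i].startswith(infix):
        found.update(postings[suffixes[i]])
        i += 1
    return sorted(found)


# Terms and positions taken from the message: 'foobar' at 200, 'bar' at 100.
index = build_suffix_index({"foobar": 200, "bar": 100})
print(infix_lookup(index, "oba"))  # only 'foobar' contains 'oba'
print(infix_lookup(index, "ar"))   # both terms contain 'ar'
```

Note that the number of indexed keys grows with the total length of all terms, which is the usual cost of this kind of suffix expansion.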

Re: [JENKINS] Lucene » Lucene-NightlyTests-9.1 - Build # 42 - Unstable!

2022-04-22 Thread Dawid Weiss
And, for the record - indeed enwiki contains an odd field with a super-long term that looks like this: 13:24:08.000 {substc|}}}{{{1}}}|{{{p1n|}}}={{{p1v|}}}|{{{p2n|}}}={{{p2v|}}}|{{{p3n|}}}={{{p3v|}}}|{{{p4n|}}}={{{p4v|}}}|{{{p5n|}}}={{{p5v|}}}|{{{p6n|}}}={{{p6v|}}}|{{{p7n|}}}={{{p7v|}}}|{{{p

Re: [JENKINS] Lucene » Lucene-NightlyTests-9.1 - Build # 42 - Unstable!

2022-04-22 Thread Dawid Weiss
This actually reproduces (if you download enwiki). I wonder if we should tune LineFileDocs so that it avoids trying to add humongous terms. D. On Wed, Apr 20, 2022 at 3:42 AM Apache Jenkins Server wrote: > > Build: https://ci-builds.apache.org/job/Lucene/job/Lucene-NightlyTests-9.1/42/ > > 1 tes