On one hand that’s great news; on the other, it probably deserves a ticket, although it takes a very specific scenario to hit it: one where your index filters don’t match your query filters.
Also, it may be worth spending some time putting together a reindexing plan. Solr can use multiple cores, so if the content is split up you can index it simultaneously rather than in a single indexing process. In Perl you can use forking via the process manager CPAN module; most other languages can do it as well (but not as well, imo).

> On Jan 11, 2023, at 8:47 AM, Mateusz Matela <mmat...@man.poznan.pl> wrote:
>
> After reindexing with SGF the document matches, as expected.
>
> Still, it looks like SGF was designed to work well when used only in query,
> and it's just a bug revealed by an edge case. Shall I submit an issue to
> https://github.com/apache/lucene ?
>
> On 11.01.2023 at 13:09, Dave wrote:
>> Yes, then that is a problem, and I agree it should be intuitive that the
>> quotes work without the modifier. I’m not familiar enough with the
>> underlying code to know for sure what’s going on in this instance, but I
>> wonder whether reindexing the content with the filter would fix it? You can
>> experiment with just that one document and see.
>>
>> Otherwise, you should have a plan for reindexing your content from scratch,
>> as upgrades or new filters become necessary. It’s definitely inconvenient,
>> but sometimes you got to do what you got to do, so better to be ready for
>> it. A search index should always be considered temporary and replaceable:
>> it’s not a database, it’s a search tool to search a data set. With that in
>> mind, you put the index on replaceable hardware and expect, and have a plan
>> for, the machines to simply die and be replaced.
>>
>>>> On Jan 11, 2023, at 6:27 AM, Mateusz Matela <mmat...@man.poznan.pl> wrote:
>>>
>>> On 11.01.2023 at 12:04, Dave wrote:
>>>> Hmm. As an experiment, what happens when you use a range of three or four
>>>> with the quotes, using the tilde in the query?
>>> You mean a query like "test polskie"~1 ? Yes, it does match.
>>>
>>> Unfortunately it's not a workaround I can use, because the query is
>>> provided by the users. It's quite intuitive for them to use quotes, but not
>>> necessarily tildes. And if I added it artificially, it would be a slightly
>>> different query, which may not always be what the user wants.
>>>
>>>> Also, generally I find it best to use the same filters for both indexing
>>>> and query. Just a personal preference; I know it’s not always possible,
>>>> however.
>>> The problem here is that I'd need to reindex documents whenever the synonym
>>> definitions change, which is quite inconvenient.
>>> It should solve the problem if SGF did not increase the positions. Am I
>>> correct to assume it's not the correct behavior and should be fixed? It
>>> doesn't do that when there's only one token on the position it modifies,
>>> for example:
>>>
>>> test(1) polski(2) -> test(1) pol(2) polski(2)
>>>
>>> Then the document does match.
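
To make the position argument in the quoted message concrete, here is a toy sketch in Python. It is a simplified model of phrase matching over token positions, not Lucene's actual algorithm, and `phrase_matches` is a made-up helper. The first case uses the positions given in the thread; the shifted positions in the second case are hypothetical, chosen only to mimic what a position increase at index time might look like.

```python
def phrase_matches(indexed, phrase, slop=0):
    """Toy phrase matcher (NOT Lucene's implementation).

    indexed: list of (term, position) pairs, as produced by an analyzer.
    phrase:  list of query terms that must appear in order.
    slop:    total number of skipped positions allowed between the terms.
    """
    # Collect every position at which each term occurs.
    positions = {}
    for term, pos in indexed:
        positions.setdefault(term, []).append(pos)

    def extend(i, prev, budget):
        # All phrase terms placed: the phrase matches.
        if i == len(phrase):
            return True
        for pos in positions.get(phrase[i], []):
            gap = pos - prev - 1  # positions skipped since the previous term
            if 0 <= gap <= budget and extend(i + 1, pos, budget - gap):
                return True
        return False

    return any(extend(1, start, slop) for start in positions.get(phrase[0], []))


# Case from the thread: the extra token shares position 2, nothing shifts.
stacked = [("test", 1), ("pol", 2), ("polski", 2)]

# Hypothetical case: the second term ends up one position further away.
shifted = [("test", 1), ("polskie", 3)]

print(phrase_matches(stacked, ["test", "polski"]))           # exact phrase
print(phrase_matches(shifted, ["test", "polskie"]))          # exact phrase
print(phrase_matches(shifted, ["test", "polskie"], slop=1))  # like "..."~1
```

Under this model the exact phrase only fails when the indexed positions are no longer consecutive, which would be consistent with "test polskie"~1 matching while the plain quoted query does not.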