On one hand that’s great news. On the other, it probably deserves a ticket, 
though it takes a very specific scenario to hit it: one where your index 
filters don’t match your query filters. 

Also, maybe spend some time putting together a reindexing plan. Solr can use 
multiple cores, so if the content is split up you can index it in parallel 
rather than through a single indexing process. In Perl you can fork workers 
with a process manager module from CPAN; most other languages can do it as 
well (but not as well, imo).
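
A rough sketch of the parallel idea in Python, in case it helps. It uses a 
thread pool rather than forked processes, since the work is mostly waiting on 
HTTP. The function names, batch size, worker count, and the update URL in the 
usage note are my assumptions, not anything from your setup:

```python
from concurrent.futures import ThreadPoolExecutor
import json
import urllib.request


def chunked(docs, size):
    """Yield consecutive batches of at most `size` documents."""
    for start in range(0, len(docs), size):
        yield docs[start:start + size]


def post_batch(batch, update_url):
    """POST one batch as a JSON array to a Solr update handler.

    `update_url` is supplied by the caller; nothing here is specific
    to any particular core or collection.
    """
    request = urllib.request.Request(
        update_url,
        data=json.dumps(batch).encode("utf-8"),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(request) as response:
        return response.status


def reindex_in_parallel(docs, send, batch_size=500, workers=4):
    """Send batches concurrently; return the number of documents submitted."""
    batches = list(chunked(docs, batch_size))
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # list() waits for all batches and re-raises any worker exception.
        list(pool.map(send, batches))
    return sum(len(batch) for batch in batches)
```

You would call it with something like 
`reindex_in_parallel(docs, lambda b: post_batch(b, "http://localhost:8983/solr/mycore/update"))` 
(a hypothetical URL) and issue a single commit once everything has been sent.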



> On Jan 11, 2023, at 8:47 AM, Mateusz Matela <mmat...@man.poznan.pl> wrote:
> 
> After reindexing with SGF the document matches, as expected.
> 
> Still, it looks like SGF was designed to work well when used only in query, 
> and it's just a bug revealed by an edge case. Shall I submit an issue to 
> https://github.com/apache/lucene ?
> 
> On 11.01.2023 at 13:09, Dave wrote:
>> Yes, then that is a problem, and I agree it should be intuitive that the 
>> quotes work without the modifier. I’m not familiar enough with the 
>> underlying code to know for sure what’s going on in this instance, but I 
>> wonder whether reindexing the content with the filter would fix it. You can 
>> experiment with just that one document and see.
>> 
>> Otherwise, you should have a plan for reindexing your content from scratch, 
>> since upgrades and new filters will make it necessary. It’s definitely 
>> inconvenient, but sometimes you’ve got to do what you’ve got to do, so 
>> better to be ready for it. A search index should always be considered 
>> temporary and replaceable: it’s not a database, it’s a tool for searching a 
>> data set. With that in mind, you put the index on replaceable hardware and 
>> have a plan for those machines to simply die and be replaced.
>> 
>>>> On Jan 11, 2023, at 6:27 AM, Mateusz Matela <mmat...@man.poznan.pl> wrote:
>>> 
>>> On 11.01.2023 at 12:04, Dave wrote:
>>>> Hmm. As an experiment, what happens when you use a range of three or four 
>>>> with the quotes, using the tilde in the query?
>>> You mean query like "test polskie"~1 ? Yes, it does match.
>>> 
>>> Unfortunately it's not a workaround I can use, because the query is 
>>> provided by the users. It's quite intuitive for them to use quotes, but not 
>>> necessarily tildes. And if I added it artificially, it would be a slightly 
>>> different query, which may not always be what the user wants.
>>> 
>>>> Also, generally I find it best to use the same filters for both indexing 
>>>> and query. Just a personal preference; I know it’s not always possible, 
>>>> however.
>>> The problem here is that I'd need to reindex documents whenever the synonym 
>>> definitions change, which is quite inconvenient.
>>> It would solve the problem if SGF did not increase the positions. Am I 
>>> correct to assume that this is not the correct behavior and should be 
>>> fixed? It doesn't do that when there's only one token at the position it 
>>> modifies, for example:
>>> 
>>> test(1) polski(2) -> test(1) pol(2) polski(2)
>>> 
>>> Then the document does match.
>>> 
> 
