Hello Ivana.
I think the change that caused this is LUCENE-9207, "Don't build SpanQuery
in QueryBuilder":
<https://issues.apache.org/jira/browse/LUCENE-9207>
Also, please check the last comments in the corresponding GitHub issue,
apache/lucene #10247:
<https://github.com/apache/lucene/issues/10247>
There I attempted to discuss a way to reproduce the old, buggy nested span
behavior with the new intervals queries.
So far it's stuck; I don't know for what reason.
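
To illustrate the clause growth described in the quoted message below, here
is a minimal sketch in plain Python. It is not Lucene or Solr code: the
synonym lists and both function names are made up for illustration, and the
span-style count is a simplification of what the Solr 8 query tree looked
like. It only shows why one phrase per combination grows as a product of the
synonym counts, while one alternative list per position grows as a sum.

```python
from itertools import product

# Hypothetical synonym lists; the real synonyms.txt is not shown in this thread.
synonyms = [
    ["sap", "sap-anwend", "sap-bereich"],      # alternatives at position 1
    ["s/4hana", "sap business suit 4 hana"],   # alternatives at position 2
]

def phrase_per_path(positions):
    """Build one phrase string per combination of alternatives."""
    return [" ".join(path) for path in product(*positions)]

def span_style_leaf_count(positions):
    """Count alternatives when each position lists its alternatives once."""
    return sum(len(alternatives) for alternatives in positions)

phrases = phrase_per_path(synonyms)
print(len(phrases))                     # 3 * 2 = 6 phrase clauses
print(span_style_leaf_count(synonyms))  # 3 + 2 = 5 alternatives
```

With about 300 alternatives at one position, even a handful of alternatives
at the next position is enough to pass 2000 combinations, which matches the
order of magnitude reported below.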

On Tue, Mar 21, 2023 at 1:59 PM Ivana Pranjic <ivana.pran...@gulp.de.invalid>
wrote:

> Hi Solr community,
>
> We are currently trying to upgrade Solr from 8 to 9 and have stumbled
> upon an issue: the queries that we use for search result in many more
> clauses being generated than before, hitting the maxBooleanClauses limit
> for some simple queries (even if we increase the limit). I'll try to
> describe our issue as concisely as possible:
> When we search for a phrase like "SAP S/4HANA" and use synonym expansion,
> in the parsedQuery in Solr 8, we can see this:
>
> "parsedquery_toString":"+(spanNear([spanOr([body:sap-anwend,
> body:sap-anwendungsbereich, body:sap-anwendungsbereich,
> body:sap-bereich, body:sap-erfahr, body:sap-expertis ...
>
> etc., whereas the same search in Solr 9 yields a different parsedQuery:
>
> "parsedquery_toString":"+((body:\"sap-anwend sap business suit 4
> hana\" body:\"sap-anwend sap business suit 4 sap hana\"
> body:\"sap-anwend sap business suit for hana\" ...
>
> When we analyzed this, we noticed that it created one phrase for every
> combination of a synonym of the term SAP with a synonym of S/4HANA.
> Since the term SAP alone has about 300 synonyms in our synonyms.txt,
> combining them with the synonyms for S/4HANA brought the number of
> clauses to over 2000. If we search for more terms and fields, this
> easily explodes into a giant parsedQuery and we get the
> maxBooleanClauses error.
>
> Looking at the documentation and code, we could not figure out why
> there is a difference in Solr 9, what exactly was changed in the
> implementation, and what happened to spanNear and spanOr. The queries
> that we are using in Solr 8 have not had performance issues so far.
>
> What are we missing? Is there a way to avoid creating combinations of
> synonyms when searching for phrases? It does not seem to happen when
> doing a regular search for both terms, SAP S/4HANA, without quotes.
>
> One thing that we probably should do is minimize the number of
> synonyms in our file, or give up on searching for multiword phrases.
>
> I hope there is someone who can enlighten us on this matter :)
>
> Thank you!
>
> Herzliche Grüße / Best regards
>
> *Ivana Pranjic*
> Software Developer
>
> *GULP Information Services GmbH*
>
>
> Telefon: +49 89 500316717
>
> E-Mail: ivana.pran...@gulp.de
>
>
> *GULP - experts united*
> www.gulp.de - a Randstad company
>
> GULP Information Services GmbH
> Sitz: München, Amtsgericht München HRB 207 941
> Geschäftsführer: Michel Verdoold (Vors.), Arie Blom
>
>


-- 
Sincerely yours
Mikhail Khludnev
https://t.me/MUST_SEARCH
A caveat: Cyrillic!
