Re: CommonGramsQuery and proximity searches issue

2024-12-09 Thread Sjoerd Smeets
42d14ebfb08ceed3f2/solr/core/src/test/org/apache/solr/analysis/CommonGramsPhraseQueryTest.java#L30 > > > On Mon, Dec 9, 2024 at 6:02 PM Sjoerd Smeets wrote: > > > Hi, > > > > We are using a CommonGrams filter for both indexing and querying. We > > provide a list of words that shoul

CommonGramsQuery and proximity searches issue

2024-12-09 Thread Sjoerd Smeets
Hi, We are using a CommonGrams filter for both indexing and querying. We provide a list of words that should be treated as a common gram. In the list the following words exist: - new - york When we do queries like: "new amsterdam"~3 "old york"~3 "new york"~3 none of them give the expected results.
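
To make the setup above concrete, here is a rough Python model of how a common-grams filter pair rewrites the tokens of those queries. It is a simplified sketch, not Lucene's CommonGramsFilter or CommonGramsQueryFilter code: the function names are hypothetical, the "_" separator is assumed, and token positions are ignored.

```python
def common_grams(tokens, common_words, sep="_"):
    """Index-time model: keep every unigram and add a bigram
    wherever either neighbour is in the common-word list."""
    out = []
    for i, tok in enumerate(tokens):
        out.append(tok)
        if i + 1 < len(tokens) and (tok in common_words or tokens[i + 1] in common_words):
            out.append(tok + sep + tokens[i + 1])
    return out


def common_grams_query(tokens, common_words, sep="_"):
    """Query-time model: emit bigrams and drop any unigram
    that is already covered by a bigram."""
    out = []
    prev_in_bigram = False
    for i, tok in enumerate(tokens):
        starts_bigram = i + 1 < len(tokens) and (
            tok in common_words or tokens[i + 1] in common_words
        )
        if starts_bigram:
            out.append(tok + sep + tokens[i + 1])
        elif not prev_in_bigram:
            out.append(tok)
        prev_in_bigram = starts_bigram
    return out


COMMON = {"new", "york"}  # the two words named in the message

for phrase in ("new amsterdam", "old york", "new york"):
    print(phrase, "->", common_grams_query(phrase.split(), COMMON))
```

Under these assumptions each of the three phrases collapses to a single bigram token at query time, so a slop value such as ~3 would have nothing left to act on. That may be related to the behaviour described, but the message does not confirm it.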

Re: Slow performance for phrases with terms with high ttf

2024-03-25 Thread Sjoerd Smeets
d having a bad hair day, doing a GC, or something will > increase the risk of slowing things down, and probably increase the > variance in the overall response time. So definitely look at p90+ changes, > not just p50. > > On Mon, Mar 25, 2024 at 10:51 AM Sjoerd Smeets wrote: >

Re: Slow performance for phrases with terms with high ttf

2024-03-25 Thread Sjoerd Smeets
gine or with text analysis. With the downside you lose the ability to > match the individual terms. You could of course create a different field > for these significant phrases if its important. > > Best > -Doug > > On Mon, Mar 25, 2024 at 6:40 AM Sjoerd Smeets wrote: > > &

Re: Slow performance for phrases with terms with high ttf

2024-03-25 Thread Sjoerd Smeets
tf = 6.879.196.700 On Mon, Mar 25, 2024 at 8:56 AM Sjoerd Smeets wrote: > Hi, > > We are experiencing quite a performance decrease when searching for > phrases that have terms with a high ttf value. > > E.g. searching for "note of sale" is around 3 times slower (~10 s

Slow performance for phrases with terms with high ttf

2024-03-25 Thread Sjoerd Smeets
Hi, We are experiencing quite a performance decrease when searching for phrases that have terms with a high ttf value. E.g. searching for "note of sale" is around 3 times slower (~10 sec) than the "bill of sale" (~3 sec). This behaviour is consistent and can be reproduced also when we use other t

Re: Heap Size Space and Span Queries

2022-12-19 Thread Sjoerd Smeets
Btw, is it worthwhile creating a ticket for SpanQueries going mental with the heap in certain cases? On Mon, Dec 19, 2022 at 8:51 AM Sjoerd Smeets wrote: > Thanks Uwe! There is no requirement yet to have support for a FIRST > operator, but I get your point. I'll use this as feedb

Re: Heap Size Space and Span Queries

2022-12-19 Thread Sjoerd Smeets
ut that's easy). I can help with a simple > implementation for it. > > Uwe > Am 19.12.2022 um 15:35 schrieb Sjoerd Smeets: > > Thanks everybody. I indeed have the memory dumps of these. I'm happy to > share that with you. These are pretty big files (3g compressed - 32g > uncompress

Re: Heap Size Space and Span Queries

2022-12-19 Thread Sjoerd Smeets
vs. IntervalsQuery performance and >>> characteristics, there's some possibly-relevant discussion on >>> LUCENE-9204: >>> >>> >>> https://issues.apache.org/jira/browse/LUCENE-9204?focusedCommentId=17352589#comment-17352589 >>

Heap Size Space and Span Queries

2022-12-14 Thread Sjoerd Smeets
Hi, I've implemented a Span Query parser and when running the below query, I'm seeing Heap Size Space messages on certain shards: o.a.s.s.HttpSolrCall null:java.lang.RuntimeException: java.lang.OutOfMemoryError: Java heap space The span query that I'm running is the following: ((spanNear([unste

Re: Relevancy debugging - idf score

2021-12-05 Thread Sjoerd Smeets
Found it! I had to enable the ExactStatsCache. Found a description over here. Thanks for pointing me in the right direction. https://solr.pl/en/2019/05/20/distributed-idf/ On Sun, Dec 5, 2021 at 11:09 AM Sjoerd Smeets wrote: > Hi Alessandro, > > Thanks for your reply! Yes, the doc
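
As background to the distributed-idf fix mentioned above, the sketch below shows how idf can differ between shards when each shard scores with its own local statistics. It assumes the BM25 idf formula and uses made-up document counts; the shard names and numbers are illustrative only and do not come from the thread.

```python
import math


def bm25_idf(doc_freq, doc_count):
    """BM25-style idf: ln(1 + (N - n + 0.5) / (n + 0.5))."""
    return math.log(1 + (doc_count - doc_freq + 0.5) / (doc_freq + 0.5))


# Hypothetical per-shard statistics: (docFreq of the term, docCount of the shard).
shards = {"shard1": (10, 1000), "shard2": (100, 1000)}

# Local statistics: each shard computes idf from its own counts only.
local_idf = {name: bm25_idf(df, n) for name, (df, n) in shards.items()}

# Global statistics: counts are summed across shards before computing idf,
# which is the idea behind an exact distributed stats cache.
global_df = sum(df for df, _ in shards.values())
global_n = sum(n for _, n in shards.values())
global_idf = bm25_idf(global_df, global_n)

print(local_idf)
print(global_idf)
```

With local statistics, two documents matching the same term on different shards get different idf values. With summed statistics, every shard uses the same value.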

Re: Relevancy debugging - idf score

2021-12-05 Thread Sjoerd Smeets
21 at 11:02 AM Alessandro Benedetti wrote: > It seems like the underlying index changed. > Are those two documents in the same result set? > Is it just one query? > It's definitely curious, even if a commit happened search results are > consistent in one searcher. > >

Relevancy debugging - idf score

2021-12-05 Thread Sjoerd Smeets
Hi all, I'm debugging the relevancy scores of my query and I see the following for two document hits. My question is, why is the idf score not the same for both documents? This is Solr 6.6. Any guidance would be much appreciated. Thanks! *Doc1* "71d72354eea23b9eae934ab616e8ce38de69d760": " 104

Testing highlighting span queries

2021-03-30 Thread Sjoerd Smeets
Hi all, Does anybody have some example code available that I could use to test Highlighting with Span queries? Thanks in advance, Sjoerd

Highlighting with Span queries

2021-03-26 Thread Sjoerd Smeets
Hi all, I am trying to get highlighting working with Span queries. My span query looks like (my query parser is an extension of the edismax queryparser): *spanNear([stemmed_text:tintin, stemmed_text:haddock], 4, false)* When I change the query to *+stemmed_text:tintin +stemmed_text:haddock* I g
