Slow performance for phrases with terms with high ttf

2024-03-25 Thread Sjoerd Smeets
Hi, We are experiencing a significant performance decrease when searching for phrases that contain terms with a high ttf value. E.g. searching for "note of sale" is around 3 times slower (~10 sec) than searching for "bill of sale" (~3 sec). This behaviour is consistent and can also be reproduced when we use other t

Re: Slow performance for phrases with terms with high ttf

2024-03-25 Thread Sjoerd Smeets
There is a typo in my email. The term list should be like this: - "bill" -> df = 1.879.324, ttf = 14.145.950 - "note" -> df = 8.479.826, ttf = 151.249.542 - "sale" -> df = 7.557.685, ttf = 120.948.163 - "of" -> df = 21.244.060, ttf = 6.879.196.700 On Mon, Mar 25, 2024 at 8:56 AM Sjo

Re: Slow performance for phrases with terms with high ttf

2024-03-25 Thread Doug Turnbull
As someone currently implementing a lot of positional search from scratch (in a different side-project), I can say it's totally expected behavior that high TTF / DF terms would be harder. To match the phrase there are simply more candidate documents and positions to intersect, so it's naturally a tou
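The intersection of candidate documents and positions mentioned above can be illustrated with a minimal sketch. This is a toy in-memory positional index, not how Lucene or Solr implement phrase queries; the names `build_index` and `phrase_match` are hypothetical and exist only for this example. It shows why a term with many postings and positions (such as "of") means more work per phrase.

```python
from typing import Dict, List

# Hypothetical toy structure: term -> {doc_id: sorted list of positions}.
PositionalIndex = Dict[str, Dict[int, List[int]]]


def build_index(docs: Dict[int, str]) -> PositionalIndex:
    """Tokenize on whitespace and record every position of every term."""
    index: PositionalIndex = {}
    for doc_id, text in docs.items():
        for pos, term in enumerate(text.lower().split()):
            index.setdefault(term, {}).setdefault(doc_id, []).append(pos)
    return index


def phrase_match(index: PositionalIndex, phrase: str) -> List[int]:
    """Return the sorted doc ids in which the terms occur consecutively."""
    terms = phrase.lower().split()
    postings = [index.get(term, {}) for term in terms]
    if not terms or any(not p for p in postings):
        return []

    # Step 1: intersect document ids, starting from the rarest term.
    candidates = set(min(postings, key=len))
    for p in postings:
        candidates.intersection_update(p)

    # Step 2: for each candidate, check that term i sits at start + i.
    # The cost grows with the number of positions (ttf) that must be examined.
    hits = []
    for doc_id in sorted(candidates):
        later = [set(p[doc_id]) for p in postings[1:]]
        starts = postings[0][doc_id]
        if any(all(s + i + 1 in pos for i, pos in enumerate(later)) for s in starts):
            hits.append(doc_id)
    return hits


docs = {
    1: "bill of sale for the car",
    2: "sale of the bill",
    3: "a note of sale",
}
index = build_index(docs)
print(phrase_match(index, "bill of sale"))
print(phrase_match(index, "note of sale"))
```

Document 2 contains all three terms of "bill of sale" but not in consecutive positions, so it survives the document intersection and is only rejected in the position check. That second step is where high-ttf terms add cost.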

Re: Slow performance for phrases with terms with high ttf

2024-03-25 Thread Sjoerd Smeets
Thanks Doug! Do you think adding more shards would help in this case? Putting the index in memory is not really possible as the index is up to 2.5 TB. We have SSDs though, so that is the closest we can get. We have 16 CPUs and configured it for 4 shards. Would splitting it up into more shards potent

Re: Slow performance for phrases with terms with high ttf

2024-03-25 Thread Doug Turnbull
It could help, yeah, by parallelizing it. The tradeoff is that you'll only be as fast as your slowest shard (i.e. tail latency). So more shards mean one shard having a bad hair day, doing a GC, or something will increase the risk of slowing things down, and probably increase the variance in the ove
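A rough way to see the tail-latency tradeoff: if each shard is independently "slow" for a given request with some small probability p, a fan-out request is slow whenever at least one shard is. The sketch below assumes independence between shards and uses a made-up p of 1%; both are illustrative assumptions, not measurements from this thread, and `prob_any_shard_slow` is a hypothetical helper.

```python
def prob_any_shard_slow(p_slow: float, num_shards: int) -> float:
    """Probability that at least one of num_shards independent shards is slow.

    Assumes every shard is slow with the same probability p_slow and that
    shards are independent; real clusters may violate both assumptions.
    """
    if not 0.0 <= p_slow <= 1.0:
        raise ValueError("p_slow must be between 0 and 1")
    if num_shards < 1:
        raise ValueError("num_shards must be at least 1")
    return 1.0 - (1.0 - p_slow) ** num_shards


# Hypothetical per-shard chance of a slow response (GC pause, cold page, ...).
P_SLOW = 0.01
for shards in (1, 4, 8, 16):
    print(f"{shards:>2} shards: {prob_any_shard_slow(P_SLOW, shards):.4f}")
```

Under these assumptions the chance of hitting a slow shard rises with the shard count, even though each shard has less data to search. Whether the parallelism gain outweighs it is something only measurement on the real index can answer.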

Re: Slow performance for phrases with terms with high ttf

2024-03-25 Thread Sjoerd Smeets
Thanks Doug, we'll experiment and let you know how it went. On Mon, Mar 25, 2024 at 3:59 PM Doug Turnbull wrote: > It could help yeah with parallelizing it. With the tradeoff that you'll > only be as fast as your slowest shard (ie tail latency). So more shards > mean one shard having a bad hair

solr9.5.0/solrj9.5.0 bugs in shard request

2024-03-25 Thread Yue Yu
Hello, I found an issue in Solr 9.5.0/SolrJ 9.5.0 regarding shard requests. Currently, multi-shard requests are sent through Http2SolrClient, and this function composes the actual Jetty Request object: > private Request fillContentStream( Request req, Collection > streams, ModifiableSolrParams

edismax boost query(bq) with local params syntax

2024-03-25 Thread rajani m
Hi Solr Users, Could you help me with the bq syntax that supports boosting a term with a caret? Given the following boost query, I need to multiply the payload value by 10. bq={!payload_score f=field_name v

Re: edismax boost query(bq) with local params syntax

2024-03-25 Thread rajani m
OK, I figured it out: the syntax bq= _query_:"" AND _val_:"" seems to be working. On Mon, Mar 25, 2024 at 1:44 PM rajani m wrote: > Hi Solr Users, > > Could you help me with the bq syntax that supports boosting a term with > caret >