Re: Block MAX WAND feature use

2024-02-15 Thread Tomás Fernández Löbbe
One way to influence ranking while still allowing the optimization is to use Rank Fields[1]. Multiple-field queries should be OK. I don't remember off the top of my head whether DisMax queries work; I believe they do, but then I don't know why you wouldn't be seeing an improvement. [1] ht

Re: Block MAX WAND feature use

2024-02-15 Thread Mikhail Khludnev
I don't know exactly. It might be sum, product, or any other combining function. Another thought: the minExactCount optimization always returns the top-scoring hits and only skips weaker matches. But if you introduce rescoring after extracting the BM25 top hits, it loses precision: think about a top-rated doc, which has fe

Re: Block MAX WAND feature use

2024-02-15 Thread rajani m
If the boosts are multiple function queries such as the following[1], would the boost query then be a sum function wrapping them? I missed that one. [1] "sum(product(popularity,2),1.0)" and "recip(ms(NOW/HOUR,date),3.163e-11,1,1)" I will post the question on the dev channel regarding wh
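To make the two boost functions quoted above concrete, here is a minimal arithmetic sketch in Python. It relies on Solr's documented definition recip(x,m,a,b) = a / (m*x + b). The helper names, the sample popularity of 3, and the one-year document age are illustrative assumptions, and the NOW/HOUR rounding is ignored. Whether the two boosts end up summed or multiplied depends on how they are passed to the parser, which is exactly the open question in this message, so both combinations are shown.

```python
# Milliseconds in a 365-day year; 3.163e-11 is roughly 1 / this value.
MS_PER_YEAR = 365 * 24 * 60 * 60 * 1000


def recip(x, m, a, b):
    """Mirror of Solr's recip(x,m,a,b) = a / (m*x + b)."""
    return a / (m * x + b)


def popularity_boost(popularity):
    """Equivalent of sum(product(popularity,2),1.0)."""
    return popularity * 2 + 1.0


def date_boost(age_ms):
    """Equivalent of recip(ms(NOW/HOUR,date),3.163e-11,1,1) for a given age."""
    return recip(age_ms, 3.163e-11, 1, 1)


# Hypothetical document: popularity 3, exactly one year old.
pop = popularity_boost(3)
recency = date_boost(MS_PER_YEAR)

# Two possible ways of combining the boosts (assumption: sum or product).
combined_sum = pop + recency
combined_product = pop * recency
```

With these constants a brand-new document gets a date boost of 1.0 and a one-year-old document about half of that, so the date term decays slowly compared with the popularity term.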

Re: Block MAX WAND feature use

2024-02-14 Thread Mikhail Khludnev
Hello, Please check inline below. On Thu, Feb 15, 2024 at 2:11 AM rajani m wrote: > Yes, rerank works as an alternative, but the rerank only supports one boost > query, correct? If there are multiple boost conditions such as boost by > date, season and popularity, putting all of them into one co

Re: Block MAX WAND feature use

2024-02-14 Thread rajani m
Yes, rerank works as an alternative, but rerank only supports one boost query, correct? If there are multiple boost conditions, such as boost by date, season and popularity, putting all of them into one complex boost query is a hard problem; rerank by LTR can help. Thank you for that pointer.

Re: Block MAX WAND feature use

2024-02-14 Thread Mikhail Khludnev
Cool. Btw, can you rerank results with the corresponding boost query? On Wed, Feb 14, 2024 at 8:46 PM rajani m wrote: > Mikhail, > > Thanks for that pointer to test with a simple query. It works perfectly > with lucene query parser, I see qtime drop by 7 times with this param. > > With edismax
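A minimal sketch of the rerank idea in Python, assuming Solr's standard rerank query parser (`rq={!rerank ...}`) and a function query as the rerank query. The helper name, the sample query text, and the `reRankDocs` / `reRankWeight` values are illustrative assumptions; the boost function is the one quoted elsewhere in this thread. The main query stays free of boosts, and the boost is applied only to the top hits.

```python
from urllib.parse import urlencode


def rerank_params(main_query, boost_function, rerank_docs=1000, rerank_weight=1.0):
    """Build request params that rerank the top hits with a function query."""
    return {
        # Plain main query with no boost/bf params.
        "q": main_query,
        # The rerank parser rescoring only the top `rerank_docs` hits.
        "rq": f"{{!rerank reRankQuery=$rqq reRankDocs={rerank_docs} "
              f"reRankWeight={rerank_weight}}}",
        # The boost expressed as a function query (dereferenced via $rqq).
        "rqq": "{!func}" + boost_function,
    }


params = rerank_params("wireless headphones", "sum(product(popularity,2),1.0)")
query_string = urlencode(params)
```

Several boost conditions would still have to be folded into the single `rqq` expression, which is the limitation raised in the reply to this message.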

Re: Block MAX WAND feature use

2024-02-14 Thread rajani m
Mikhail, Thanks for that pointer to test with a simple query. It works perfectly with the lucene query parser: I see QTime drop by a factor of 7 with this param. With an edismax query, it works with certain caveats: "qf" (query fields) must have only one field and the query must not have boost/bf param

Re: Block MAX WAND feature use

2024-02-06 Thread rajani m
> With a 400M index it's worth experimenting with skipping about a million of docs. Is there a param that allows setting how many docs to skip? There is "minExactCount", which decides how many docs it should care to score; I tested that with 100, 1000 and 2000, and latency only increased. Alessandro

Re: Block MAX WAND feature use

2024-02-05 Thread Alessandro Benedetti
It would be interesting to see the level of fragmentation of each index indeed, i.e. how many segments per core in a collection. On Tue, 6 Feb 2024, 06:59 Mikhail Khludnev, wrote: > 200-300 docs might be too few to get significant gain. With a 400M index > it's worth experimenting with skippin

Re: Block MAX WAND feature use

2024-02-05 Thread Mikhail Khludnev
200-300 docs might be too few to get a significant gain. With a 400M index it's worth experimenting with skipping about a million docs. By simplified params I mean defType=lucene&df=description. debugQuery might expose some details as well. As far as I understand this feature works with large segm

Re: Block MAX WAND feature use

2024-02-05 Thread rajani m
The "numFound" value differs by 200-300 docs compared to the query without the "minExactCount" param. The collection has over 400M records, so the feature is being tested on a large collection. The numFoundExact value in the response is consistently false, which tells me the feature is functioning but
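The check described above can be sketched in Python as follows. This assumes Solr's JSON response format, where `numFound` and `numFoundExact` sit under the top-level `response` object. The helper name and the sample body are made up for illustration and are not output from the collection discussed here.

```python
import json


def hit_count(response_body):
    """Return (numFound, numFoundExact) from a Solr JSON response string."""
    resp = json.loads(response_body)["response"]
    # Assumption: responses without the flag (older versions) count exactly.
    exact = resp.get("numFoundExact", True)
    return resp["numFound"], exact


# Hand-written sample body, not real output.
sample = '{"response": {"numFound": 153, "numFoundExact": false, "start": 0, "docs": []}}'
count, exact = hit_count(sample)

if not exact:
    # Solr stopped counting early, so numFound is only approximate.
    print(f"approximate hit count: {count}")
```

A false `numFoundExact` only shows that exact counting was skipped; it says nothing by itself about how many documents were skipped during scoring.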

Re: Block MAX WAND feature use

2024-02-05 Thread Mikhail Khludnev
Hello, How many matches do you have in both cases? I see there's a second sorting expression; it might not comply with the requirements. I'd rather start from a simple single query parser, just for the experiments. Note: I never tried it myself. On Mon, Feb 5, 2024 at 6:20 PM rajani m wrote: >

Re: Block MAX WAND feature use

2024-02-05 Thread rajani m
I ran performance tests with different query sets and the results do not look good: it adds around 15% to the latency instead of reducing or even matching it. Not sure if I am missing something in the config or if it is an issue. Here is an example query *without* the WAND query parameter select?&fl=id,e

Re: Block MAX WAND feature use

2024-02-01 Thread rajani m
Hi Ishan, I have looked into that doc, and it looks like the Solr version has to be >8.8 and the config needed is to add the query parameter "&minExactCount=k", where k is 10 or 100 depending on the accuracy of the first k docs. I ran a query performance test using an internal tool, with k set t
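A minimal sketch in Python of attaching the parameter to a request, for side-by-side comparison with a baseline. The host, the collection name (`products`), the sample query text, and the helper name are hypothetical. `defType=lucene` and `df=description` follow the simplified params suggested later in the thread, and only `minExactCount` itself comes from the linked documentation.

```python
from urllib.parse import urlencode

# Hypothetical endpoint; replace host and collection with your own.
SOLR_SELECT = "http://localhost:8983/solr/products/select"


def build_select_url(query, min_exact_count=None, rows=10):
    """Return a /select URL, optionally adding minExactCount."""
    params = {"q": query, "defType": "lucene", "df": "description", "rows": rows}
    if min_exact_count is not None:
        # Lets Solr stop counting hits exactly after this many matches.
        params["minExactCount"] = min_exact_count
    return SOLR_SELECT + "?" + urlencode(params)


# Same query with and without the parameter.
baseline_url = build_select_url("wireless headphones")
wand_url = build_select_url("wireless headphones", min_exact_count=100)
```

Running both URLs over the same query set keeps everything else constant, so any latency difference can be attributed to the parameter.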

Re: Block MAX WAND feature use

2024-01-31 Thread Ishan Chattopadhyaya
Is it possible to benchmark the query performance across a larger set of queries? You can leverage Solr Bench, if needed. https://github.com/fullstorydev/solr-bench On Thu, 1 Feb, 2024, 11:20 am Ishan Chattopadhyaya, < ichattopadhy...@gmail.com> wrote: > Some documentation is here > https://solr.

Re: Block MAX WAND feature use

2024-01-31 Thread Ishan Chattopadhyaya
Some documentation is here https://solr.apache.org/guide/8_6/common-query-parameters.html#minexactcount-parameter On Thu, 1 Feb, 2024, 9:53 am rajani m, wrote: > Hi All, > > To leverage the query time improvements that come with the Block MAX WAND > feature, what are the required configuration