A difference of 200-300 docs in numFound is probably too small to give a significant gain. With a 400M-doc index it's worth experimenting with queries where about a million docs can be skipped. By simplified params I mean something like defType=lucene&df=description. debugQuery=true might expose some details as well. As far as I understand, this feature pays off on large segments, since it skips blocks within a segment rather than whole segments (?).
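To make the suggested experiment concrete, here is a minimal Python sketch that builds the simplified request with and without minExactCount and reads QTime, numFound and numFoundExact from the JSON response. The base URL and collection name are hypothetical placeholders, and the helper names (build_params, select_url, fetch_stats) are made up for illustration. debugQuery is left off by default because it can add overhead to the timings.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen


def build_params(query, min_exact_count=None, debug=False):
    """Build simplified select params: single parser, single field, score sort only."""
    params = {
        "q": query,
        "defType": "lucene",      # simple parser instead of edismax
        "df": "description",      # single default field
        "q.op": "OR",
        "sort": "score desc",     # no secondary sort
        "fl": "id,ext_id",
        "rows": 10,
        "wt": "json",
    }
    if min_exact_count is not None:
        params["minExactCount"] = min_exact_count
    if debug:
        params["debugQuery"] = "true"
    return params


def select_url(base_url, params):
    """Join a collection base URL (assumed, e.g. http://host:8983/solr/coll) with params."""
    return f"{base_url}/select?{urlencode(params)}"


def fetch_stats(url):
    """Run the query and return (QTime, numFound, numFoundExact)."""
    with urlopen(url) as resp:
        body = json.load(resp)
    header, response = body["responseHeader"], body["response"]
    return header["QTime"], response["numFound"], response.get("numFoundExact")


# Example usage against a hypothetical collection (not executed here):
# base = "http://localhost:8983/solr/mycollection"
# for k in (None, 10):
#     print(k, fetch_stats(select_url(base, build_params("white flowers card", k))))
```

Running each variant several times over a larger query set, and comparing only after caches are warm, should give a fairer picture than a single request.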
On Mon, Feb 5, 2024 at 8:04 PM rajani m <rajinima...@gmail.com> wrote:

> The "numFound" value is 200-300 docs difference when compared to the query
> without "minExactFound" param. The collection has over 400m records so
> testing the feature on a large collection. The numFoundExact param in the
> response is consistently false which tells me the feature is functioning
> but the results(qtime) are just off, not as expected.
>
> Would a type of query parser matter? I tested without the secondary sort,
> even without it there is no improvement in the query time latency and is
> still more than the query without this param.
>
> On Mon, Feb 5, 2024 at 10:34 AM Mikhail Khludnev <m...@apache.org> wrote:
>
> > Hello,
> > How many matches do you have in both cases?
> > I see there's a second sorting expression, it might not comply with the
> > requirements.
> > I'd rather start from the simple single query parser, just for the
> > experiments.
> > Note: I never tried it myself.
> >
> > On Mon, Feb 5, 2024 at 6:20 PM rajani m <rajinima...@gmail.com> wrote:
> >
> > > I ran performance tests with different query sets and the results look
> > > no good, it is adding to the latency around ~15% instead of reducing or
> > > even matching. Not sure if I am missing something in the config or it
> > > is an issue.
> > >
> > > Here is an example query *without* WAND query parameter
> > > select?&fl=id,ext_id&start=0&q.op=OR&sort=score desc,ext_id asc&rows=10&q=white flowers card&defType=edismax&qf=keywords description title
> > > vs
> > > *With* WAND query parameter
> > > select?&fl=id,ext_id&start=0&q.op=OR&sort=score desc,ext_id asc&rows=10&q=white flowers card&defType=edismax&qf=keywords description title*&minExactCount=10*
> > >
> > > On Thu, Feb 1, 2024 at 8:36 AM rajani m <rajinima...@gmail.com> wrote:
> > >
> > > > Hi Ishan,
> > > > I have looked into that doc, and it looks like the solr version has
> > > > to be >8.8 and the config needed is to add the query parameter
> > > > "&minExactCount=k" where k is 10 or 100 depending on the accuracy of
> > > > the first k docs.
> > > >
> > > > I ran a query performance test using an internal tool, with k set to
> > > > 10 and 100, which barely showed any difference in query time latency,
> > > > I didn't expect that so I was wondering if there is any configuration
> > > > I missed.
> > > >
> > > > I will run a couple more tests with different query sets meanwhile
> > > > and dig further into implementation of the feature to see if I am
> > > > missing any config here. Appreciate any suggestions.
> > > >
> > > > Thanks,
> > > > Rajani
> > > >
> > > > On Thu, Feb 1, 2024 at 12:53 AM Ishan Chattopadhyaya <
> > > > ichattopadhy...@gmail.com> wrote:
> > > >
> > > >> Is it possible to benchmark the query performance across a larger
> > > >> set of queries? You can leverage Solr Bench, if needed.
> > > >> https://github.com/fullstorydev/solr-bench
> > > >>
> > > >> On Thu, 1 Feb, 2024, 11:20 am Ishan Chattopadhyaya, <
> > > >> ichattopadhy...@gmail.com> wrote:
> > > >>
> > > >> > Some documentation is here
> > > >> > https://solr.apache.org/guide/8_6/common-query-parameters.html#minexactcount-parameter
> > > >> >
> > > >> > On Thu, 1 Feb, 2024, 9:53 am rajani m, <rajinima...@gmail.com>
> > > >> > wrote:
> > > >> >
> > > >> >> Hi All,
> > > >> >>
> > > >> >> To leverage the query time improvements that come with the Block
> > > >> >> MAX WAND feature, what are the required configurations?
> > > >> >>
> > > >> >> I am on solr 9.1.1 version. As per docs, including
> > > >> >> "minExactCount=100" query param should do it, however I don't see
> > > >> >> any drop in query time, it is more or less the same. Am I missing
> > > >> >> something?
> > > >> >>
> > > >> >> The queries I tested with are standard ones with edismax as query
> > > >> >> parser and query text is converted into boolean clauses and query
> > > >> >> has 2 boost params by date and popularity field. I included the
> > > >> >> "minExactCount" set to as low as 10 and 100 and increased to 1000
> > > >> >> but didn't see key change in query time, it was about the same.
> > > >> >>
> > > >> >> Would including boost or use of edismax parser not benefit with
> > > >> >> block MAX WAND? Example query
> > > >> >> /select?q=((white) AND (roses OR jasmine))&defType=edismax&qf=keywords description title&pf2=title&bf=recip(ms(NOW,datefield),3.16e-11,1,1)^2.0
> > > >> >>
> > > >> >> Thank you,
> > > >> >> Rajani
> >
> > --
> > Sincerely yours
> > Mikhail Khludnev

--
Sincerely yours
Mikhail Khludnev