A difference of 200-300 docs might be too small to yield a significant gain.
With a 400M-doc index it's worth experimenting with skipping about a million docs.
By simplified params I mean defType=lucene&df=description. debugQuery might
expose some details as well.
As far as I understand, this feature works with large segments, since it
skips a block within a segment, not a whole segment (?).
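
A minimal sketch of such an experiment, assuming a hypothetical endpoint (http://localhost:8983/solr/mycollection) and Solr's JSON response format. The helper names are made up for illustration. It builds the simplified query with and without minExactCount and pulls out QTime, numFound and numFoundExact so the two runs can be compared:

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen

# Hypothetical endpoint: replace host and collection name with your own.
SOLR_SELECT = "http://localhost:8983/solr/mycollection/select"


def build_params(q, min_exact_count=None, debug=False):
    """Simplified single-parser query: defType=lucene, df=description, score sort only."""
    params = {
        "q": q,
        "defType": "lucene",
        "df": "description",
        "q.op": "OR",
        "sort": "score desc",
        "rows": 10,
        "fl": "id",
        "wt": "json",
    }
    if min_exact_count is not None:
        params["minExactCount"] = min_exact_count
    if debug:
        # debugQuery adds overhead, so leave it off when measuring QTime.
        params["debugQuery"] = "true"
    return params


def summarize(body):
    """Extract the values discussed in this thread from a parsed JSON response."""
    return {
        "qtime_ms": body["responseHeader"]["QTime"],
        "num_found": body["response"]["numFound"],
        # Assumed default: treat a missing numFoundExact as an exact count.
        "num_found_exact": body["response"].get("numFoundExact", True),
    }


def run(q, min_exact_count=None, debug=False):
    """Send one query to Solr and return the summarized response."""
    url = SOLR_SELECT + "?" + urlencode(build_params(q, min_exact_count, debug))
    with urlopen(url) as resp:
        return summarize(json.load(resp))
```

Calling run("white flowers card") and run("white flowers card", min_exact_count=10) several times each, after warming the caches, should show whether QTime drops once numFoundExact turns false.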

On Mon, Feb 5, 2024 at 8:04 PM rajani m <rajinima...@gmail.com> wrote:

> The "numFound" value differs by 200-300 docs compared to the query
> without the "minExactCount" param. The collection has over 400M records, so
> I am testing the feature on a large collection. The numFoundExact value in
> the response is consistently false, which tells me the feature is
> functioning, but the results (QTime) are just off, not as expected.
>
> Would the type of query parser matter? I tested without the secondary sort;
> even without it there is no improvement in query latency, and it is
> still higher than for the query without this param.
>
>
>
> On Mon, Feb 5, 2024 at 10:34 AM Mikhail Khludnev <m...@apache.org> wrote:
>
> > Hello,
> > How many matches do you have in both cases?
> > I see there's a second sorting expression, it might not comply with the
> > requirements.
> > I'd rather start from the simple single query parser, just for the
> > experiments.
> > Note: I never tried it myself.
> >
> > On Mon, Feb 5, 2024 at 6:20 PM rajani m <rajinima...@gmail.com> wrote:
> >
> > > I ran performance tests with different query sets and the results look
> > > no good: it is adding around 15% to the latency instead of reducing or
> > > even matching it. Not sure if I am missing something in the config or
> > > whether it is an issue.
> > >
> > > Here is an example query *without* the WAND query parameter:
> > > select?&fl=id,ext_id&start=0&q.op=OR&sort=score desc,ext_id asc&rows=10&q=white flowers card&defType=edismax&qf=keywords description title
> > > vs.
> > > *with* the WAND query parameter:
> > > select?&fl=id,ext_id&start=0&q.op=OR&sort=score desc,ext_id asc&rows=10&q=white flowers card&defType=edismax&qf=keywords description title&minExactCount=10
> > >
> > > On Thu, Feb 1, 2024 at 8:36 AM rajani m <rajinima...@gmail.com> wrote:
> > >
> > > > Hi Ishan,
> > > >    I have looked into that doc, and it looks like the Solr version has
> > > > to be >8.8 and the config needed is to add the query parameter
> > > > "&minExactCount=k", where k is 10 or 100 depending on the accuracy of
> > > > the first k docs.
> > > >
> > > > I ran a query performance test using an internal tool, with k set to 10
> > > > and 100, which barely showed any difference in query latency. I
> > > > didn't expect that, so I was wondering if there is any configuration I
> > > > missed.
> > > >
> > > > I will run a couple more tests with different query sets meanwhile and
> > > > dig further into the implementation of the feature to see if I am
> > > > missing any config here. I'd appreciate any suggestions.
> > > >
> > > > Thanks,
> > > > Rajani
> > > >
> > > > On Thu, Feb 1, 2024 at 12:53 AM Ishan Chattopadhyaya <
> > > > ichattopadhy...@gmail.com> wrote:
> > > >
> > > >> Is it possible to benchmark the query performance across a larger set
> > > >> of queries? You can leverage Solr Bench, if needed.
> > > >> https://github.com/fullstorydev/solr-bench
> > > >>
> > > >> On Thu, 1 Feb, 2024, 11:20 am Ishan Chattopadhyaya, <
> > > >> ichattopadhy...@gmail.com> wrote:
> > > >>
> > > >> > Some documentation is here:
> > > >> > https://solr.apache.org/guide/8_6/common-query-parameters.html#minexactcount-parameter
> > > >> >
> > > >> > On Thu, 1 Feb, 2024, 9:53 am rajani m, <rajinima...@gmail.com> wrote:
> > > >> >
> > > >> >> Hi All,
> > > >> >>
> > > >> >>   To leverage the query time improvements that come with the Block
> > > >> >> MAX WAND feature, what are the required configurations?
> > > >> >>
> > > >> >> I am on Solr 9.1.1. As per the docs, including the
> > > >> >> "minExactCount=100" query param should do it; however, I don't see
> > > >> >> any drop in query time, it is more or less the same. Am I missing
> > > >> >> something?
> > > >> >>
> > > >> >> The queries I tested with are standard ones with edismax as the
> > > >> >> query parser; the query text is converted into boolean clauses and
> > > >> >> the query has 2 boost params, by date and popularity field. I set
> > > >> >> "minExactCount" as low as 10 and 100 and increased it to 1000 but
> > > >> >> didn't see any significant change in query time; it was about the
> > > >> >> same.
> > > >> >>
> > > >> >> Would including a boost or using the edismax parser prevent any
> > > >> >> benefit from Block MAX WAND? Example query:
> > > >> >> /select?q=((white) AND (roses OR jasmine))&defType=edismax&qf=keywords description title&pf2=title&bf=recip(ms(NOW,datefield),3.16e-11,1,1)^2.0
> > > >> >>
> > > >> >>
> > > >> >> Thank you,
> > > >> >> Rajani
> > > >> >>
> > > >> >
> > > >>
> > > >
> > >
> >
> >
> > --
> > Sincerely yours
> > Mikhail Khludnev
> >
>


-- 
Sincerely yours
Mikhail Khludnev
