>If you're set up to try running a patched version on your data, I'm curious to 
>know if this will help.
I'll be happy to do this.

>But maybe it's not so much a magic threshold as arbitrary, and specific to the 
>data you're evaluating over.
Well, I tested this case on the collection you remembered, the one with
a large number of fields (564133 at the moment) and more documents
(~68 million). Both the number of documents and their content differ
significantly from the collection I tested previously. I was quickly
able to reproduce the problem with the magic number 50 (51), although
it is less pronounced than in the previous collection. I confirmed this
with every cardinality and value distribution I tried, using hey (I’m
quite sure it will reproduce with any other benchmark tool). QTime for
a single query differed little or not at all, but under load the
difference grows significantly (it is still easier to reproduce on fl=
data with low cardinality and a highly uneven distribution). For
example:
Huge cardinality, values almost completely unique:
./hey_linux_amd64 -n 10000 -c 5 -T "application/json"
'http://localhost:8983/solr/col/select?fl=fld1&wt=json&q=fld1:"fld1value"&start=0&rows=50'
  Slowest:      0.0024 secs
  Fastest:      0.0009 secs
  Average:      0.0013 secs
  Requests/sec: 3768.1874

./hey_linux_amd64 -n 10000 -c 5 -T "application/json"
'http://localhost:8983/solr/col/select?fl=fld1&wt=json&q=fld1:"fld1value"&start=0&rows=51'
  Slowest:      0.0018 secs
  Fastest:      0.0007 secs
  Average:      0.0009 secs
  Requests/sec: 5620.4994

Just 1.5x diff


"fld2":["v1",30501964,"v2",4202177,"v3",210886] :
./hey_linux_amd64 -n 10000 -c 5 -T "application/json"
'http://localhost:8983/solr/col/select?fl=fld2&wt=json&q=fld2:"v3"&start=0&rows=50'
  Slowest:      0.1198 secs
  Fastest:      0.0013 secs
  Average:      0.0019 secs
  Requests/sec: 2641.0227

./hey_linux_amd64 -n 10000 -c 5 -T "application/json"
'http://localhost:8983/solr/col/select?fl=fld2&wt=json&q=fld2:"v3"&start=0&rows=51'
  Slowest:      0.0051 secs
  Fastest:      0.0003 secs
  Average:      0.0003 secs
  Requests/sec: 14610.4688

./hey_linux_amd64 -n 10000 -c 5 -T "application/json"
'http://localhost:8983/solr/col/select?fl=fld2&wt=json&q=fld2:"v1"&start=0&rows=50'
  Slowest:      0.0059 secs
  Fastest:      0.0008 secs
  Average:      0.0010 secs
  Requests/sec: 4795.5539

./hey_linux_amd64 -n 10000 -c 5 -T "application/json"
'http://localhost:8983/solr/col/select?fl=fld2&wt=json&q=fld2:"v1"&start=0&rows=51'
  Slowest:      0.0010 secs
  Fastest:      0.0003 secs
  Average:      0.0003 secs
  Requests/sec: 14726.7978

Roughly 3-6x diff
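
For reference, the ratios can be recomputed directly from the
Requests/sec figures reported by hey above. This is only a small
arithmetic sketch: the numbers are copied from the runs above, and the
names `runs` and `speedup` are made up for illustration.

```python
# Requests/sec copied from the hey runs above, as (rows=50, rows=51).
runs = {
    'fld1:"fld1value"': (3768.1874, 5620.4994),
    'fld2:"v3"': (2641.0227, 14610.4688),
    'fld2:"v1"': (4795.5539, 14726.7978),
}


def speedup(rows50_rps: float, rows51_rps: float) -> float:
    """Return how many times more requests/sec rows=51 sustains than rows=50."""
    return rows51_rps / rows50_rps


# One ratio per query, keyed by the q= value used in the benchmark.
ratios = {query: speedup(*rps) for query, rps in runs.items()}

for query, ratio in ratios.items():
    print(f"{query}: rows=51 handles {ratio:.2f}x the requests/sec of rows=50")
```

The same comparison could be made on the Average latency lines instead;
throughput is used here only because hey reports it with more digits.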

On Fri, Jun 21, 2024 at 4:59 PM Michael Gibney
<mich...@michaelgibney.net> wrote:
>
> Interesting! If turning off lazy field loading helps, I think I have a
> trivial patch that may fix this (i.e. without requiring the workaround
> of disabling lazy field loading -- which, as you say, makes no sense
> to have in effect without the documentCache). The only thing that had
> been stopping me from suggesting this patch right off the bat was the
> "magic" threshold of 50, which I couldn't explain at all. But maybe
> it's not so much a magic threshold as arbitrary, and specific to the
> data you're evaluating over. I'll open an issue/PR more narrowly
> scoped to the change. I'd say you could open the issue, except I still
> don't fully understand the connection between the change I'm
> considering and the behavior you're seeing -- just that they seem very
> likely to be connected. If you're set up to try running a patched
> version on your data, I'm curious to know if this will help.
>
> On Thu, Jun 20, 2024 at 6:16 PM Oleksandr Tkachuk <sasha547...@gmail.com> 
> wrote:
> >
> > FYI: There is a solution in the last paragraph, but I still ran your
> > tests, since the solution was found by "Cut and Try"  and there is no
> > deep understanding.
> >
> > >I wonder what would happen if you fully bypassed the query cache (i.e., 
> > >`q={!cache=false}product_type:"1"`)?
> > It does not help, there is not even one millisecond of difference in both 
> > cases.
> >
> > >I recall that previously you had a very large number of dynamic fields. Is 
> > >that the case here as well? And if so, are the dynamic fields mostly 
> > >stored? docValues?
> > This is another collection, I’ll get to the one with many many fields later 
> > :))
> > If this is the ~correct way to count the number of fields, then this
> > collection has the following number of fields:
> > curl -s "http://localhost:8983/solr/XXX/admin/luke?numTerms=0" | grep
> > '"type"' | wc -l
> > 121
> > Of these, 88 have docvalues enabled and 33 stored.
> >
> > As for the two fields used in query, here's how they are defined in the 
> > schema.
> >   <field name="product_id" type="plong" indexed="true" stored="true"/>
> >   <field name="product_type" type="pint" indexed="true" stored="false"/>
> >   <fieldType name="pint" class="solr.IntPointField" docValues="true"/>
> >   <fieldType name="plong" class="solr.LongPointField" docValues="true"/>
> >
> > Changing fl= to something like a string field with stored=true without
> > docvalues results in zero changes.
> > I also tried this simple query on string type fields (copying the
> > field) and got the same result. I also tried it on fields where the
> > cardinality was different - the spread was not 150 times, but also
> > often noticeable. In addition, I still do not fully understand the
> > logic of this behavior
> > ("product_type":["3",1069282,"2",710042,"1",13702]) if I do:
> > 1) q=product_type:"1" rows=50 - qtime 150ms
> > 2) q=product_type:"1" rows=51 - qtime 0ms
> > 3) q=product_type:"2" rows=50 - qtime 3ms
> > 4) q=product_type:"2" rows=51 - qtime 0ms
> > 5) q=product_type:"3" rows=50 - qtime 1ms
> > 6) q=product_type:"3" rows=51 - qtime 0ms
> > I checked on other fields and get the same behavior - the fewer
> > documents contain a given value, the slower the query becomes.
> > If I can provide any more information, I will be glad.
> >
> > The problem was solved by turning off enableLazyFieldLoading. I am
> > very surprised that this functionality continues to work when document
> > cache is disabled and I thought that this parameter was intended only
> > for it. In addition, we received an improvement in avg and 95% on many
> > other types of queries, as well as some reduction in CPU load. Are
> > there any consequences or disadvantages of such a decision? If not,
> > then perhaps it is worth paying attention to this problem.
> >
> > On Thu, Jun 20, 2024 at 10:13 PM Michael Gibney
> > <mich...@michaelgibney.net> wrote:
> > >
> > > I've been unable to reproduce anything like this behavior. If you're
> > > really getting queryResultCache hits for these, then the field
> > > type/etc of the field you're querying on shouldn't make a difference.
> > > The type/etc of the return field (product_id) would be more likely to
> > > matter. I wonder what would happen if you fully bypassed the query
> > > cache (i.e., `q={!cache=false}product_type:"1"`)?
> > >
> > > I recall that previously you had a very large number of dynamic
> > > fields. Is that the case here as well? And if so, are the dynamic
> > > fields mostly stored? docValues?
> > >
> > >
> > >
> > > On Fri, Jun 14, 2024 at 7:29 AM Oleksandr Tkachuk <sasha547...@gmail.com> 
> > > wrote:
> > > >
> > > > Initial data:
> > > > Doc count: 1793026
> > > > Field: "product_type", point int, indexed true, stored false,
> > > > docvalues true. Values:
> > > >  "facet_fields":{
> > > >       "product_type":["3",1069282,"2",710042,"1",13702]
> > > >     },
> > > > Single shard, single instance.
> > > >
> > > > # ./hey_linux_amd64 -n 10000 -c 10 -T "application/json"
> > > > 'http://localhost:8983/solr/XXX/select?fl=product_id&wt=json&q=product_type:"1"&start=0&rows=51'
> > > > Summary:
> > > >   Total:        0.6374 secs
> > > >   Slowest:      0.0043 secs
> > > >   Fastest:      0.0003 secs
> > > >   Average:      0.0006 secs
> > > >   Requests/sec: 15688.5755
> > > >
> > > > # ./hey_linux_amd64 -n 10000 -c 10 -T "application/json"
> > > > 'http://localhost:8983/solr/XXX/select?fl=product_id&wt=json&q=product_type:"1"&start=0&rows=50'
> > > > Summary:
> > > >   Total:        101.3246 secs
> > > >   Slowest:      0.2048 secs
> > > >   Fastest:      0.0564 secs
> > > >   Average:      0.1007 secs
> > > >   Requests/sec: 98.6927
> > > >
> > > >
> > > > 1) I've already played with queryResultWindowSize and
> > > > queryResultMaxDocsCached by setting different, high and low values and
> > > > this is probably not what I'm looking for since it gave a <few
> > > > milliseconds difference in query performance
> > > > 2) Checked on different versions of solr (9.6.1 and 8.7.0) - no
> > > > significant changes
> > > > 3) Tried changing the field type to string - zero performance changes
> > > > 4) In both cases I see successful lookups in queryResultCache
> > > > 5) Enabling documentCache solves the problem in this case (rows<=50),
> > > > but introduces many other performance issues so it doesn't seem like a
> > > > viable option.
