FYI: There is a solution in the last paragraph, but I still ran your tests, since the solution was found by "cut and try" (trial and error) and I still have no deep understanding of why it works.

> I wonder what would happen if you fully bypassed the query cache (i.e.,
> `q={!cache=false}product_type:"1"`?

It does not help; there is not even one millisecond of difference in either case.

> I recall that previously you had a very large number of dynamic fields. Is
> that the case here as well? And if so, are the dynamic fields mostly stored?
> docValues?

This is another collection; I'll get to the one with many, many fields later :))
If this is roughly the correct way to count the number of fields, then this collection has the following number of fields:

    curl -s "http://localhost:8983/solr/XXX/admin/luke?numTerms=0" | grep '"type"' | wc -l
    121

Of these, 88 have docValues enabled and 33 are stored. As for the two fields used in the query, here is how they are defined in the schema:

    <field name="product_id" type="plong" indexed="true" stored="true"/>
    <field name="product_type" type="pint" indexed="true" stored="false"/>
    <fieldType name="pint" class="solr.IntPointField" docValues="true"/>
    <fieldType name="plong" class="solr.LongPointField" docValues="true"/>

Changing fl= to something like a string field with stored=true and without docValues changes nothing. I also tried this simple query on string type fields (copying the field) and got the same result. I also tried it on fields with different cardinality - the spread was not 150 times, but it was often still noticeable.

In addition, I still do not fully understand the logic of this behavior ("product_type":["3",1069282,"2",710042,"1",13702]) if I do:

1) q=product_type:"1" rows=50 - qtime 150ms
2) q=product_type:"1" rows=51 - qtime 0ms
3) q=product_type:"2" rows=50 - qtime 3ms
4) q=product_type:"2" rows=51 - qtime 0ms
5) q=product_type:"3" rows=50 - qtime 1ms
6) q=product_type:"3" rows=51 - qtime 0ms

I checked other fields and got the same behavior - the fewer documents contain a given value, the slower the query becomes. If I can provide any more information, I will be glad to.
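
The six measurements above could be repeated with a short script. This is only a minimal sketch: it assumes the same local endpoint and placeholder collection name (XXX) as the commands in this thread, the helper names (build_select_url, time_query, measure_all) are hypothetical, and it reports client-side wall-clock time rather than Solr's QTime.

```python
import time
import urllib.parse
import urllib.request

# Assumption: same host, port and placeholder collection (XXX) as in this thread.
BASE_URL = "http://localhost:8983/solr/XXX/select"


def build_select_url(value, rows, base_url=BASE_URL):
    """Build the select URL for q=product_type:"<value>" with the given rows."""
    params = {
        "fl": "product_id",
        "wt": "json",
        "q": f'product_type:"{value}"',
        "start": 0,
        "rows": rows,
    }
    return base_url + "?" + urllib.parse.urlencode(params)


def _http_fetch(url):
    """Default fetcher: issue the request and read the whole response body."""
    with urllib.request.urlopen(url) as resp:
        resp.read()


def time_query(url, fetch=None, repeats=20):
    """Return the average client-side time per request, in milliseconds."""
    if fetch is None:
        fetch = _http_fetch
    start = time.perf_counter()
    for _ in range(repeats):
        fetch(url)
    return (time.perf_counter() - start) * 1000 / repeats


def measure_all(values=("1", "2", "3"), rows_options=(50, 51), fetch=None):
    """Time every (value, rows) combination; returns {(value, rows): avg_ms}."""
    return {
        (value, rows): time_query(build_select_url(value, rows), fetch=fetch)
        for value in values
        for rows in rows_options
    }
```

Calling measure_all() against a live instance and printing the result gives the same 3x2 grid as the list above. Passing a stub as fetch makes it possible to check the URL construction without a running Solr.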

The problem was solved by turning off enableLazyFieldLoading. I am very surprised that this functionality still takes effect when the document cache is disabled; I thought this parameter was intended only for it. In addition, we saw an improvement in average and 95th-percentile response times on many other types of queries, as well as some reduction in CPU load.
Are there any consequences or disadvantages of such a decision? If not, then perhaps this problem deserves some attention.

On Thu, Jun 20, 2024 at 10:13 PM Michael Gibney <mich...@michaelgibney.net> wrote:
>
> I've been unable to reproduce anything like this behavior. If you're
> really getting queryResultCache hits for these, then the field
> type/etc of the field you're querying on shouldn't make a difference.
> type/etc of the return field (product_id) would be more likely to
> matter. I wonder what would happen if you fully bypassed the query
> cache (i.e., `q={!cache=false}product_type:"1"`?
>
> I recall that previously you had a very large number of dynamic
> fields. Is that the case here as well? And if so, are the dynamic
> fields mostly stored? docValues?
>
>
>
> On Fri, Jun 14, 2024 at 7:29 AM Oleksandr Tkachuk <sasha547...@gmail.com>
> wrote:
> >
> > Initial data:
> > Doc count: 1793026
> > Field: "product_type", point int, indexed true, stored false,
> > docvalues true. Values:
> > "facet_fields":{
> > "product_type":["3",1069282,"2",710042,"1",13702]
> > },
> > Single shard, single instance.
> >
> > # ./hey_linux_amd64 -n 10000 -c 10 -T "application/json"
> > 'http://localhost:8983/solr/XXX/select?fl=product_id&wt=json&q=product_type:"1"&start=0&rows=51'
> > Summary:
> > Total: 0.6374 secs
> > Slowest: 0.0043 secs
> > Fastest: 0.0003 secs
> > Average: 0.0006 secs
> > Requests/sec: 15688.5755
> >
> > # ./hey_linux_amd64 -n 10000 -c 10 -T "application/json"
> > 'http://localhost:8983/solr/XXX/select?fl=product_id&wt=json&q=product_type:"1"&start=0&rows=50'
> > Summary:
> > Total: 101.3246 secs
> > Slowest: 0.2048 secs
> > Fastest: 0.0564 secs
> > Average: 0.1007 secs
> > Requests/sec: 98.6927
> >
> >
> > 1) I've already played with queryResultWindowSize and
> > queryResultMaxDocsCached by setting different, high and low values and
> > this is probably not what I'm looking for since it gave a <few
> > milliseconds difference in query performance
> > 2) Checked on different versions of solr (9.6.1 and 8.7.0) - no
> > significant changes
> > 3) Tried changing the field type to string - zero performance changes
> > 4) In both cases I see successful lookups in queryResultCache
> > 5) Enabling documentCache solves the problem in this case (rows<=50),
> > but introduces many other performance issues so it doesn't seem like a
> > viable option.