>we'd see this performance distinction between 50/51 rows even with
>documentCache enabled -- and it seems we're not seeing that.

Yes, I didn’t see any difference in the benchmark (or I didn’t notice it because a more significant problem was masking it). However, correct me if I'm wrong: this logic still runs on the first request, before the entry exists in the document cache, and again whenever the entry has been evicted from it, right? If so, it could still add some nasty outliers to the percentiles in real use cases.
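A minimal sketch of why a small share of cache misses can matter for percentiles even when the average barely moves. The latencies are made-up illustrative values, not measurements from this thread, and `percentile` is a hypothetical helper using the nearest-rank method:

```python
def percentile(samples, p):
    """Nearest-rank percentile for an integer p in [0, 100]."""
    if not samples:
        raise ValueError("no samples")
    ordered = sorted(samples)
    # Integer ceiling of p * n / 100, clamped so that p = 0 returns the minimum.
    rank = max(1, (p * len(ordered) + 99) // 100)
    return ordered[rank - 1]


# Hypothetical latencies in milliseconds (assumed, NOT benchmark output):
# 980 requests answered from a warm documentCache, 20 that miss it.
warm = [1.0] * 980
cold = [100.0] * 20
latencies = warm + cold

mean = sum(latencies) / len(latencies)
p50 = percentile(latencies, 50)
p99 = percentile(latencies, 99)
print(f"mean={mean:.2f} ms  p50={p50} ms  p99={p99} ms")
```

With these assumed numbers the median stays at the warm latency while p99 lands on the cold one, which is the kind of effect an average-only benchmark summary would hide.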

On Tue, Jul 9, 2024 at 4:58 PM Michael Gibney <mich...@michaelgibney.net> wrote:
>
> I wondered the same thing, tbh. Looks like it was introduced with
> issue SOLR-52, so it might be worth re-evaluating. A lot has changed
> since then, I'd imagine!
>
> Still, if this were actively a problem (as opposed to just maybe not
> helping as intended), we'd see this performance distinction between
> 50/51 rows even with documentCache enabled -- and it seems we're not
> seeing that.
>
> On Tue, Jul 9, 2024 at 7:54 AM Oleksandr Tkachuk <sasha547...@gmail.com> wrote:
> >
> > I just tested your new pr and it helped. Thanks a lot.
> >
> > Little offtopic regarding
> > https://github.com/apache/solr/blob/aec6e8f750037fea5f8d01dc49dabf28bf512d68/solr/core/src/java/org/apache/solr/handler/component/QueryComponent.java#L568-L569
> > Just curious, has anyone ever tested the effectiveness of this thing?
> > Does it give at least one percent increase in performance?
> >
> > On Mon, Jul 8, 2024 at 10:39 PM Michael Gibney <mich...@michaelgibney.net> wrote:
> > >
> > > FYI: https://github.com/apache/solr/pull/2551
> > >
> > > On Mon, Jul 8, 2024 at 9:55 AM Michael Gibney <mich...@michaelgibney.net> wrote:
> > > >
> > > > Thanks for reporting back. Found the issue at last, including the
> > > > magic number! Will post a fix for this shortly.
> > > >
> > > > https://github.com/apache/solr/blob/aec6e8f750037fea5f8d01dc49dabf28bf512d68/solr/core/src/java/org/apache/solr/handler/component/QueryComponent.java#L568-L569
> > > >
> > > > On Mon, Jul 8, 2024 at 9:05 AM Oleksandr Tkachuk <sasha547...@gmail.com> wrote:
> > > > >
> > > > > Hello.
> > > > > Unfortunately it didn't help. Still a huge difference between 50 vs 51
> > > > > and disabling enableLazyFieldLoading in solrconfig.xml still helps.
> > > > >
> > > > > solr-impl 10.0.0-SNAPSHOT 011d713a884559e3efeaa69e4f3c8dd8e630ff22
> > > > > [snapshot build, details omitted]
> > > > > cat solr/core/src/java/org/apache/solr/search/SolrDocumentFetcher.java | head -370 | tail -13
> > > > >     SolrDocumentStoredFieldVisitor(Set<String> toLoad, IndexReader reader, int docId) {
> > > > >       super(toLoad);
> > > > >       this.docId = docId;
> > > > >       this.doc = getDocument();
> > > > >       if (documentCache == null) {
> > > > >         // lazy loading makes no sense if we don't have a `documentCache`
> > > > >         this.lazyFieldProducer = null;
> > > > >         this.addLargeFieldsLazily = false;
> > > > >       } else {
> > > > >         this.lazyFieldProducer =
> > > > >             toLoad != null && enableLazyFieldLoading ? new LazyDocument(reader, docId) : null;
> > > > >         this.addLargeFieldsLazily = !largeFields.isEmpty();
> > > > >       }
> > > > >
> > > > >
> > > > > On Wed, Jun 26, 2024 at 5:10 AM Michael Gibney <mich...@michaelgibney.net> wrote:
> > > > > >
> > > > > > FYI:
> > > > > > https://issues.apache.org/jira/browse/SOLR-17349
> > > > > > https://github.com/apache/solr/pull/2535
> > > > > >
> > > > > > I'm curious whether this helps!
> > > > > >
> > > > > > On Fri, Jun 21, 2024 at 3:08 PM Oleksandr Tkachuk <sasha547...@gmail.com> wrote:
> > > > > > >
> > > > > > > >If you're set up to try running a patched version on your data,
> > > > > > > >I'm curious to know if this will help.
> > > > > > > I'll be happy to do this.
> > > > > > >
> > > > > > > >But maybe it's not so much a magic threshold as arbitrary, and
> > > > > > > >specific to the data you're evaluating over.
> > > > > > > Well, I tested this case on the collection that you remembered, with a
> > > > > > > large number of fields (564133 at this moment) and more documents
> > > > > > > there (~68 million documents).
> > > > > > > The number of documents and their content are significantly different
> > > > > > > there from where I tested previously. And I can say that I was quickly
> > > > > > > able to reproduce the problem with magic number 50(51), although not as
> > > > > > > noticeable as in the previous one. I confirmed this on absolutely any
> > > > > > > cardinality and any variance using hey (I’m more than sure that it will
> > > > > > > be reproduced on any other benchmark). Although qtime did not differ
> > > > > > > visually or did not differ as much as we would like, with the intensity
> > > > > > > of queries the difference grows significantly (but still easier to
> > > > > > > reproduce on fl= data that has high unevenness and low cardinality),
> > > > > > > for example:
> > > > > > > Huge cardinality, values almost completely unique:
> > > > > > > ./hey_linux_amd64 -n 10000 -c 5 -T "application/json" 'http://localhost:8983/solr/col/select?fl=fld1&wt=json&q=fld1:"fld1value"&start=0&rows=50'
> > > > > > >   Slowest: 0.0024 secs
> > > > > > >   Fastest: 0.0009 secs
> > > > > > >   Average: 0.0013 secs
> > > > > > >   Requests/sec: 3768.1874
> > > > > > >
> > > > > > > ./hey_linux_amd64 -n 10000 -c 5 -T "application/json" 'http://localhost:8983/solr/col/select?fl=fld1&wt=json&q=fld1:"fld1value"&start=0&rows=51'
> > > > > > >   Slowest: 0.0018 secs
> > > > > > >   Fastest: 0.0007 secs
> > > > > > >   Average: 0.0009 secs
> > > > > > >   Requests/sec: 5620.4994
> > > > > > >
> > > > > > > Just 1.5x diff
> > > > > > >
> > > > > > >
> > > > > > > "fld2":["v1",30501964,"v2",4202177,"v3",210886] :
> > > > > > > ./hey_linux_amd64 -n 10000 -c 5 -T "application/json" 'http://localhost:8983/solr/col/select?fl=fld2&wt=json&q=fld2:"v3"&start=0&rows=50'
> > > > > > >   Slowest: 0.1198 secs
> > > > > > >   Fastest: 0.0013 secs
> > > > > > >   Average: 0.0019 secs
> > > > > > >   Requests/sec: 2641.0227
> > > > > > >
> > > > > > > ./hey_linux_amd64 -n 10000 -c 5 -T "application/json" 'http://localhost:8983/solr/col/select?fl=fld2&wt=json&q=fld2:"v3"&start=0&rows=51'
> > > > > > >   Slowest: 0.0051 secs
> > > > > > >   Fastest: 0.0003 secs
> > > > > > >   Average: 0.0003 secs
> > > > > > >   Requests/sec: 14610.4688
> > > > > > >
> > > > > > > ./hey_linux_amd64 -n 10000 -c 5 -T "application/json" 'http://localhost:8983/solr/col/select?fl=fld2&wt=json&q=fld2:"v1"&start=0&rows=50'
> > > > > > >   Slowest: 0.0059 secs
> > > > > > >   Fastest: 0.0008 secs
> > > > > > >   Average: 0.0010 secs
> > > > > > >   Requests/sec: 4795.5539
> > > > > > >
> > > > > > > ./hey_linux_amd64 -n 10000 -c 5 -T "application/json" 'http://localhost:8983/solr/col/select?fl=fld2&wt=json&q=fld2:"v1"&start=0&rows=51'
> > > > > > >   Slowest: 0.0010 secs
> > > > > > >   Fastest: 0.0003 secs
> > > > > > >   Average: 0.0003 secs
> > > > > > >   Requests/sec: 14726.7978
> > > > > > >
> > > > > > > 4-6x diff
> > > > > > >
> > > > > > > On Fri, Jun 21, 2024 at 4:59 PM Michael Gibney <mich...@michaelgibney.net> wrote:
> > > > > > > >
> > > > > > > > Interesting! If turning off lazy field loading helps, I think I have a
> > > > > > > > trivial patch that may fix this (i.e. without requiring the workaround
> > > > > > > > of disabling lazy field loading -- which, as you say, makes no sense
> > > > > > > > to have in effect without the documentCache). The only thing that had
> > > > > > > > been stopping me from suggesting this patch right off the bat was the
> > > > > > > > "magic" threshold of 50, which I couldn't explain at all. But maybe
> > > > > > > > it's not so much a magic threshold as arbitrary, and specific to the
> > > > > > > > data you're evaluating over. I'll open an issue/PR more narrowly
> > > > > > > > scoped to the change.
> > > > > > > > I'd say you could open the issue, except I still
> > > > > > > > don't fully understand the connection between the change I'm
> > > > > > > > considering and the behavior you're seeing -- just that they seem very
> > > > > > > > likely to be connected. If you're set up to try running a patched
> > > > > > > > version on your data, I'm curious to know if this will help.
> > > > > > > >
> > > > > > > > On Thu, Jun 20, 2024 at 6:16 PM Oleksandr Tkachuk <sasha547...@gmail.com> wrote:
> > > > > > > > >
> > > > > > > > > FYI: There is a solution in the last paragraph, but I still ran your
> > > > > > > > > tests, since the solution was found by "Cut and Try" and there is no
> > > > > > > > > deep understanding.
> > > > > > > > >
> > > > > > > > > >I wonder what would happen if you fully bypassed the query
> > > > > > > > > >cache (i.e., `q={!cache=false}product_type:"1"`?
> > > > > > > > > It does not help, there is not even one millisecond of
> > > > > > > > > difference in both cases.
> > > > > > > > >
> > > > > > > > > >I recall that previously you had a very large number of
> > > > > > > > > >dynamic fields. Is that the case here as well? And if so,
> > > > > > > > > >are the dynamic fields mostly stored? docValues?
> > > > > > > > > This is another collection, I’ll get to the one with many
> > > > > > > > > many fields later :))
> > > > > > > > > If this is the ~correct way to count the number of fields, then this
> > > > > > > > > collection has the following number of fields:
> > > > > > > > > curl -s "http://localhost:8983/solr/XXX/admin/luke?numTerms=0" | grep '"type"' | wc -l
> > > > > > > > > 121
> > > > > > > > > Of these, 88 have docvalues enabled and 33 stored.
> > > > > > > > >
> > > > > > > > > As for the two fields used in query, here's how they are
> > > > > > > > > defined in the schema.
> > > > > > > > > <field name="product_id" type="plong" indexed="true" stored="true"/>
> > > > > > > > > <field name="product_type" type="pint" indexed="true" stored="false"/>
> > > > > > > > > <fieldType name="pint" class="solr.IntPointField" docValues="true"/>
> > > > > > > > > <fieldType name="plong" class="solr.LongPointField" docValues="true"/>
> > > > > > > > >
> > > > > > > > > Changing fl= to something like a string field with stored=true without
> > > > > > > > > docvalues results in zero changes.
> > > > > > > > > I also tried this simple query on string type fields (copying the
> > > > > > > > > field) and got the same result. I also tried it on fields where the
> > > > > > > > > cardinality was different - the spread was not 150 times, but also
> > > > > > > > > often noticeable. In addition, I still do not fully understand the
> > > > > > > > > logic of this behavior
> > > > > > > > > ("product_type":["3",1069282,"2",710042,"1",13702]) if I do:
> > > > > > > > > 1) q=product_type:"1" rows=50 - qtime 150ms
> > > > > > > > > 2) q=product_type:"1" rows=51 - qtime 0ms
> > > > > > > > > 3) q=product_type:"2" rows=50 - qtime 3ms
> > > > > > > > > 4) q=product_type:"2" rows=51 - qtime 0ms
> > > > > > > > > 5) q=product_type:"3" rows=50 - qtime 1ms
> > > > > > > > > 6) q=product_type:"3" rows=51 - qtime 0ms
> > > > > > > > > I checked on other fields and get the same behavior - the fewer
> > > > > > > > > documents contain a given value, the slower the query becomes.
> > > > > > > > > If I can provide any more information, I will be glad.
> > > > > > > > >
> > > > > > > > > The problem was solved by turning off enableLazyFieldLoading.
> > > > > > > > > I am very surprised that this functionality continues to work when
> > > > > > > > > document cache is disabled and I thought that this parameter was
> > > > > > > > > intended only for it. In addition, we received an improvement in avg
> > > > > > > > > and 95% on many other types of queries, as well as some reduction in
> > > > > > > > > CPU load. Are there any consequences or disadvantages of such a
> > > > > > > > > decision? If not, then perhaps it is worth paying attention to this
> > > > > > > > > problem.
> > > > > > > > >
> > > > > > > > > On Thu, Jun 20, 2024 at 10:13 PM Michael Gibney <mich...@michaelgibney.net> wrote:
> > > > > > > > > >
> > > > > > > > > > I've been unable to reproduce anything like this behavior. If you're
> > > > > > > > > > really getting queryResultCache hits for these, then the field
> > > > > > > > > > type/etc of the field you're querying on shouldn't make a difference.
> > > > > > > > > > type/etc of the return field (product_id) would be more likely to
> > > > > > > > > > matter. I wonder what would happen if you fully bypassed the query
> > > > > > > > > > cache (i.e., `q={!cache=false}product_type:"1"`?
> > > > > > > > > >
> > > > > > > > > > I recall that previously you had a very large number of dynamic
> > > > > > > > > > fields. Is that the case here as well? And if so, are the dynamic
> > > > > > > > > > fields mostly stored? docValues?
> > > > > > > > > >
> > > > > > > > > >
> > > > > > > > > >
> > > > > > > > > > On Fri, Jun 14, 2024 at 7:29 AM Oleksandr Tkachuk <sasha547...@gmail.com> wrote:
> > > > > > > > > > >
> > > > > > > > > > > Initial data:
> > > > > > > > > > > Doc count: 1793026
> > > > > > > > > > > Field: "product_type", point int, indexed true, stored false,
> > > > > > > > > > > docvalues true. Values:
> > > > > > > > > > > "facet_fields":{
> > > > > > > > > > >   "product_type":["3",1069282,"2",710042,"1",13702]
> > > > > > > > > > > },
> > > > > > > > > > > Single shard, single instance.
> > > > > > > > > > >
> > > > > > > > > > > # ./hey_linux_amd64 -n 10000 -c 10 -T "application/json" 'http://localhost:8983/solr/XXX/select?fl=product_id&wt=json&q=product_type:"1"&start=0&rows=51'
> > > > > > > > > > > Summary:
> > > > > > > > > > >   Total: 0.6374 secs
> > > > > > > > > > >   Slowest: 0.0043 secs
> > > > > > > > > > >   Fastest: 0.0003 secs
> > > > > > > > > > >   Average: 0.0006 secs
> > > > > > > > > > >   Requests/sec: 15688.5755
> > > > > > > > > > >
> > > > > > > > > > > # ./hey_linux_amd64 -n 10000 -c 10 -T "application/json" 'http://localhost:8983/solr/XXX/select?fl=product_id&wt=json&q=product_type:"1"&start=0&rows=50'
> > > > > > > > > > > Summary:
> > > > > > > > > > >   Total: 101.3246 secs
> > > > > > > > > > >   Slowest: 0.2048 secs
> > > > > > > > > > >   Fastest: 0.0564 secs
> > > > > > > > > > >   Average: 0.1007 secs
> > > > > > > > > > >   Requests/sec: 98.6927
> > > > > > > > > > >
> > > > > > > > > > >
> > > > > > > > > > > 1) I've already played with queryResultWindowSize and
> > > > > > > > > > > queryResultMaxDocsCached by setting different, high and low values and
> > > > > > > > > > > this is probably not what I'm looking for since it gave a <few
> > > > > > > > > > > milliseconds difference in query performance
> > > > > > > > > > > 2) Checked on different versions of solr (9.6.1 and 8.7.0) - no
> > > > > > > > > > > significant changes
> > > > > > > > > > > 3) Tried changing the field type to string - zero performance changes
> > > > > > > > > > > 4) In both cases I see successful lookups in queryResultCache
> > > > > > > > > > > 5) Enabling documentCache solves the problem in this case (rows<=50),
> > > > > > > > > > > but introduces many other performance issues so it doesn't seem like a
> > > > > > > > > > > viable option.