It also apparently doesn't allow emails big enough for the debug output. Here's a link to a Google Doc with the output in: https://docs.google.com/document/d/1TUPE4Qkc-zjKGCJnn0_YVMOfaLgzlF2YCFF9sz4LNcQ/edit?usp=sharing
I hope that works well enough, if not we'll have to work out some other option..

On Fri, 8 Nov 2024 at 14:01, Gus Heck <gus.h...@gmail.com> wrote:

> The mailing list usually strips out attachments. You'll need to paste it
> into the body of the email.
>
> On Fri, Nov 8, 2024 at 7:16 AM Dominic Humphries
> <domi...@adzuna.com.invalid> wrote:
>
> > Fair enough! See attached, if that doesn't work I'll send it inline...
> >
> > On Thu, 7 Nov 2024 at 18:40, Gus Heck <gus.h...@gmail.com> wrote:
> >
> >> Yes, seeing the final expanded query may shed light on where the time is
> >> going, so voluminous output is good. Feel free to anonymize any customer
> >> names or sensitive information with "<REDACTED>" or similar.
> >>
> >> On Thu, Nov 7, 2024 at 12:21 PM Dominic Humphries
> >> <domi...@adzuna.com.invalid> wrote:
> >>
> >> > Yes, sorry, not cloud, afaik it's single-sharded.
> >> >
> >> > Same query with facet fields removed takes just as long to run. Adding the
> >> > debug to the request generates a rather large amount of output, I believe
> >> > due to synonyms - I can send them if it's useful, but it's rather a lot?
> >> >
> >> > On Thu, 7 Nov 2024 at 15:37, Gus Heck <gus.h...@gmail.com> wrote:
> >> >
> >> > > Ok so that's 7M docs at 3k/doc... a relatively reasonable index (at least
> >> > > if the hardware is reasonable, and you say it did work on 8.11 so that's
> >> > > probably fine).
> >> > >
> >> > > By your reply I assume it's single sharded and not using cloud/zookeeper?
> >> > >
> >> > > The request you showed has a lot of facets on it. How much difference does
> >> > > it make to the situation if you just send the query without the facets?
> >> > >
> >> > > Also add &debug=query and send us the debug output from the header when you
> >> > > do that...
> >> > >
> >> > > On Thu, Nov 7, 2024 at 9:31 AM Dominic Humphries
> >> > > <domi...@adzuna.com.invalid> wrote:
> >> > >
> >> > > > Sure:
> >> > > > "index":{
> >> > > >   "numDocs":7349353,
> >> > > >   "maxDoc":7834951,
> >> > > >   "deletedDocs":485598,
> >> > > >   "segmentCount":31,
> >> > > >   "segmentsFileSizeInBytes":2727,
> >> > > >   "sizeInBytes":22066572844,
> >> > > >   "size":"20.55 GB"
> >> > > >
> >> > > > On Thu, 7 Nov 2024 at 13:27, Gus Heck <gus.h...@gmail.com> wrote:
> >> > > >
> >> > > > > This is interesting, can you give us a feel for the size/structure of the
> >> > > > > index (# of documents, size of index, # of shards)?
> >> > > > >
> >> > > > > On Thu, Nov 7, 2024 at 7:52 AM Dominic Humphries
> >> > > > > <domi...@adzuna.com.invalid> wrote:
> >> > > > >
> >> > > > > > An update, I found the part of the query that's making everything so slow:
> >> > > > > > the q param
> >> > > > > >
> >> > > > > > When we have
> >> > > > > >   "q":"(carroll_county OR Aldi OR Cashier OR Kohls) AND NOT (internship OR intern OR graduate)",
> >> > > > > > the search is very slow, taking 20-something seconds
> >> > > > > >
> >> > > > > > When it's just
> >> > > > > >   "q":"(carroll_county OR Aldi OR Cashier OR Kohls)",
> >> > > > > > the search is blazing fast, coming back in under a second. So it appears
> >> > > > > > it's something triggered by the NOT that's both taking all the time, and
> >> > > > > > not getting caught by the timeAllowed limit
> >> > > > > >
> >> > > > > > Full query below:
> >> > > > > >
> >> > > > > > select?f.contract_type.facet.limit=2&fl=*&f.company_id.facet.mincount=1&qt=edismax&f.contract_time.facet.missing=false&f.location_struct.facet.limit=50&facet.date.end=NOW%2FDAY%2B1DAYS&ps=2&f.description.hl.snippets=2&stats.field=salary_avg_stats&facet.date.gap=%2B1DAY&pf=title&stats=true&_qtags=api_id%3Ab02dbf6d~784741%7CFCGI%3A%3AModel%3A%3AWWW%3A%3AJobsBase%3A%3ASearch%7C2781%7CCHOMkO6R7xGwlQ6bKp_SoQ&qs=5&f.contract_time.facet.limit=2&f.contract_time.facet.mincount=1&f.company_id.facet.missing=false&facet.date=%7B!key%3Dfreshness%7Dcreated&bq=(reply_on_adzuna%3Atrue%5E0.5)&f.contract_type.facet.mincount=1&wt=json&f.location_struct.facet.mincount=1&facet.date.hardend=true&f.category_id.facet.limit=50&timeAllowed=4900&f.contract_type.facet.missing=false&f.category_id.facet.mincount=1&sort=score+desc&q.alt=*%3A*&boost=boost_factor&f.company_id.facet.limit=50&facet.date.start=NOW%2FDAY-7DAYS&facet=false&facet.field=%7B!key%3Dlocation%3Aid%7Dlocation_struct&facet.field=%7B!key%3Dcategory%3Aid%7Dcategory_id&facet.field=contract_type&facet.field=contract_time&facet.field=%7B!key%3Dcompany%3Aid%7Dcompany_id&f.description.hl.fragsize=180&hl=false&rows=20&start=0&q=(carroll_county+OR+Aldi+OR+Cashier+OR+Kohls)+AND+NOT+(internship+OR+intern+OR+graduate)&fq=location_id%3A151946&fq=boosted%3A1&fq=%7B!cost%3D200%7Dsearch_category%3A0&fq=created%3A%5BNOW%2FDAY-14DAYS+TO+*%5D
> >> > > > > >
> >> > > > > > On Wed, 6 Nov 2024 at 17:00, Dominic Humphries <domi...@adzuna.com>
> >> > > > > > wrote:
> >> > > > > >
> >> > > > > > > I spoke too soon, I figured out how to get VisualVM talking to solr. Now
> >> > > > > > > I'm just not sure what to do with it - what sorts of things am I looking
> >> > > > > > > for?
> >> > > > > > >
> >> > > > > > > On Wed, 6 Nov 2024 at 16:40, Dominic Humphries <domi...@adzuna.com>
> >> > > > > > > wrote:
> >> > > > > > >
> >> > > > > > >> Unfortunately I don't know Java anywhere near well enough to know my way
> >> > > > > > >> around a profiler or jstack. I've confirmed JMX is enabled and I can telnet
> >> > > > > > >> to the port, but VisualVM fails to connect and gives me no reason as to
> >> > > > > > >> why.
> >> > > > > > >>
> >> > > > > > >> I can post the query and result if that's useful - it doesn't return any
> >> > > > > > >> records so there's nothing to censor
> >> > > > > > >>
> >> > > > > > >> On Wed, 6 Nov 2024 at 15:36, Gus Heck <gus.h...@gmail.com> wrote:
> >> > > > > > >>
> >> > > > > > >>> If you have access to a test instance where the problem can be
> >> > > > > > >>> reproduced, attaching a profiler would be one way. Another cruder
> >> > > > > > >>> method is to use jstack to dump all the threads.
> >> > > > > > >>>
> >> > > > > > >>> Another way to tackle this is to help us reproduce your problem. Can you
> >> > > > > > >>> share details about your query? Obviously, please don't post anything
> >> > > > > > >>> your company wouldn't want public, but if you can share some details that
> >> > > > > > >>> would be a start.
> >> > > > > > >>>
> >> > > > > > >>> The ideal thing would be to provide a minimum working example of the
> >> > > > > > >>> problem you are experiencing.
> >> > > > > > >>>
> >> > > > > > >>> On Wed, Nov 6, 2024 at 9:55 AM Dominic Humphries
> >> > > > > > >>> <domi...@adzuna.com.invalid> wrote:
> >> > > > > > >>>
> >> > > > > > >>> > I've tried both timeAllowed and cpuAllowed and neither are restricting
> >> > > > > > >>> > the amount of time the queries take to run. I have a test query that's
> >> > > > > > >>> > reliably taking 20-30 seconds, if there's any useful debug params or
> >> > > > > > >>> > such I can run to provide the information you want I'm happy to run
> >> > > > > > >>> > them - I'm not sure how to usefully interrogate solr for where its
> >> > > > > > >>> > time is being spent, sorry
> >> > > > > > >>> >
> >> > > > > > >>> > Thanks
> >> > > > > > >>> >
> >> > > > > > >>> > On Wed, 6 Nov 2024 at 14:25, Gus Heck <gus.h...@gmail.com> wrote:
> >> > > > > > >>> >
> >> > > > > > >>> > > There are unit tests that seem to suggest that timeAllowed still
> >> > > > > > >>> > > works, can you provide some more information about your use case?
> >> > > > > > >>> > > Particularly important is any information about where (what code)
> >> > > > > > >>> > > your queries are spending a lot of time in if you have it.
> >> > > > > > >>> > >
> >> > > > > > >>> > > On Wed, Nov 6, 2024 at 6:18 AM Dominic Humphries
> >> > > > > > >>> > > <domi...@adzuna.com.invalid> wrote:
> >> > > > > > >>> > >
> >> > > > > > >>> > > > Hi folks,
> >> > > > > > >>> > > >
> >> > > > > > >>> > > > we're testing Solr 9.7 to upgrade our existing 8.11 stack. We're
> >> > > > > > >>> > > > seeing a problem with long requests: we send `timeAllowed=4900`
> >> > > > > > >>> > > > which works fine on the existing 8.11 and keeps requests to just
> >> > > > > > >>> > > > a few seconds.
> >> > > > > > >>> > > >
> >> > > > > > >>> > > > With 9.7, however, the flag is basically ignored - requests can
> >> > > > > > >>> > > > take over 30 seconds whether the flag is present or not, which is
> >> > > > > > >>> > > > causing higher CPU load and slowing response times.
> >> > > > > > >>> > > >
> >> > > > > > >>> > > > I've tried setting the flag suggested in
> >> > > > > > >>> > > > https://solr.apache.org/guide/solr/latest/upgrade-notes/major-changes-in-solr-9.html#use-of-timeallowed
> >> > > > > > >>> > > > - but even with solr.useExitableDirectoryReader set we still don't
> >> > > > > > >>> > > > get the desired behaviour.
> >> > > > > > >>> > > >
> >> > > > > > >>> > > > Is there anything else I can try to get the old behaviour back?
> >> > > > > > >>> > > >
> >> > > > > > >>> > > > Thanks
> >> > > > > > >>> > > >
> >> > > > > > >>> > >
> >> > > > > > >>> > > --
> >> > > > > > >>> > > http://www.needhamsoftware.com (work)
> >> > > > > > >>> > > https://a.co/d/b2sZLD9 (my fantasy fiction book)
> >> > > > > > >>> > >
> >> > > > > > >>> >
> >> > > > > > >>>
> >> > > > > > >>> --
> >> > > > > > >>> http://www.needhamsoftware.com (work)
> >> > > > > > >>> https://a.co/d/b2sZLD9 (my fantasy fiction book)
> >> > > > > > >>>
> >> > > > > > >>
> >> > > > > > >
> >> > > > > >
> >> > > > >
> >> > > > > --
> >> > > > > http://www.needhamsoftware.com (work)
> >> > > > > https://a.co/d/b2sZLD9 (my fantasy fiction book)
> >> > > > >
> >> > > >
> >> > >
> >> > > --
> >> > > http://www.needhamsoftware.com (work)
> >> > > https://a.co/d/b2sZLD9 (my fantasy fiction book)
> >> > >
> >> >
> >>
> >> --
> >> http://www.needhamsoftware.com (work)
> >> https://a.co/d/b2sZLD9 (my fantasy fiction book)
> >>
> >
>
> --
> http://www.needhamsoftware.com (work)
> https://a.co/d/b2sZLD9 (my fantasy fiction book)
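
For anyone following the thread: the request quoted above is easier to inspect once it is URL-decoded. What follows is only a minimal sketch, assuming Python 3.9+ and nothing beyond the standard library. The `decode_params` helper and the shortened `QUERY_STRING` are illustrative names that do not come from the thread, and the excerpt keeps just a handful of the original parameters (in a different order) so the slow `q` clause and the filter queries stand out.

```python
from urllib.parse import parse_qs

# Excerpt of the request quoted in the thread. The facet, stats and
# highlighting parameters are left out here purely for brevity.
QUERY_STRING = (
    "qt=edismax&timeAllowed=4900&rows=20&start=0"
    "&q=(carroll_county+OR+Aldi+OR+Cashier+OR+Kohls)"
    "+AND+NOT+(internship+OR+intern+OR+graduate)"
    "&fq=location_id%3A151946&fq=boosted%3A1"
    "&fq=%7B!cost%3D200%7Dsearch_category%3A0"
    "&fq=created%3A%5BNOW%2FDAY-14DAYS+TO+*%5D"
)


def decode_params(query_string: str) -> dict[str, list[str]]:
    """Map each parameter name to its URL-decoded values, in request order."""
    # Repeated parameters such as fq are collected into one list per name.
    return parse_qs(query_string, keep_blank_values=True)


params = decode_params(QUERY_STRING)
for name, values in params.items():
    for value in values:
        print(f"{name} = {value}")
```

Running the same helper over the full query string from the thread should list every parameter one per line, which may make it easier to try removing clauses one at a time, as was done with the `AND NOT (...)` part of `q`.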