Yes, sorry, not cloud, afaik it's single-sharded.

The same query with the facet fields removed takes just as long to run. Adding
debug=query to the request generates a rather large amount of output, I believe
due to synonyms. I can send it if it's useful, but it's rather a lot.
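
For anyone wanting to repeat the comparison, here is a rough sketch of how the request can be reduced: it drops every facet-related parameter from a query string and appends debug=query. It is only an illustration. The helper name strip_facets_add_debug is made up, the sample query string is a shortened version of the one in this thread, and nothing is actually sent to Solr.

```python
from urllib.parse import parse_qsl, urlencode


def strip_facets_add_debug(query_string: str) -> str:
    """Return the query string without facet params, plus debug=query.

    Hypothetical helper for illustration; it only rewrites the string.
    """
    pairs = parse_qsl(query_string, keep_blank_values=True)
    kept = [
        (key, value)
        for key, value in pairs
        # Matches facet=..., facet.field=..., and f.<field>.facet.<opt>=...
        if not (key == "facet" or key.startswith("facet.") or ".facet." in key)
    ]
    kept.append(("debug", "query"))
    return urlencode(kept)


if __name__ == "__main__":
    # Shortened form of the request from this thread.
    original = (
        "q=(carroll_county+OR+Aldi+OR+Cashier+OR+Kohls)"
        "+AND+NOT+(internship+OR+intern+OR+graduate)"
        "&facet=false&facet.field=contract_type"
        "&f.contract_type.facet.limit=2&timeAllowed=4900&rows=20"
    )
    print("select?" + strip_facets_add_debug(original))
```

The q, timeAllowed and rows parameters pass through unchanged, so the slow NOT clause can be timed on its own.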

On Thu, 7 Nov 2024 at 15:37, Gus Heck <gus.h...@gmail.com> wrote:

> Ok so that's 7M docs at 3k/doc...  a relatively reasonable index (at least
> if the hardware is reasonable, and you say it did work on 8.11 so that's
> probably fine).
>
> By your reply I assume it's single sharded and not using cloud/zookeeper?
>
> The request you showed has a lot of facets on it. How much difference does
> it make to the situation if you just send the query without the facets?
>
> Also add &debug=query and send us the debug output from the header when you
> do that...
>
>
>
> On Thu, Nov 7, 2024 at 9:31 AM Dominic Humphries
> <domi...@adzuna.com.invalid>
> wrote:
>
> > Sure:
> >       "index":{
> >         "numDocs":7349353,
> >         "maxDoc":7834951,
> >         "deletedDocs":485598,
> >         "segmentCount":31,
> >         "segmentsFileSizeInBytes":2727,
> >         "sizeInBytes":22066572844,
> >         "size":"20.55 GB"
> >
> > On Thu, 7 Nov 2024 at 13:27, Gus Heck <gus.h...@gmail.com> wrote:
> >
> > > This is interesting, can you give us a feel for the size/structure of the
> > > index (# of documents, size of index, # of shards)?
> > >
> > > On Thu, Nov 7, 2024 at 7:52 AM Dominic Humphries
> > > <domi...@adzuna.com.invalid>
> > > wrote:
> > >
> > > > An update, I found the part of the query that's making everything so slow:
> > > > the q param
> > > >
> > > > When we have
> > > >       "q":"(carroll_county OR Aldi OR Cashier OR Kohls) AND NOT (internship OR intern OR graduate)",
> > > > the search is very slow, taking 20-something seconds
> > > >
> > > > When it's just
> > > >       "q":"(carroll_county OR Aldi OR Cashier OR Kohls)",
> > > > the search is blazing fast, coming back in under a second. So it appears
> > > > it's something triggered by the NOT that's both taking all the time, and
> > > > not getting caught by the timeAllowed limit
> > > >
> > > > Full query below:
> > > >
> select?f.contract_type.facet.limit=2&fl=*&f.company_id.facet.mincount=1&qt=edismax&f.contract_time.facet.missing=false&f.location_struct.facet.limit=50&facet.date.end=NOW%2FDAY%2B1DAYS&ps=2&f.description.hl.snippets=2&stats.field=salary_avg_stats&facet.date.gap=%2B1DAY&pf=title&stats=true&_qtags=api_id%3Ab02dbf6d~784741%7CFCGI%3A%3AModel%3A%3AWWW%3A%3AJobsBase%3A%3ASearch%7C2781%7CCHOMkO6R7xGwlQ6bKp_SoQ&qs=5&f.contract_time.facet.limit=2&f.contract_time.facet.mincount=1&f.company_id.facet.missing=false&facet.date=%7B!key%3Dfreshness%7Dcreated&bq=(reply_on_adzuna%3Atrue%5E0.5)&f.contract_type.facet.mincount=1&wt=json&f.location_struct.facet.mincount=1&facet.date.hardend=true&f.category_id.facet.limit=50&timeAllowed=4900&f.contract_type.facet.missing=false&f.category_id.facet.mincount=1&sort=score+desc&q.alt=*%3A*&boost=boost_factor&f.company_id.facet.limit=50&facet.date.start=NOW%2FDAY-7DAYS&facet=false&facet.field=%7B!key%3Dlocation%3Aid%7Dlocation_struct&facet.field=%7B!key%3Dcategory%3Aid%7Dcategory_id&facet.field=contract_type&facet.field=contract_time&facet.field=%7B!key%3Dcompany%3Aid%7Dcompany_id&f.description.hl.fragsize=180&hl=false&rows=20&start=0&q=(carroll_county+OR+Aldi+OR+Cashier+OR+Kohls)+AND+NOT+(internship+OR+intern+OR+graduate)&fq=location_id%3A151946&fq=boosted%3A1&fq=%7B!cost%3D200%7Dsearch_category%3A0&fq=created%3A%5BNOW%2FDAY-14DAYS+TO+*%5D
> > > >
> > > > On Wed, 6 Nov 2024 at 17:00, Dominic Humphries <domi...@adzuna.com>
> > > > wrote:
> > > >
> > > > > I spoke too soon, I figured out how to get VisualVM talking to solr. Now
> > > > > I'm just not sure what to do with it - what sorts of things am I looking
> > > > > for?
> > > > >
> > > > > On Wed, 6 Nov 2024 at 16:40, Dominic Humphries <domi...@adzuna.com>
> > > > > wrote:
> > > > >
> > > > >> Unfortunately I don't know Java anywhere near well enough to know my way
> > > > >> around a profiler or jstack. I've confirmed JMX is enabled and I can telnet
> > > > >> to the port, but VisualVM fails to connect and gives me no reason as to
> > > > >> why.
> > > > >>
> > > > >> I can post the query and result if that's useful - it doesn't return any
> > > > >> records so there's nothing to censor
> > > > >>
> > > > >> On Wed, 6 Nov 2024 at 15:36, Gus Heck <gus.h...@gmail.com> wrote:
> > > > >>
> > > > >>> If you have access to a test instance where the problem can be reproduced,
> > > > >>> attaching a profiler would be one way. Another cruder method is to use
> > > > >>> jstack to dump all the threads.
> > > > >>>
> > > > >>> Another way to tackle this is to help us reproduce your problem. Can you
> > > > >>> share details about your query? Obviously, please don't post anything your
> > > > >>> company wouldn't want public, but if you can share some details that would
> > > > >>> be a start.
> > > > >>>
> > > > >>> The ideal thing would be to provide a minimum working example of the
> > > > >>> problem you are experiencing.
> > > > >>>
> > > > >>> On Wed, Nov 6, 2024 at 9:55 AM Dominic Humphries
> > > > >>> <domi...@adzuna.com.invalid>
> > > > >>> wrote:
> > > > >>>
> > > > >>> > I've tried both timeAllowed and cpuAllowed and neither are restricting the
> > > > >>> > amount of time the queries take to run. I have a test query that's reliably
> > > > >>> > taking 20-30 seconds, if there's any useful debug params or such I can run
> > > > >>> > to provide the information you want I'm happy to run them - I'm not sure
> > > > >>> > how to usefully interrogate solr for where its time is being spent, sorry
> > > > >>> >
> > > > >>> > Thanks
> > > > >>> >
> > > > >>> > On Wed, 6 Nov 2024 at 14:25, Gus Heck <gus.h...@gmail.com> wrote:
> > > > >>> >
> > > > >>> > > There are unit tests that seem to suggest that timeAllowed still works, can
> > > > >>> > > you provide some more information about your use case? Particularly
> > > > >>> > > important is any information about where (what code) your queries are
> > > > >>> > > spending a lot of time in if you have it.
> > > > >>> > >
> > > > >>> > > On Wed, Nov 6, 2024 at 6:18 AM Dominic Humphries
> > > > >>> > > <domi...@adzuna.com.invalid>
> > > > >>> > > wrote:
> > > > >>> > >
> > > > >>> > > > Hi folks,
> > > > >>> > > >
> > > > >>> > > > we're testing Solr 9.7 to upgrade our existing 8.11 stack. We're seeing a
> > > > >>> > > > problem with long requests: we send `timeAllowed=4900` which works fine on
> > > > >>> > > > the existing 8.11 and keeps requests to just a few seconds.
> > > > >>> > > >
> > > > >>> > > > With 9.7, however, the flag is basically ignored - requests can take over
> > > > >>> > > > 30 seconds whether the flag is present or not, which is causing higher CPU
> > > > >>> > > > load and slowing response times.
> > > > >>> > > >
> > > > >>> > > > I've tried setting the flag suggested in
> > > > >>> > > > https://solr.apache.org/guide/solr/latest/upgrade-notes/major-changes-in-solr-9.html#use-of-timeallowed
> > > > >>> > > > - but even with solr.useExitableDirectoryReader set we still don't get the
> > > > >>> > > > desired behaviour.
> > > > >>> > > >
> > > > >>> > > > Is there anything else I can try to get the old behaviour back?
> > > > >>> > > >
> > > > >>> > > > Thanks
> > > > >>> > > >
> > > > >>> > >
> > > > >>> > >
> > > > >>> > > --
> > > > >>> > > http://www.needhamsoftware.com (work)
> > > > >>> > > https://a.co/d/b2sZLD9 (my fantasy fiction book)
