Core reload timeout on Solr 9

2022-12-05 Thread Nick Vladiceanu
Hello folks,

We’re running our SolrCloud cluster in Kubernetes. Recently we’ve upgraded from 
8.11 to 9.0 (and eventually to 9.1). 

We fully reindexed the collections after the upgrade; everything looks good, no 
errors, and we've noticed response time improvements.

We have the following specs:
collection size:
22M docs, 1.3Kb doc size; ~28Gb total collection size at this point;
shards: 6 shards, each ~4,7Gb; 1 core per node;
nodes: 
30Gi of RAM, 
16 cores
96 nodes
Heap: 23Gb heap
JavaOpts: -Dsolr.modules=scripting,analysis-extras,ltr
gcTune: -XX:+UseG1GC -XX:G1HeapRegionSize=16m -XX:MaxGCPauseMillis=300 
-XX:InitiatingHeapOccupancyPercent=75 -XX:+UseLargePages 
-XX:+ParallelRefProcEnabled -XX:ParallelGCThreads=10 -XX:ConcGCThreads=2 
-XX:MinHeapFreeRatio=2 -XX:MaxHeapFreeRatio=10


Problem

The problem we face is when we try to reload the collection: in sync mode the 
request times out, and in async mode the task runs forever:

curl “reload” output: https://justpaste.it/ap4d2 
ErrorReportingConcurrentUpdateSolrClient stacktrace (appears in the logs of 
some nodes): https://justpaste.it/aq3dw 

There are no issues on a newly created cluster if there is no incoming traffic 
to it. Once we start sending requests to the cluster, collection reload becomes 
impossible. Other collections (smaller) within the same cluster are reloading 
just fine.

In some cases, old-generation GC kicks in on some node and makes the entire 
cluster unstable; however, that doesn't happen every time the collection reload 
times out.

We've tried rolling back to 8.11 and everything works normally as it used to: 
no errors with reload, no other errors in the logs during reload, etc.

We tried the following:
run 9.0, 9.1 on Java 11 and Java 17: same result;
lower cache warming, disable firstSearcher queries: same result;
increase heap size, tune gc: same result;
use apiv1 and apiv2 to issue reload commands: no difference;
sync vs async reload: either a forever-running task or a timeout after 180 
seconds (example commands below);
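
For reference, the reload is issued roughly like this (host and collection name 
are placeholders here):

    # v1 Collections API, synchronous reload (the call that times out for us)
    curl "http://localhost:8983/solr/admin/collections?action=RELOAD&name=mycollection"

    # v1 Collections API, async reload plus status polling (the task that never completes)
    curl "http://localhost:8983/solr/admin/collections?action=RELOAD&name=mycollection&async=reload-1"
    curl "http://localhost:8983/solr/admin/collections?action=REQUESTSTATUS&requestid=reload-1"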

Did anyone face similar issues after upgrading to version 9 of Solr? Could you 
please advise where we should focus our attention while debugging this 
behavior? Any other advice/suggestions?

Thank you


Best regards,
Nick Vladiceanu

Re: SOLR adding ,​ to strings erroneously

2022-12-05 Thread Thomas Corthals
On Sat, Dec 3, 2022 at 18:47 Shawn Heisey wrote:

> On 12/3/22 10:38, dmitri maziuk wrote:
> > On 2022-12-02 7:41 PM, Shawn Heisey wrote:
> >
> >> I'm curious as to why those entities are displaying as text instead
> >> of being interpreted by the browser as a zero-width space.
> >
> > I am curious as to why Matthew and I are apparently the only people
> > seeing it.
>
> I see it on my install, 9.2.0-SNAPSHOT compiled 2022/11/30, and it was
> also happening on a version compiled a few days earlier.  I have no idea
> when it first started happening.  I tend to glance at the logs every now
> and then, and only look closer at logs that pertain to whatever I am
> working on at that moment.  And I use solr.log a lot more than the
> logging tab in the UI ... this problem does not occur in the actual
> logfile.
>
> Thanks,
> Shawn
>

Earliest version I have running is 8.4.0 and I'm seeing it there in the
admin UI as well.

Thomas


Re: Core reload timeout on Solr 9

2022-12-05 Thread Houston Putman
I'm not sure this is the issue, but maybe it's HTTP/2 vs HTTP/1.

Could you retry with the following set on the cluster?

-Dsolr.http1=true
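
One way to apply it, assuming a solr.in.sh / SOLR_OPTS style setup (the exact 
wiring depends on how your Kubernetes images pass JVM options), would be 
roughly:

    # sketch only: append the property to SOLR_OPTS (e.g. in solr.in.sh or the
    # container environment) so it reaches every Solr JVM in the cluster
    SOLR_OPTS="$SOLR_OPTS -Dsolr.http1=true"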



On Mon, Dec 5, 2022 at 5:08 AM Nick Vladiceanu 
wrote:

> Hello folks,
>
> We’re running our SolrCloud cluster in Kubernetes. Recently we’ve upgraded
> from 8.11 to 9.0 (and eventually to 9.1).
>
> We fully reindexed the collections after the upgrade; everything looks good,
> no errors, and we've noticed response time improvements.
>
> We have the following specs:
> collection size:
> 22M docs, 1.3Kb doc size; ~28Gb total collection size at this point;
> shards: 6 shards, each ~4,7Gb; 1 core per node;
> nodes:
> 30Gi of RAM,
> 16 cores
> 96 nodes
> Heap: 23Gb heap
> JavaOpts: -Dsolr.modules=scripting,analysis-extras,ltr
> gcTune: -XX:+UseG1GC -XX:G1HeapRegionSize=16m -XX:MaxGCPauseMillis=300
> -XX:InitiatingHeapOccupancyPercent=75 -XX:+UseLargePages
> -XX:+ParallelRefProcEnabled -XX:ParallelGCThreads=10 -XX:ConcGCThreads=2
> -XX:MinHeapFreeRatio=2 -XX:MaxHeapFreeRatio=10
>
>
> Problem
>
> The problem we face is when we try to reload the collection: in sync mode the
> request times out, and in async mode the task runs forever:
>
> curl “reload” output: https://justpaste.it/ap4d2
> ErrorReportingConcurrentUpdateSolrClient stacktrace (appears in the logs
> of some nodes): https://justpaste.it/aq3dw 
>
> There are no issues on a newly created cluster if there is no incoming
> traffic to it. Once we start sending requests to the cluster, collection
> reload becomes impossible. Other collections (smaller) within the same
> cluster are reloading just fine.
>
> In some cases, old-generation GC kicks in on some node and makes the entire
> cluster unstable; however, that doesn't happen every time the collection
> reload times out.
>
> We've tried rolling back to 8.11 and everything works normally as it used
> to: no errors with reload, no other errors in the logs during reload, etc.
>
> We tried the following:
> run 9.0, 9.1 on Java 11 and Java 17: same result;
> lower cache warming, disable firstSearcher queries: same result;
> increase heap size, tune gc: same result;
> use apiv1 and apiv2 to issue reload commands: no difference;
> sync vs async reload: either a forever-running task or a timeout after 180
> seconds;
>
> Did anyone face similar issues after upgrading to version 9 of Solr? Could
> you please advise where we should focus our attention while debugging this
> behavior? Any other advice/suggestions?
>
> Thank you
>
>
> Best regards,
> Nick Vladiceanu


Re: Weird behavior of maxClauseCount restriction since upgrading to solr 9.1

2022-12-05 Thread Chris Hostetter


: today we updated solr to version 9.1 (lucene version 9.3)

Which version did you upgrade from?

: Since then we noticed plenty of TooManyNestedClauses in the logs. Our
: setting for maxClauseCount is 1024

Exactly where/how are you setting that?  There are 2 settings related to 
this...

https://solr.apache.org/guide/solr/latest/configuration-guide/configuring-solr-xml.html#global-maxbooleanclauses

https://solr.apache.org/guide/solr/latest/configuration-guide/caches-warming.html#maxbooleanclauses-element

: 
: 

FYI: You've shown us the definition of 'createdById' but your example 
queries use 'categoryId'

I'm going to assume for now that 'categoryId' is also a 'p_long_dv' 
field...

: But when I use the other field (categoryId) this fails:
: 
: curl -XGET http://localhost:8983/solr/myindex/select?q=+categoryId:(1 2 3
: ... 1024)
: 
: It works until 512 and starts failing from 513 clauses

that certainly smells like it might indicate a diff between the solr.xml 
maxBooleanClauses setting and the solrconfig.xml maxBooleanClauses -- 
because one is enforced at the QueryParser level, and one is enforced at 
the query re-write level ... but off the top of my head i can't think of 
why your Point field would trigger the query parser check but your String 
field wouldn't.

In addition to the previous questions i asked about, I would really be 
interested to see the stack traces of both exceptions: from querying your 
string field with 1025 clauses, and your point field with 513 clauses.
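
(If it's easier to reproduce them fresh than to dig them out of old logs, a 
shell sketch along these lines should generate both queries -- host and 
collection follow your earlier curl example, clause counts per your report:)

    # string field, 1025 clauses
    curl -G "http://localhost:8983/solr/myindex/select" \
      --data-urlencode "q=+id:($(seq -s ' ' 1 1025))"

    # point field, 513 clauses
    curl -G "http://localhost:8983/solr/myindex/select" \
      --data-urlencode "q=+categoryId:($(seq -s ' ' 1 513))"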


-Hoss
http://www.lucidworks.com/


Re: [External] : Re: Querying Solr Locally through Java API without using HttpClient

2022-12-05 Thread Chris Hostetter


: POC would be to add a function in the plugin.. which would query all the 
: documents locally (Say 100+ Million Documents) and update 1 or 2 fields 
: with a particular value.
: 
: As the plugin would be local to this core.. wanted to avoid HTTP calls.

I'm assuming here that you mean you want to write a *Solr* plugin (ie: a 
RequestHandler, SearchComponent, etc...) and from that code do a "query" 
to find documents.

In no circumstances would i suggest that using EmbeddedSolrServer, inside 
of a real solr server, is a good idea.

If you need your plugin to run on a single core, and iterate over docs 
from all shards, then you're going to need to make some sort of network 
call -- this is what things like the SearchHandler/QueryComponent do.

If you are ok with your plugin only handling the "local" docs, then you 
can just talk to the SolrIndexSearcher directly -- the way things like 
the QueryComponent do in distrib=false mode.
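
(For intuition, "local docs only" is the same scoping you get when you query a 
single core directly with distrib=false -- the core name below is just an 
example:)

    # query only the documents hosted by this one core; no fan-out to other shards
    curl "http://localhost:8983/solr/mycollection_shard1_replica_n1/select?q=*:*&distrib=false"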

If you are also planning to *update* these docs, then you're going to need to 
be very careful in your code to check whether you are running on the leader 
core of each shard, so you don't have multiple replicas trying to make 
the same updates (you'll also need some way to ensure that your plugin 
gets "executed" on every leader -- ie: running on every shard leader is a 
requirement, not just a limitation).

But ultimately you've asked a very vague question about a very complicated 
concept -- and i would urge you to take a step back, describe your actual 
use cases (how are the documents selected? what kinds of updates are you 
doing? when will this plugin run? etc...) in more detail so more 
useful/specific advice can be given...

https://people.apache.org/~hossman/#xyproblem

XY Problem

Your question appears to be an "XY Problem" ... that is: you are dealing
with "X", you are assuming "Y" will help you, and you are asking about "Y"
without giving more details about the "X" so that we can understand the
full issue.  Perhaps the best solution doesn't involve "Y" at all?
See Also: http://www.perlmonks.org/index.pl?node_id=542341



-Hoss
http://www.lucidworks.com/


Re: Weird behavior of maxClauseCount restriction since upgrading to solr 9.1

2022-12-05 Thread michael dürr
Hi Hoss,

I'm happy to provide some details, as I still do not really understand the
difference from the situation before.
If something like categoryId:[1 TO 1] also gets converted to some boolean
term, then it's clear to me. Otherwise I do not understand why half the
boolean clauses (512 + 1) already cause that exception.

Here the details you asked for:

* I upgraded from 8.11.1 to 9.1. I observed the behavior for a completely
rebuilt index (solr version 9.1 / lucene version 9.3)
* maxBooleanClauses is only configured in solrconfig.xml (1024) but not in
solr.xml.
* Sorry for the confusion about the field definition. As you already
assumed correctly: 'categoryId' is also a 'p_long_dv'
* Stacktrace for the String field ("id"). For better readability I replaced
the original query with "1 2 ... 1025":

  2022-12-06 07:08:46.625 ERROR (qtp1530880511-27) [  portal]
o.a.s.h.RequestHandlerBase org.apache.solr.common.SolrException:
org.apache.solr.search.SyntaxError: Cannot parse ' +id:(1 2 ... 1025 )':
too many boolean clauses
org.apache.solr.common.SolrException: org.apache.solr.search.SyntaxError:
Cannot parse ' +id:(1 2 ... 1025 )': too many boolean clauses
at
org.apache.solr.handler.component.QueryComponent.prepare(QueryComponent.java:217)
~[solr-core-9.1.0.jar:9.1.0 aa4f3d98ab19c201e7f3c74cd14c99174148616d -
ishan - 2022-11-11 13:00:47]
at
org.apache.solr.handler.component.SearchHandler.handleRequestBody(SearchHandler.java:384)
~[solr-core-9.1.0.jar:9.1.0 aa4f3d98ab19c201e7f3c74cd14c99174148616d -
ishan - 2022-11-11 13:00:47]
at
org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:224)
~[solr-core-9.1.0.jar:9.1.0 aa4f3d98ab19c201e7f3c74cd14c99174148616d -
ishan - 2022-11-11 13:00:47]
at org.apache.solr.core.SolrCore.execute(SolrCore.java:2865)
~[solr-core-9.1.0.jar:9.1.0 aa4f3d98ab19c201e7f3c74cd14c99174148616d -
ishan - 2022-11-11 13:00:47]
at
org.apache.solr.servlet.HttpSolrCall.execute(HttpSolrCall.java:887)
~[solr-core-9.1.0.jar:9.1.0 aa4f3d98ab19c201e7f3c74cd14c99174148616d -
ishan - 2022-11-11 13:00:47]
at org.apache.solr.servlet.HttpSolrCall.call(HttpSolrCall.java:606)
~[solr-core-9.1.0.jar:9.1.0 aa4f3d98ab19c201e7f3c74cd14c99174148616d -
ishan - 2022-11-11 13:00:47]
at
org.apache.solr.servlet.SolrDispatchFilter.dispatch(SolrDispatchFilter.java:250)
~[solr-core-9.1.0.jar:9.1.0 aa4f3d98ab19c201e7f3c74cd14c99174148616d -
ishan - 2022-11-11 13:00:47]
at
org.apache.solr.servlet.SolrDispatchFilter.lambda$doFilter$0(SolrDispatchFilter.java:218)
~[solr-core-9.1.0.jar:9.1.0 aa4f3d98ab19c201e7f3c74cd14c99174148616d -
ishan - 2022-11-11 13:00:47]
at
org.apache.solr.servlet.ServletUtils.traceHttpRequestExecution2(ServletUtils.java:257)
~[solr-core-9.1.0.jar:9.1.0 aa4f3d98ab19c201e7f3c74cd14c99174148616d -
ishan - 2022-11-11 13:00:47]
at
org.apache.solr.servlet.ServletUtils.rateLimitRequest(ServletUtils.java:227)
~[solr-core-9.1.0.jar:9.1.0 aa4f3d98ab19c201e7f3c74cd14c99174148616d -
ishan - 2022-11-11 13:00:47]
at
org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:213)
~[solr-core-9.1.0.jar:9.1.0 aa4f3d98ab19c201e7f3c74cd14c99174148616d -
ishan - 2022-11-11 13:00:47]
at
org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:195)
~[solr-core-9.1.0.jar:9.1.0 aa4f3d98ab19c201e7f3c74cd14c99174148616d -
ishan - 2022-11-11 13:00:47]
at
org.eclipse.jetty.servlet.FilterHolder.doFilter(FilterHolder.java:201)
~[jetty-servlet-9.4.48.v20220622.jar:9.4.48.v20220622]
at
org.eclipse.jetty.servlet.ServletHandler$Chain.doFilter(ServletHandler.java:1626)
~[jetty-servlet-9.4.48.v20220622.jar:9.4.48.v20220622]
at
org.eclipse.jetty.servlet.ServletHandler.doHandle(ServletHandler.java:552)
~[jetty-servlet-9.4.48.v20220622.jar:9.4.48.v20220622]
at
org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:143)
~[jetty-server-9.4.48.v20220622.jar:9.4.48.v20220622]
at
org.eclipse.jetty.security.SecurityHandler.handle(SecurityHandler.java:600)
~[jetty-security-9.4.48.v20220622.jar:9.4.48.v20220622]
at
org.eclipse.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:127)
~[jetty-server-9.4.48.v20220622.jar:9.4.48.v20220622]
at
org.eclipse.jetty.server.handler.ScopedHandler.nextHandle(ScopedHandler.java:235)
~[jetty-server-9.4.48.v20220622.jar:9.4.48.v20220622]
at
org.eclipse.jetty.server.session.SessionHandler.doHandle(SessionHandler.java:1624)
~[jetty-server-9.4.48.v20220622.jar:9.4.48.v20220622]
at
org.eclipse.jetty.server.handler.ScopedHandler.nextHandle(ScopedHandler.java:233)
~[jetty-server-9.4.48.v20220622.jar:9.4.48.v20220622]
at
org.eclipse.jetty.server.handler.ContextHandler.doHandle(ContextHandler.java:1440)
~[jetty-server-9.4.48.v20220622.jar:9.4.48.v20220622]
at
org.eclipse.jetty.server.handler.ScopedHandler.nextScope(ScopedHandler.java:188)
~[jetty-serve