Backport Netty 4.1.87.Final to Solr 8.11.3

2023-01-20 Thread Joseph Gonzalez
Hello,

I saw that Netty 4.1.87.Final was brought into Solr 9.2 via 
https://issues.apache.org/jira/browse/SOLR-16626.

Can that change be backported to Solr 8.11.3 to resolve CVE-2022-41881?

Thanks,

Joe


Re: Backport Netty 4.1.87.Final to Solr 8.11.3

2023-01-20 Thread Jan Høydahl
I threw up a PR for a backport: https://github.com/apache/lucene-solr/pull/2676
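
If you want to try it locally before a release, you can check out the PR branch 
from GitHub, e.g. (the local branch name is arbitrary):

git fetch https://github.com/apache/lucene-solr.git pull/2676/head:netty-backport
git checkout netty-backport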

Jan

> On 20 Jan 2023, at 16:00, Joseph Gonzalez wrote:
> 
> Hello,
> 
> I saw that Netty 4.1.87.Final was brought into Solr 9.2 via 
> https://issues.apache.org/jira/browse/SOLR-16626.
> 
> Can that change be backported to Solr 8.11.3 to resolve CVE-2022-41881?
> 
> Thanks,
> 
> Joe



Re: Solr Restarting frequently.

2023-01-20 Thread Vincenzo D'Amore
Hi, is this SolrCloud deployed in Kubernetes?

On Wed, Jan 18, 2023 at 11:05 AM Rohit Walecha  wrote:

> Hi,
>
> We have a 3-node *Solr (8.8.0)* cluster deployed in multiple environments,
> connected to a 3-node *ZooKeeper (3.6.2)* cluster. We have been facing
> frequent restarts of the SolrCloud nodes for the last few months. While
> debugging this and looking into the logs and other stats, we have seen that
> the node which restarted logs the following:
>
> *1. *
> 2023-01-04 21:50:09.186 WARN (zkConnectionManagerCallback-15-thread-1) [ ]
> o.a.s.c.c.ConnectionManager Watcher
> org.apache.solr.common.cloud.ConnectionManager@731cf36d name:
> ZooKeeperConnection
> Watcher:apache-solrcloud-zookeeper-0.apache-solrcloud-zookeeper-headless.production.svc.cluster.local:2181,apache-solrcloud-zookeeper-1.apache-solrcloud-zookeeper-headless.production.svc.cluster.local:2181,apache-solrcloud-zookeeper-2.apache-solrcloud-zookeeper-headless.production.svc.cluster.local:2181/
> got event WatchedEvent state:Disconnected type:None path:null path: null
> type: None
> which probably means the *event state is either Disconnected or Expired*;
> it is followed by this warning:
> WARN (zkConnectionManagerCallback-13-thread-1) [ ]
> o.a.s.c.c.ConnectionManager zkClient has disconnected
>
>
>
> *2.*
> Client session timed out, have not heard from server in 30018ms for
> sessionid 0x191fcbe0001
> (i.e., a session timeout from the ZkClient inside Solr)
> *And 3.*
> 2023-01-04 21:50:10.685 INFO (ShutdownMonitor) [ ] o.a.s.c.ZkController
> Publish this node as DOWN...
> 2023-01-04 21:50:10.685 INFO (ShutdownMonitor) [ ] o.a.s.c.ZkController
> Publish node=apache-solrcloud-0.apache-solrcloud-headless.production:8983_solr
> as DOWN
> Attached *050120223-solr-cloud-0.log*
>
>
>
> *Meanwhile, the ZooKeeper node logs the following at the time the Solr node
> gets restarted:*
>
> 2023-01-15 07:11:44,349 [myid:2] - WARN  
> [NIOWorkerThread-2:ZooKeeperServer@1384] - Connection request from old client 
> /10.70.26.0:54584; will be dropped if server is in r-o mode
> 2023-01-15 07:11:44,350 [myid:2] - INFO  
> [CommitProcessor:2:LearnerSessionTracker@116] - Committing global session 
> 0x200042f19cf130f
> 2023-01-15 07:11:44,352 [myid:2] - INFO  
> [RequestThrottler:QuorumZooKeeperServer@159] - Submitting global closeSession 
> request for session 0x200042f19cf130f
>
>
> Now we are at a point where *we know what pushes the node down when a Solr 
> node restarts: as the logs at [#2] show*, the session established between the 
> Solr node and ZooKeeper times out. While debugging this issue we have also 
> gone through a series of issues reported against the *ZooKeeper* version we 
> are using. In essence they describe slow leader elections: when a leader 
> becomes unhealthy or is stopped/restarted, the whole ZooKeeper cluster can go 
> down while a new election takes place, and the election takes long enough 
> that client sessions time out during that period.
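>
> For reference, the Solr-side timeout behind the "have not heard from server 
> in 30018ms" message is zkClientTimeout, which defaults to 30 seconds. A 
> sketch of giving sessions more headroom while the slow elections are 
> investigated (the value is illustrative, and note that ZooKeeper's 
> maxSessionTimeout, 20 * tickTime by default, caps what a client can ask for):
>
> # solr.in.sh
> # raise the ZooKeeper client/session timeout from the 30 s default
> ZK_CLIENT_TIMEOUT="60000"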
>
> We have tried to reproduce this in a local environment by setting up a Solr 
> and ZooKeeper cluster and forcefully restarting/stopping the leader ZooKeeper 
> node. The result, attached as *have-not-heard-back-local-cluster.log*, shows 
> that we could replicate [#2].
>
> Seeking help here to find out what could be the possible reasons for these 
> frequent restarts of the SolrCloud nodes.
> *Regards*
>
>

-- 
Vincenzo D'Amore


Re: What does "OK" mean in the status of the response of the ping handler?

2023-01-20 Thread Matthew Castrigno

Thank you Shawn.

My configuration is standalone mode.

We are going to use HAProxy and need a metric to determine when to switch 
between two servers. The servers are kept "in sync" with each other by copying 
the cores from one to another. This works for our use case, which only indexes 
new content on a schedule that we control.

Since sending this message I read that I can set up my own ping handler. There 
appears to be none in the default configuration, but the core does respond to 
the ping endpoint (/admin/ping), so there must be some implicit request handler 
configuration for that endpoint.
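
A quick way to exercise that implicit endpoint from the command line (the core 
name "mycore" is just a placeholder for one of ours):

curl "http://localhost:8983/solr/mycore/admin/ping?wt=json"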

So, I guess my first question is: what is that response to /admin/ping telling 
me? (What exactly does "status":"OK" mean?)
"responseHeader": {
"zkConnected": null,
"status": 0,
"QTime": 0,
"params": {
"q": "{!lucene}*:*",
"distrib": "false",
"df": "_text_",
"rows": "10",
"echoParams": "all",
"indent": "true",
"rid": "-6"
}
},
"status": "OK"
}

If I set up my own ping handler and configure it to do a query, does this 
indicate any greater health than just sending a query and getting back a 200 
response?

I am considering this example from the web:

<requestHandler name="/admin/ping" class="solr.PingRequestHandler">
  <lst name="invariants">
    <str name="qt">/search</str>
    <str name="q">some test query</str>
  </lst>
  <lst name="defaults">
    <str name="echoParams">all</str>
    <str name="df">article</str> <!-- assuming "df": the mail archive stripped the original tags -->
  </lst>
  <str name="healthcheckFile">server-enabled.txt</str>
</requestHandler>


What do you recommend for the ping handler request?

Also, I am wondering about the healthcheckFile; I can't find any information 
about what I can put in there other than enabling/disabling the ping endpoint. 
Where is there more information on this file? Searching the docs for 
healthcheckFile returns: No results found for query "healthcheckFile".

Thank you for your insights.




Matthew Castrigno

IHT Developer II

St. Luke’s Health System

•  208-859-4276
•  castr...@slhs.org


From: Shawn Heisey 
Sent: Thursday, January 19, 2023 7:26 PM
To: users@solr.apache.org 
Subject: Re: What does "OK" mean in the status of the response of the ping 
handler?


On 1/19/23 12:32, Matthew Castrigno wrote:
> Can I assume my index/core is not corrupt to the extent that SOLR can
> detect such condition?
> Is this the best heath indicator?
>
> https://solr.apache.org/guide/solr/latest/deployment-guide/ping.html#ping-api-examples

How do you have your ping handler configured in solrconfig.xml?  What is
the exact request you are sending to it?

Is Solr in cloud mode or standalone?

Thanks,
Shawn


--
"This message is intended for the use of the person or entity to which it is 
addressed and may contain information that is confidential or privileged, the 
disclosure of which is governed by applicable law. If the reader of this 
message is not the intended recipient, you are hereby notified that any 
dissemination, distribution, or copying of this information is strictly 
prohibited. If you have received this message by error, please notify us 
immediately and destroy the related message."


Re: What does "OK" mean in the status of the response of the ping handler?

2023-01-20 Thread Shawn Heisey

On 1/20/23 12:14, Matthew Castrigno wrote:


> Thank you Shawn.
> 
> My configuration is standalone mode.
> 
> We are going to use HAProxy and need a metric to determine when to 
> switch between two servers. The servers are kept "in sync" with each 
> other by copying the cores from one to another. This works for our use 
> case, which only indexes new content on a schedule that we control.


I believe it also returns an HTTP response code that's an error type if 
the query fails.  Because your ping handler has a q parameter, I think 
you can reasonably rely on the OK status as a good check, and I *THINK* 
you can also rely on the HTTP response code.
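
If it helps, wiring that into HAProxy might look roughly like this, just a 
sketch with the backend name, hostnames, ports, and core name as placeholders:

backend solr_backend
    mode http
    option httpchk GET /solr/mycore/admin/ping
    http-check expect status 200
    server solr1 solr1.example.com:8983 check
    server solr2 solr2.example.com:8983 check backup

With "backup" on the second server, HAProxy only sends traffic there while the 
first server's ping check is failing, which matches the switch-over setup you 
describe.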


I like to use the healthcheck option, and I do know that it returns a 
503 response code if the healthcheck flag is not enabled.  This is a 
really good way to deliberately take an otherwise healthy server out of 
rotation when you want to force queries to go elsewhere.
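
When healthcheckFile is configured, the handler just checks that the file 
exists; as far as I know the contents don't matter, and you can create and 
remove the file through the ping handler itself (core name is a placeholder):

curl "http://localhost:8983/solr/mycore/admin/ping?action=status"
curl "http://localhost:8983/solr/mycore/admin/ping?action=disable"
curl "http://localhost:8983/solr/mycore/admin/ping?action=enable"

disable deletes the file, so subsequent pings return 503 and the load balancer 
takes the server out of rotation; enable recreates it.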


Thanks,
Shawn