SOLR caching Zookeeper IP addresses

2024-04-11 Thread Darren Kukulka
Hi All,

I have a head-scratcher at the moment with our SOLRCloud deployment.

We have several SOLR docker images running on Alpine Linux, deployed as AWS
ECS Fargate tasks.

We also have separate Zookeeper linux images running on AWS ECS Fargate
tasks, as a quorum - i.e. 3 working together, orchestrating SOLR leader
elections and holding SOLR configurations for collections.

SOLR tasks and Zookeeper tasks resolve their endpoint IP addresses via DNS
(AWS Route53).

As is the way with AWS, they oftentimes schedule forced redeployments of
Fargate services when they update the underlying platforms for patching and
security updates.

When the Zookeeper ECS tasks are redeployed and given new IP addresses,
SOLR appears not to realise this and doesn't seem to be querying DNS for
the new addresses of Zookeeper.

I have tested, from within the running SOLR ECS tasks, that they can resolve
the Route53 names of other ECS services when those services restart, which
they do. So it doesn't appear to be an issue with the Alpine OS that the
SOLR images are using to run the SOLR java app.

I also tried adding the following JVM directives to the Jetty startup
script that the SOLR tasks start with...

-Dsun.net.inetaddr.ttl=0
-Dnetworkaddress.cache.ttl=0

I deployed this change, then after SOLR had come back up and was reporting
all OK, redeployed the Zookeeper ECS tasks. SOLR behaved the same way, as
if it was caching the IP addresses for ZK somewhere.

We are running version 4.10.2 of SOLR, which we cannot easily upgrade from,
as it would require significant refactoring by our development team of our
core application that uses SOLR.

Does anybody know if there are any configuration options within SOLR
itself, or where in the SOLR code it may be caching the ZK IPs?

Any help would be much appreciated!!

Cheers,
Daz
--


Re: SOLR caching Zookeeper IP addresses

2024-04-11 Thread Brian Lininger
We have hit similar issues in AWS as well. To work around them we had to add
'networkaddress.cache.ttl=60' to jre/lib/security/java.security. I don't
think you want to completely eliminate DNS caching, as it will impact
performance, so setting it to a reasonable timeout should solve your
problem while maintaining performance.
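
For anyone who cannot edit java.security inside the image, here is a minimal
sketch of the same setting applied in code. It relies on
'networkaddress.cache.ttl' being a JVM security property: passing it as
-Dnetworkaddress.cache.ttl=0 only creates a system property with that name,
whereas 'sun.net.inetaddr.ttl' is the system property the JDK consults as a
fallback. The class and method names are made up for illustration, and the
assumption is that this code runs before the JVM performs its first host
name lookup, since the cache policy is read once.

```java
import java.security.Security;

public class DnsCacheTtl {
    /** Reports both TTL settings as this JVM currently sees them. */
    static String report() {
        // Security property, normally configured in the java.security file.
        String securityTtl = Security.getProperty("networkaddress.cache.ttl");
        // System property, settable with -Dsun.net.inetaddr.ttl=<seconds>.
        String fallbackTtl = System.getProperty("sun.net.inetaddr.ttl");
        return "networkaddress.cache.ttl=" + securityTtl
                + ", sun.net.inetaddr.ttl=" + fallbackTtl;
    }

    /** Sets the positive DNS cache TTL in seconds for this JVM. */
    static void setTtlSeconds(int seconds) {
        Security.setProperty("networkaddress.cache.ttl", Integer.toString(seconds));
    }

    public static void main(String[] args) {
        System.out.println("before: " + report());
        setTtlSeconds(60); // same value as the java.security entry above
        System.out.println("after:  " + report());
    }
}
```

Running report() inside the Solr JVM is also a quick way to confirm which of
the two settings actually took effect.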


-- 


*Brian Lininger*
Technical Architect, Infrastructure & Search
*Veeva Systems *
brian.linin...@veeva.com


Re: SOLR caching Zookeeper IP addresses

2024-04-11 Thread Chris Hostetter


You haven't said what version of ZooKeeper you are running, but since the
version of Solr you are using is 10 years old, I'm going to guess that the
ZooKeeper version you are using is at least that old as well :)

In which case you are probably encountering a well-known (at the time)
situation with ZooKeeper: the client never attempted to re-resolve the
hostnames of the servers when/if it needed to reconnect...

https://news.ycombinator.com/item?id=11670088
https://issues.apache.org/jira/browse/ZOOKEEPER-2184
https://issues.apache.org/jira/browse/ZOOKEEPER-1506
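
To make the distinction concrete, here is a rough sketch using only JDK
classes. It is not the ZooKeeper client code, just an illustration of the
general pattern: an InetSocketAddress built from a host name is resolved
once at construction and keeps that IP for its lifetime, so a client that
holds on to such objects never sees a DNS change, regardless of the JVM
cache TTL. The class name, the helper and the host name are hypothetical.

```java
import java.net.InetSocketAddress;

public class ReResolve {
    /**
     * Builds a fresh address from the original host name, which triggers a
     * new lookup (still subject to the JVM's own DNS cache TTL).
     */
    static InetSocketAddress reResolve(InetSocketAddress old) {
        // getHostString() returns the host name without a reverse lookup.
        return new InetSocketAddress(old.getHostString(), old.getPort());
    }

    public static void main(String[] args) {
        // Hypothetical ZooKeeper host name; replace with a real one to try it.
        String host = args.length > 0 ? args[0] : "localhost";

        // Resolved once, here. The IP inside this object never changes.
        InetSocketAddress cached = new InetSocketAddress(host, 2181);
        System.out.println("resolved at startup: " + cached.getAddress());

        // What a reconnect would need to do to pick up a new IP.
        InetSocketAddress fresh = reResolve(cached);
        System.out.println("resolved on reconnect: " + fresh.getAddress());
    }
}
```

If the client in use only ever does the first step, then short of upgrading
the ZooKeeper client jar, restarting the Solr JVM is what forces a fresh
lookup.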



-Hoss
http://www.lucidworks.com/