SOLR caching Zookeeper IP addresses
Hi All,

I have a head-scratcher at the moment with our SOLRCloud deployment.

We have several SOLR docker images running on Alpine Linux, deployed as AWS ECS Fargate tasks.

We also have separate Zookeeper linux images running on AWS ECS Fargate tasks as a quorum, i.e. 3 working together, orchestrating SOLR leader elections and holding SOLR configurations for collections.

SOLR tasks and Zookeeper tasks resolve their endpoint IP addresses via DNS (AWS Route53).

As is the way with AWS, they often schedule forced redeployments of Fargate services when they update the underlying platforms for patching and security updates.

Our problem is that when the Zookeeper ECS tasks are redeployed and given new IP addresses, SOLR does not appear to notice, and doesn't seem to query DNS for the new Zookeeper addresses.

I have tested from within the running SOLR ECS tasks that they can resolve the Route53 names of other ECS services after those services restart, which they do. So it doesn't appear to be an issue with the Alpine OS that the SOLR images use to run the SOLR java app.

I also tried adding the following JVM directives to the Jetty startup script that the SOLR tasks start with...

-Dsun.net.inetaddr.ttl=0
-Dnetworkaddress.cache.ttl=0

I deployed this change, then, after SOLR had come back up and was reporting all OK, redeployed the Zookeeper ECS tasks. SOLR behaved the same way, as if it was caching the IP addresses for ZK somewhere.

We are running version 4.10.2 of SOLR, which we cannot easily upgrade from, as it would require our development team to significantly refactor our core application that uses SOLR.

Does anybody know if there are any configuration options within SOLR itself, or where in the SOLR code it may be caching the ZK IPs?

Any help would be much appreciated!!

Cheers,
Daz
Re: SOLR caching Zookeeper IP addresses
We have hit similar issues in AWS as well. To work around it, we had to add 'networkaddress.cache.ttl=60' to jre/lib/security/java.security. I don't think you want to eliminate DNS caching completely, as that will impact performance; setting it to a reasonable timeout should solve your problem while maintaining performance.

On Thu, Apr 11, 2024 at 7:26 AM Darren Kukulka wrote:
> I also tried adding the following JVM directives to the Jetty startup
> script that the SOLR tasks start with...
>
> -Dsun.net.inetaddr.ttl=0
> -Dnetworkaddress.cache.ttl=0

--
Brian Lininger
Technical Architect, Infrastructure & Search
Veeva Systems
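For reference, here is a minimal Java sketch of the same setting applied programmatically. As far as I know, networkaddress.cache.ttl is a java.security property rather than an ordinary system property, so passing it as -Dnetworkaddress.cache.ttl=... is not expected to have any effect, whereas sun.net.inetaddr.ttl is the system-property fallback. The class and method names below (DnsCacheTtl, setTtlSeconds, currentTtl) are made up for illustration and are not part of SOLR or Jetty.

```java
import java.security.Security;

// Hypothetical helper, not part of SOLR: sets the JVM DNS cache TTL in code
// instead of editing jre/lib/security/java.security.
class DnsCacheTtl {

    // Sets the TTL (in seconds) for successful DNS lookups.
    // Assumption: this runs early in startup, before the JVM performs its
    // first hostname lookup, because the cache policy is read when the
    // networking classes are first initialised.
    static void setTtlSeconds(int seconds) {
        Security.setProperty("networkaddress.cache.ttl", Integer.toString(seconds));
    }

    // Returns the value currently held in the security property (may be null).
    static String currentTtl() {
        return Security.getProperty("networkaddress.cache.ttl");
    }

    public static void main(String[] args) {
        setTtlSeconds(60);
        System.out.println("networkaddress.cache.ttl (security property) = " + currentTtl());
        // The -D flag only populates system properties, shown here for comparison.
        System.out.println("sun.net.inetaddr.ttl (system property) = "
                + System.getProperty("sun.net.inetaddr.ttl"));
    }
}
```

Note that this only controls the JVM's own address cache; it does not help if the application resolves a hostname once and then holds on to the resulting address itself.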
Re: SOLR caching Zookeeper IP addresses
You haven't said what version of Zookeeper you are running, but since the version of Solr you are using is 10 years old, I'm going to guess that the Zookeeper version you are using is at least that old as well :)

In which case you are probably encountering a situation (well known at the time) in which the Zookeeper client never attempted to re-resolve the hostnames of the servers when/if it needed to reconnect...

https://news.ycombinator.com/item?id=11670088
https://issues.apache.org/jira/browse/ZOOKEEPER-2184
https://issues.apache.org/jira/browse/ZOOKEEPER-1506

: From: Darren Kukulka
: Reply-To: users@solr.apache.org
: To: users@solr.apache.org
: Subject: SOLR caching Zookeeper IP addresses
:
: We are running version 4.10.2 of SOLR, which we cannot easily upgrade from,
: as it would require significant refactoring of our core application that
: uses SOLR by our development team

-Hoss
http://www.lucidworks.com/
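To illustrate the difference described above, here is a small Java sketch contrasting a client that resolves its server hostnames once with one that re-resolves on every connection attempt. This is not Zookeeper's actual code: the class ReResolvingHostList, its methods, and the pluggable resolver function are all hypothetical, and DNS is simulated with a plain function so the example runs without network access.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Function;

// Hypothetical illustration only; not taken from the Zookeeper client.
class ReResolvingHostList {
    private final List<String> hostnames;
    private final Function<String, String> resolver; // hostname -> IP address
    private int next = 0;

    ReResolvingHostList(List<String> hostnames, Function<String, String> resolver) {
        this.hostnames = new ArrayList<>(hostnames);
        this.resolver = resolver;
    }

    // "Resolve once" behaviour: every hostname is looked up a single time and
    // the caller keeps the addresses, so later DNS changes are never seen.
    List<String> resolveOnce() {
        List<String> addresses = new ArrayList<>();
        for (String host : hostnames) {
            addresses.add(resolver.apply(host));
        }
        return addresses;
    }

    // "Re-resolve" behaviour: each connection attempt picks the next hostname
    // (round-robin) and looks it up again, so a changed record is picked up.
    String nextAddress() {
        String host = hostnames.get(next);
        next = (next + 1) % hostnames.size();
        return resolver.apply(host);
    }

    public static void main(String[] args) {
        Map<String, String> fakeDns = new HashMap<>();
        fakeDns.put("zk1.example.internal", "10.0.0.1"); // made-up name and address

        List<String> hosts = new ArrayList<>();
        hosts.add("zk1.example.internal");
        ReResolvingHostList list = new ReResolvingHostList(hosts, fakeDns::get);

        List<String> snapshot = list.resolveOnce();
        fakeDns.put("zk1.example.internal", "10.0.0.9"); // simulate a redeployed task

        System.out.println("resolved once: " + snapshot.get(0));
        System.out.println("re-resolved:   " + list.nextAddress());
    }
}
```

In a real client the resolver would wrap something like InetAddress.getByName(host).getHostAddress(), at which point the JVM-level DNS cache TTL also comes into play.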