Re: Load on Solr Nodes due to High GC

2024-06-20 Thread Deepak Goel
Can you please share the hardware details (server type, CPU speed and type, disk speed and type) and the GC configuration? Also, please post the output of top and iotop if you can.

Re: Load on Solr Nodes due to High GC

2024-06-20 Thread matthew sporleder
Are you seeing iowait, GC pauses, or something else? Do you commit often or in one big batch? > On Jun 20, 2024, at 12:26 AM, Saksham Gupta wrote: > > Hi All, > > We have been facing extra load incidents due to higher GC count and GC time, > causing higher response times and timeouts. > >

Zookeeper KeeperErrorCode = NodeExists

2024-06-20 Thread Sergio García Maroto
Hi All. I am facing a weird issue while upgrading Solr 8.11 to Solr 9. I have everything up and running, passing all kinds of unit and integration tests in my current CD process. I have a cluster of 3 machines on SolrCloud and it's all good and working. The problem happens when machines are restarted. Ei

Solr replication delays in IndexFetcher

2024-06-20 Thread Marcus Bergner
Hi, I'm using a traditional master/replica Solr (8.11) setup and I'm trying to tune Solr's autoCommitTimeout, autoSoftCommitTimeout on the Solr master and the pollInterval on the replicas to achieve an overall better indexing throughput while still maintaining an acceptably low indexing latency

are bots DoS'ing anyone else's Solr?

2024-06-20 Thread Dmitri Maziuk
Hi all, the latest mole in the eternal whack-a-mole game with web crawlers (GPTBot) DoS'ed our Solr again & I took a closer look at the logs. Here's what it looks like is happening: - the bot is hitting a URL backed by Solr search and starts following all permutations of facets and "next pag

Re: are bots DoS'ing anyone else's Solr?

2024-06-20 Thread matthew sporleder
Solr allows you to go to page=1000 or whatever, and bots will follow it, but there is rarely any business value in going so deep. You can come up with a scheme for cursormarks + caching (faster than paging) or just stop showing results past page 5-10. On Thu, Jun 20, 2024 at 11:39 AM Dmitri Maziuk
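A minimal sketch of the "stop showing results past page 5-10" suggestion, assuming a web front end that translates a 1-based page number into Solr's `start` and `rows` parameters. The function name, the cap of 10 pages, and the page size of 20 are illustrative assumptions, not values from the thread.

```python
MAX_PAGE = 10       # assumed cap; the thread suggests somewhere in the 5-10 range
ROWS_PER_PAGE = 20  # assumed page size


def paging_params(requested_page, max_page=MAX_PAGE, rows=ROWS_PER_PAGE):
    """Clamp a 1-based page number and return Solr start/rows parameters."""
    try:
        page = int(requested_page)
    except (TypeError, ValueError):
        # Anything that is not a number falls back to the first page.
        page = 1
    # Never go below page 1 or beyond the configured cap.
    page = max(1, min(page, max_page))
    return {"start": (page - 1) * rows, "rows": rows}
```

With these assumed defaults, a crawler asking for page 1000 gets the same `start` offset as page 10, so the cost of a request no longer grows with how deep the bot pages.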

Re: are bots DoS'ing anyone else's Solr?

2024-06-20 Thread Ohms, Jannis
I work in a library, so yes, we have a similar problem. Our Solr is used indirectly by a web application running on another server. We use fail2ban (https://wiki.archlinux.org/title/fail2ban) to block IPs that exceed a given number of requests per minute. From: Dmitri Maziuk Ge
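The rule described here (block an IP once it exceeds a given number of requests per minute) can be sketched as a sliding-window counter. This is not how fail2ban is implemented or configured: fail2ban works from log files and applies firewall bans. The class name, limits, and addresses below are illustrative assumptions that only show the counting logic.

```python
import time
from collections import defaultdict, deque


class RequestRateLimiter:
    """Allow at most max_requests per IP within a sliding time window."""

    def __init__(self, max_requests=60, window_seconds=60.0):
        self.max_requests = max_requests
        self.window_seconds = window_seconds
        self._hits = defaultdict(deque)  # ip -> timestamps of accepted requests

    def allow(self, ip, now=None):
        """Return True if the request is within the limit, False otherwise."""
        now = time.monotonic() if now is None else now
        hits = self._hits[ip]
        # Forget requests that have fallen out of the window.
        while hits and now - hits[0] >= self.window_seconds:
            hits.popleft()
        if len(hits) >= self.max_requests:
            return False
        hits.append(now)
        return True
```

A caller would check `allow(client_ip)` before forwarding a search to Solr and return an error (for example HTTP 429) when it returns False.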

Re: are bots DoS'ing anyone else's Solr?

2024-06-20 Thread Imran Chaudhry
+1 for fail2ban. @Dmitri Maziuk, if your Solr is behind Apache httpd, you may be interested in mod-evasive, which worked well against XML-RPC attacks on WordPress. You can combine it with fail2ban: https://ejectdisc.org/2015/08/08/admin-a-wordpress-site-running-on-debian-linux-learn-how-to-protect

Re: are bots DoS'ing anyone else's Solr?

2024-06-20 Thread Dmitri Maziuk
On 6/20/24 11:17, Imran Chaudhry wrote: ... If I were running on Linux I'd have them blocked at iptables-recent too... and if I were running on bare metal I'd put it on an SSD-cached ZVOL and likely not see Java choke on nio under load. But I am not. :( It sounds like your Solr is publicly

Re: 150x+ performance hit when number of rows <= 50 in a simple query

2024-06-20 Thread Michael Gibney
I've been unable to reproduce anything like this behavior. If you're really getting queryResultCache hits for these, then the field type/etc. of the field you're querying on shouldn't make a difference. The type/etc. of the return field (product_id) would be more likely to matter. I wonder what would hap

Re: How to bind embedded zookeeper to specific interface/ip?

2024-06-20 Thread Chris Hostetter
For historical reasons, Solr has always explicitly overridden the `clientPortAddress` -- but as of a few versions ago, there is a Solr setting (SOLR_ZK_EMBEDDED_HOST) that can be used to override Solr's override... https://solr.apache.org/guide/solr/latest/deployment-guide/taking-solr-to-p

Re: 150x+ performance hit when number of rows <= 50 in a simple query

2024-06-20 Thread Oleksandr Tkachuk
FYI: There is a solution in the last paragraph, but I still ran your tests, since the solution was found by "Cut and Try" and there is no deep understanding. >I wonder what would happen if you fully bypassed the query cache (i.e., >`q={!cache=false}product_type:"1"`? It does not help, there is n
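For reference, a minimal sketch of how the cache-bypassing query quoted above could be URL-encoded for a request. The helper name and the `rows` default are illustrative assumptions; the `{!cache=false}` local parameter and the `product_type` field come from the quoted message. The value is inserted as-is, so this sketch does no escaping of Solr special characters.

```python
from urllib.parse import urlencode


def uncached_query_params(field, value, rows=10):
    """Build an encoded query string whose q parameter bypasses Solr's query cache."""
    # Prefix the query with the {!cache=false} local parameter.
    q = '{!cache=false}%s:"%s"' % (field, value)
    return urlencode({"q": q, "rows": rows})


query_string = uncached_query_params("product_type", "1", rows=50)
```

The resulting string can be appended to a collection's `/select` handler URL; the host and collection name depend on the deployment and are not given in the thread.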