Re: Ranking of duplicate documents on solr

2024-07-30 Thread Deepak Goel
*Answer from Copilot:* Ah, the intricate dance of Solr shards and their cosmic collisions! Let’s unravel this like a digital detective, shall we? 🕵️‍♂️ When it comes to Solr and its distributed architecture, handling duplicate documents across shards can be as tricky as juggling flaming torches

Re: Solr Clients

2024-07-30 Thread Jan Høydahl
Hi, Most of us view all 9.x versions as production ready, as the 9.x line has been around a long time. In 9.x both clients are supported, and you may choose depending on whether you prefer the apache-http client or the jetty-http client. But in 10.0 we hope to use only the http2 client internally. Bugs may have b

Re: Cancel ongoing solr backups

2024-07-30 Thread Yuntong Qu
Thank you, Christine, for looking into this. I am also not aware of any way to cancel backups or async Collections API calls. I agree that, to implement this, it would make sense for something like CANCELASYNC to cancel ongoing async calls. I think adding cancellation of async calls is a good idea to consi

Re: Collection Backup API

2024-07-30 Thread Kevin Liang (BLOOMBERG/ 919 3RD A)
Hi Matt, We have. We've restored both to the existing cloud and to a different SolrCloud instance. We haven't run into too many issues with restore, but keep in mind that (like backups) the restore process is async and long running (many hours for clouds in the 100s of GB). Any additional resto

Re: Collection Backup API

2024-07-30 Thread mtn search
Thanks Kevin! Have you used the Restore API? If so, do you restore to the current SolrCloud or to a new SolrCloud instance (possibly for a DR scenario)? Any tips? Matt On Fri, Jul 26, 2024 at 2:42 PM Kevin Liang (BLOOMBERG/ 919 3RD A) < klian...@bloomberg.net> wrote: > We've been running regu

Significant Backup/Restore Performance Degradation for Large Collections

2024-07-30 Thread Hakan Özler
Hi! We're experiencing performance issues with backup and restore in recent Solr versions (9.5.0 and 9.6.1). In 9.2.1, we could take a backup of 10 TB of data in just an hour and a half. Currently, as of 9.5.0, taking a backup of the collection takes 7 hours! We're unable to make use of di

Re: Cancel ongoing solr backups

2024-07-30 Thread Christine Poerschke (BLOOMBERG/ LONDON)
Hi Yuntong, Thank you for asking this question! I was curious to learn more about this area of the code base and so had a little look around. It seems there is no documented way yet to cancel a backup but conceptually perhaps the async request id could be used for cancel logic, if a CANCELBACK

Re: Solr Clients

2024-07-30 Thread Dario.Viva
Hi Jan Høydahl, Indeed, you are correct. That is a great relief. So, what would be the advice on versions 9.0 to 9.5? Not using them in production, and using 9.6+ instead? With kind regards, Dario Viva

Re: Unusually High Number of timeouts on 1 Solr Shard

2024-07-30 Thread Saksham Gupta
Aman, I am using a collection with implicit routing. As per the Solr 8.10 documentation, the split shard API can be used only with hash-based routing. Any help on how we can plan a shard split for implicit routing? Is there a way to avoid creating a collection from scratch and taking it to production