Hi Matt,

We're currently running Solr 6.6.6 in SolrCloud mode. Depending on the application and load, we've been able to stably run upwards of 1,000 collections in a single SolrCloud without a problem. We try to keep the total replica count per Solr instance below 500, but we have run 600-700 replicas per instance without issue when the user load is light. Our Solr documents are pretty large, but we're able to handle 80-90M docs per instance with 700-800G of total index size.

300B docs does seem quite large, but if your docs aren't huge and you've got enough shards in your collection, then I wouldn't be surprised if it worked fine. The one thing we learned is that we had to change the number of threads Solr uses for loading replicas because of our high replica counts: with 8 threads, startup would take forever (look at 'coreLoadThreads').

At the very least, perf test something on a similar scale to what you're planning and see how it scales.

Best of luck,
Brian
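
For a rough sense of scale, here is a back-of-envelope sketch in Python. It only combines figures quoted in this thread: 300 billion docs, 80-90M docs per Solr instance (measured with large docs, so small docs may well fit more), 10-30M docs per collection, and roughly 1,000 collections per SolrCloud. The helper `ceil_div`, the function `estimate`, and the constant names are hypothetical, and the result ignores replication factor, so treat it as a starting point for a perf test rather than a sizing answer.

```python
# Back-of-envelope sizing. Every constant is an assumption taken from figures
# quoted in this thread, not a measurement of your data.

def ceil_div(numerator: int, denominator: int) -> int:
    """Integer ceiling division, avoiding float rounding on large counts."""
    return -(-numerator // denominator)

TOTAL_DOCS = 300_000_000_000                     # "300 billion plus docs"
DOCS_PER_INSTANCE = (80_000_000, 90_000_000)     # 80-90M docs per Solr instance
DOCS_PER_COLLECTION = (10_000_000, 30_000_000)   # 10-30M docs per collection
COLLECTIONS_PER_CLOUD = 1_000                    # run stably in one SolrCloud

def estimate(total_docs: int = TOTAL_DOCS) -> dict:
    """Return (low, high) ranges for instances, collections, and clouds.

    Counts a single copy of each document; multiply instance counts by the
    replication factor you plan to use.
    """
    instances = sorted(ceil_div(total_docs, d) for d in DOCS_PER_INSTANCE)
    collections = sorted(ceil_div(total_docs, d) for d in DOCS_PER_COLLECTION)
    clouds = [ceil_div(c, COLLECTIONS_PER_CLOUD) for c in collections]
    return {
        "instances": tuple(instances),
        "collections": tuple(collections),
        "clouds": tuple(clouds),
    }

if __name__ == "__main__":
    for name, (low, high) in estimate().items():
        print(f"{name}: {low:,} to {high:,}")
```

Under these assumptions the numbers come out to roughly 3,334 to 3,750 instances, 10,000 to 30,000 collections, and 10 to 30 SolrClouds, which is why testing at a similar scale before committing to a layout seems worthwhile.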

On Mon, Jun 28, 2021 at 12:50 PM mtn search <search...@gmail.com> wrote:

> I am guessing the consideration of hitting the limit of the number of
> collections within a SolrCloud is not a common experience. I wanted to
> raise this question again if perhaps anyone has any lessons learned or
> things to consider. We are currently planning work to migrate 300 billion
> plus docs on the master nodes of a legacy master/slave installation to
> SolrCloud. I figure that we will push the limits of a single SolrCloud
> instance.
>
> Thanks again,
> Matt
>
> On Fri, Jun 25, 2021 at 10:15 AM mtn search <search...@gmail.com> wrote:
>
> > Hello,
> >
> > I am interested to learn what others have experienced in terms of hitting
> > a limit for the number of collections supported by a SolrCloud instance.
> >
> > Also, does anyone have any tips/questions for evaluating when to create a
> > new SolrCloud and begin adding new collections to it rather than grow the
> > original SolrCloud instance?
> >
> > I realize there are likely a number of characteristics of a SolrCloud to
> > evaluate. My guess is network resources will be the key factor. I am
> > thinking of a SolrCloud with a 5, or 7 node Zookeeper ensemble. With
> > Collections containing 10-30 million docs, small doc size, heavy indexing,
> > small query load.
> >
> > Thanks,
> > Matt
> >
>

--
Brian Lininger
Technical Architect, Infrastructure & Search
Veeva Systems
brian.linin...@veeva.com
Zoom: https://veeva.zoom.us/j/8113896271
www.veeva.com