My two cents: when I first started using Solr, it took me some time to understand when to add shards and when to add replicas. The distinction that helped me:
Speed of a single isolated query when the system is idle -VS- total throughput of the system when many queries are executing.

Sharding divides data into smaller pieces and puts each piece (shard) on a separate computer. Now for a single incoming query, all of those nodes need to be searched, because Solr has no idea which piece (shard) may contain the matching documents. If your data was very big before sharding, this improves the response time of a single isolated query. If your data wasn’t big to start with, this *may* even hurt performance because of the extra network trips. (Big = index size is larger than the available RAM on the machine.)

Adding replicas creates a copy of your data and puts that copy on a separate computer. Now for a single incoming query, still, a single node is selected and queried. A single isolated query doesn’t get faster, but overall throughput increases (with two copies, the system can serve roughly 2x more queries per second before coughing).

There are many “but if”s and different cases with this line of thought, but I hope it explains the main idea to someone new to Solr.

A confusing thing for someone new is the relation between collection, replica and shard. A collection has a one-to-many relationship with shards, and a shard has a one-to-many relationship with replicas.

Hope this is useful.

From: Ilan Ginzburg
Sent: Thursday, August 3, 2023 2:55 PM
To: users@solr.apache.org
Subject: Re: Add a new Shard to the collection

I don't think adding shards (even from 1 to 2) is the solution.

You need enough replicas so all your nodes share the load, but with such small shards you likely don't need more than 1. If your nodes are saturated by traffic, you need more nodes (and more replicas so that the added nodes have a replica as well).

Ilan

On Thu, Aug 3, 2023, 8:23 AM HariBabu kuruva <hari2708.kur...@gmail.com> wrote:

> Hi Ilan,
>
> Thank you for your reply.
>
> Application requests are facing connection failures a couple of times.
> So our DEV team requested to add more shards as they are expecting more
> read-heavy queries in the future.
>
> Initially they requested two shards and now they are asking for one more
> shard (3 shards). We have a total of 6 Solr nodes available.
>
> The disk sizes consumed by the two currently created shards are around
> 2.5 GB each.
>
> Please let me know if any other information is required.
>
> On Wed, Aug 2, 2023 at 11:29 PM Ilan Ginzburg <ilans...@gmail.com> wrote:
>
> > Well, if the size of the two shards you now have is equivalent, you will
> > not be able to get to 3 balanced (in size) shards.
> >
> > If one of the two seems to get more data (is larger), split that one.
> > This might be the case if you use fancy routing for deciding which doc
> > goes where.
> >
> > Otherwise, to get to 3 similarly sized shards you need to explicitly
> > specify the ranges during the split.
> > Either create one subshard with twice the range of the other so you can
> > split the larger one into two and end up with 3 similarly sized shards,
> > or split the initial shard into 3 subshards in one go (I've never tried
> > splitting into more than 2 shards though, so I end up with a power of 2
> > number of balanced shards, assuming uniform distribution of docs into
> > the hash range).
> >
> > But I assume your real goal is not having a specific number of shards.
> > What issues are you running into in your current setup that you're
> > trying to address?
> > You mentioned "better performance" but performance of what? Query?
> > Indexing? Are you running out of memory? CPU? Are you adding nodes
> > (servers) and/or replicas as you're increasing the number of shards?
> >
> > What has improved as you moved from one to two shards? Why decide that
> > you then want to have 3 shards and not stay at 2 or move to 4?
> >
> > Ilan
> >
> > On Wed, Aug 2, 2023, 5:48 PM HariBabu kuruva <hari2708.kur...@gmail.com>
> > wrote:
> >
> > > Hi All,
> > >
> > > I did the sharding and split shard1 into shard1_0 and shard1_1.
> > > I want to have one more shard (3 shards). In this case, which shard
> > > should I split? Please advise.
> > >
> > > On Tue, Aug 1, 2023 at 11:17 AM HariBabu kuruva
> > > <hari2708.kur...@gmail.com> wrote:
> > >
> > > > ++ FYI, I can see the old shard automatically removed.
> > > >
> > > > On Mon, Jul 31, 2023 at 11:39 AM HariBabu kuruva
> > > > <hari2708.kur...@gmail.com> wrote:
> > > >
> > > >> Thanks for your reply.
> > > >>
> > > >> I am a little bit worried about PROD. Can I go ahead and do the same
> > > >> steps in PROD? Do I need to take any backups or any steps before
> > > >> doing this?
> > > >>
> > > >> On Sat, Jul 29, 2023 at 8:51 AM Mikhail Khludnev <m...@apache.org>
> > > >> wrote:
> > > >>
> > > >>> Hello Hari.
> > > >>> If new shards are handling queries and updates well it's ok to have
> > > >>> old shard inactive.
> > > >>> You can request DELETESHARD to reclaim the disk space.
> > > >>>
> > > >>> On Mon, Jul 24, 2023 at 6:19 PM HariBabu kuruva
> > > >>> <hari2708.kur...@gmail.com> wrote:
> > > >>>
> > > >>> > Hi All,
> > > >>> >
> > > >>> > I would like to add a new shard to the existing collection to
> > > >>> > have better performance. Currently we have one shard.
> > > >>> >
> > > >>> > Solr - 8.11.1
> > > >>> > Nodes(servers) - 10 (Non prod - 4 nodes)
> > > >>> > Zookeepers-5
> > > >>> >
> > > >>> > I have tried the SPLITSHARD command in one of the non prod
> > > >>> > environments.
> > > >>> >
> > > >>> > https://solrserver.corp.company.com:8981/solr/admin/collections?action=SPLITSHARD&collection=abcStore&shard=shard1
> > > >>> >
> > > >>> > Now I can see a total of 3 shards:
> > > >>> > Shard1
> > > >>> > Shard1_0
> > > >>> > Shard1_1
> > > >>> >
> > > >>> > But Shard1 is shown as inactive. Please let me know if we need to
> > > >>> > remove this.
> > > >>> >
> > > >>> > Please help me if this is the correct way of splitting the shard.
> > > >>> > Are there any impacts to the data because of this?
> > > >>> > What are the measures to be taken while doing this in a PROD
> > > >>> > environment?
> > > >>> >
> > > >>> > --
> > > >>> >
> > > >>> > Thanks and Regards,
> > > >>> > Hari
> > > >>> > Mobile:9790756568
> > > >>>
> > > >>> --
> > > >>> Sincerely yours
> > > >>> Mikhail Khludnev
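
For anyone who wants to try Ilan's "one subshard with twice the range of the other" suggestion, here is a rough Python sketch of the arithmetic. It only builds the request URL and sends nothing. It assumes the parent shard covers the full 32-bit hash range (80000000-7fffffff), which is what a single-shard collection normally shows in its cluster state, and it assumes the SPLITSHARD `ranges` parameter takes comma-separated hexadecimal ranges. The helper names (`split_range`, `format_ranges`, `splitshard_url`) are made up for illustration; the host, collection and shard names are the ones from Hari's original mail. Please check the ranges against your own cluster state before using anything like this.

```python
from urllib.parse import urlencode

HASH_MIN = -2**31      # 0x80000000 read as a signed 32-bit value
HASH_MAX = 2**31 - 1   # 0x7fffffff


def to_hex(value):
    """Format a signed 32-bit hash value as two's-complement hex (no 0x prefix)."""
    return format(value & 0xFFFFFFFF, "x")


def split_range(lo, hi, weights):
    """Split the inclusive range [lo, hi] into contiguous parts proportional to weights."""
    total = hi - lo + 1
    weight_sum = sum(weights)
    parts, start, acc = [], lo, 0
    for i, weight in enumerate(weights):
        acc += weight
        # The last part always ends at hi so that no hash value is left uncovered.
        end = hi if i == len(weights) - 1 else lo + total * acc // weight_sum - 1
        parts.append((start, end))
        start = end + 1
    return parts


def format_ranges(parts):
    """Render parts as 'start-end,start-end' in hex (assumed format of `ranges`)."""
    return ",".join(f"{to_hex(a)}-{to_hex(b)}" for a, b in parts)


def splitshard_url(base, collection, shard, parts):
    """Build (but do not send) a SPLITSHARD request with explicit ranges."""
    params = {
        "action": "SPLITSHARD",
        "collection": collection,
        "shard": shard,
        "ranges": format_ranges(parts),
    }
    return f"{base}/admin/collections?{urlencode(params, safe=',')}"


if __name__ == "__main__":
    # Step 1: split the parent into one third and two thirds of the hash range.
    first_split = split_range(HASH_MIN, HASH_MAX, [1, 2])
    print(splitshard_url(
        "https://solrserver.corp.company.com:8981/solr",
        "abcStore",
        "shard1",
        first_split,
    ))
    # Step 2: the larger sub-shard can later be split evenly (no ranges needed),
    # which leaves three ranges of nearly equal size.
    larger_lo, larger_hi = first_split[1]
    print(format_ranges(split_range(larger_lo, larger_hi, [1, 1])))
```

Equal hash ranges only give similarly sized shards if documents are spread uniformly over the hash range, as Ilan already noted. And per the rest of the thread, for shards of about 2.5 GB more replicas are probably the better first step than more shards.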