Is it possible for me to remove the node from the cluster, i.e. to undo this mess and get the cluster operating again?
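
I was thinking along the lines of one of the following, but I am not sure
which, if any, is safe for a node that joined as a seed and never
bootstrapped (the host ID and IP here are the rack4 node's, taken from the
nodetool status output quoted below, if I'm reading it correctly):

# on the new node itself, while Cassandra is still running on it, to make
# it leave the ring cleanly (it only reports 93.82 KiB so there should be
# almost nothing to stream):
nodetool decommission

# or, from one of the original nodes, after stopping Cassandra on the new node:
nodetool removenode c4e8b4a0-f014-45e6-afb4-648aad4f8500

# or, from one of the original nodes, as a forceful last resort:
nodetool assassinate xxx.xxx.xxx.24
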
On Mon, Apr 3, 2023 at 7:13 AM Carlos Diaz <crdiaz...@gmail.com> wrote:

> You can leave it in the seed list of the other nodes, just make sure it's
> not included in this node's seed list. However, if you do decide to fix
> the issue with the racks, first assassinate this node (nodetool
> assassinate <ip>) and update the rack name before you restart.
>
> On Sun, Apr 2, 2023 at 10:06 PM David Tinker <david.tin...@gmail.com> wrote:
>
>> It is also in the seeds list for the other nodes. Should I remove it
>> from those, restart them one at a time, then restart it?
>>
>> /etc/cassandra # grep -i bootstrap *
>> doesn't show anything, so I don't think I have auto_bootstrap set to
>> false.
>>
>> Thanks very much for the help.
>>
>> On Mon, Apr 3, 2023 at 7:01 AM Carlos Diaz <crdiaz...@gmail.com> wrote:
>>
>>> Just remove it from the seed list in the cassandra.yaml file and
>>> restart the node. Make sure that auto_bootstrap is set to true first
>>> though.
>>>
>>> On Sun, Apr 2, 2023 at 9:59 PM David Tinker <david.tin...@gmail.com> wrote:
>>>
>>>> So likely because I made it a seed node when I added it to the
>>>> cluster, it didn't do the bootstrap process. How can I recover from
>>>> this?
>>>>
>>>> On Mon, Apr 3, 2023 at 6:41 AM David Tinker <david.tin...@gmail.com> wrote:
>>>>
>>>>> Yes, replication factor is 3.
>>>>>
>>>>> I ran nodetool repair -pr on all the nodes (one at a time) and am
>>>>> still having issues getting data back from queries.
>>>>>
>>>>> I did make the new node a seed node.
>>>>>
>>>>> Re "rack4": I assumed that was just an indication of the physical
>>>>> location of the server for redundancy. This one is separate from the
>>>>> others, so I used rack4.
>>>>>
>>>>> On Mon, Apr 3, 2023 at 6:30 AM Carlos Diaz <crdiaz...@gmail.com> wrote:
>>>>>
>>>>>> I'm assuming that your replication factor is 3. If that's the case,
>>>>>> did you intentionally put this node in rack 4? Typically, you want
>>>>>> to add nodes in multiples of your replication factor in order to
>>>>>> keep the "racks" balanced. In other words, this node should have
>>>>>> been added to rack 1, 2 or 3.
>>>>>>
>>>>>> Having said that, you should be able to easily fix your problem by
>>>>>> running nodetool repair -pr on the new node.
>>>>>>
>>>>>> On Sun, Apr 2, 2023 at 8:16 PM David Tinker <david.tin...@gmail.com> wrote:
>>>>>>
>>>>>>> Hi All
>>>>>>>
>>>>>>> I recently added a node to my 3-node Cassandra 4.0.5 cluster and
>>>>>>> now many reads are not returning rows! What do I need to do to fix
>>>>>>> this? There weren't any errors in the logs or other problems that
>>>>>>> I could see. I expected the cluster to balance itself, but this
>>>>>>> hasn't happened (yet?). The nodes are similar so I have
>>>>>>> num_tokens=256 for each. I am using the Murmur3Partitioner.
>>>>>>>
>>>>>>> # nodetool status
>>>>>>> Datacenter: dc1
>>>>>>> ===============
>>>>>>> Status=Up/Down
>>>>>>> |/ State=Normal/Leaving/Joining/Moving
>>>>>>> --  Address          Load       Tokens  Owns (effective)  Host ID                               Rack
>>>>>>> UN  xxx.xxx.xxx.105  2.65 TiB   256     72.9%             afd02287-3f88-4c6f-8b27-06f7a8192402  rack3
>>>>>>> UN  xxx.xxx.xxx.253  2.6 TiB    256     73.9%             e1af72be-e5df-4c6b-a124-c7bc48c6602a  rack2
>>>>>>> UN  xxx.xxx.xxx.24   93.82 KiB  256     80.0%             c4e8b4a0-f014-45e6-afb4-648aad4f8500  rack4
>>>>>>> UN  xxx.xxx.xxx.107  2.65 TiB   256     73.2%             ab72f017-be96-41d2-9bef-a551dec2c7b5  rack1
>>>>>>>
>>>>>>> # nodetool netstats
>>>>>>> Mode: NORMAL
>>>>>>> Not sending any streams.
>>>>>>> Read Repair Statistics:
>>>>>>> Attempted: 0
>>>>>>> Mismatch (Blocking): 0
>>>>>>> Mismatch (Background): 0
>>>>>>> Pool Name        Active  Pending  Completed  Dropped
>>>>>>> Large messages   n/a     0        71754      0
>>>>>>> Small messages   n/a     0        8398184    14
>>>>>>> Gossip messages  n/a     0        1303634    0
>>>>>>>
>>>>>>> # nodetool ring
>>>>>>> Datacenter: dc1
>>>>>>> ==========
>>>>>>> Address          Rack   Status  State   Load       Owns    Token
>>>>>>>                                                             9189523899826545641
>>>>>>> xxx.xxx.xxx.24   rack4  Up      Normal  93.82 KiB  79.95%  -9194674091837769168
>>>>>>> xxx.xxx.xxx.107  rack1  Up      Normal  2.65 TiB   73.25%  -9168781258594813088
>>>>>>> xxx.xxx.xxx.253  rack2  Up      Normal  2.6 TiB    73.92%  -9163037340977721917
>>>>>>> xxx.xxx.xxx.105  rack3  Up      Normal  2.65 TiB   72.88%  -9148860739730046229
>>>>>>> xxx.xxx.xxx.107  rack1  Up      Normal  2.65 TiB   73.25%  -9125240034139323535
>>>>>>> xxx.xxx.xxx.253  rack2  Up      Normal  2.6 TiB    73.92%  -9112518853051755414
>>>>>>> xxx.xxx.xxx.105  rack3  Up      Normal  2.65 TiB   72.88%  -9100516173422432134
>>>>>>> ...
>>>>>>>
>>>>>>> This is causing a serious production issue. Please help if you can.
>>>>>>>
>>>>>>> Thanks
>>>>>>> David
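
PS: Just to check that I understand the seed list / auto_bootstrap part
correctly, on the new node I was planning to verify something like this
before restarting (the seeds value shown is only my guess at what it should
end up looking like, i.e. the three original nodes and not this one):

/etc/cassandra # grep -n -A4 'seed_provider' cassandra.yaml
# expecting to end up with a seeds line along the lines of:
#   - seeds: "xxx.xxx.xxx.105,xxx.xxx.xxx.253,xxx.xxx.xxx.107"
/etc/cassandra # grep -in 'auto_bootstrap' cassandra.yaml
# no match here, and as far as I know auto_bootstrap defaults to true when
# it is not set, so a clean restart should bootstrap properly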