Should I use assassinate or removenode, given that there is some data on the node? Or will that data be found on the other nodes? Sorry for all the questions, but I really don't want to mess up.
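[Editor's note: for reference, a minimal sketch of the two commands under discussion, using the new node's Host ID and IP as shown in the nodetool status output quoted further down. removenode takes a Host ID, is intended for a node that is already down, and re-replicates that node's ranges from the remaining replicas; assassinate takes an IP address and simply drops the node from gossip without streaming anything.

# removenode: pass the Host ID of the node to remove, run from one of the live nodes
nodetool removenode c4e8b4a0-f014-45e6-afb4-648aad4f8500

# assassinate: pass the node's IP address to forcibly remove it from gossip
nodetool assassinate xxx.xxx.xxx.24

Which of the two is appropriate depends on whether the node actually holds data the rest of the cluster needs, which is exactly the question being asked here.]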
On Mon, Apr 3, 2023 at 7:21 AM Carlos Diaz <crdiaz...@gmail.com> wrote:

> That's what nodetool assassinate will do.
>
> On Sun, Apr 2, 2023 at 10:19 PM David Tinker <david.tin...@gmail.com> wrote:
>
>> Is it possible for me to remove the node from the cluster, i.e. to undo
>> this mess and get the cluster operating again?
>>
>> On Mon, Apr 3, 2023 at 7:13 AM Carlos Diaz <crdiaz...@gmail.com> wrote:
>>
>>> You can leave it in the seed list of the other nodes; just make sure
>>> it's not included in this node's own seed list. However, if you do decide
>>> to fix the issue with the racks, first assassinate this node (nodetool
>>> assassinate <ip>) and update the rack name before you restart.
>>>
>>> On Sun, Apr 2, 2023 at 10:06 PM David Tinker <david.tin...@gmail.com> wrote:
>>>
>>>> It is also in the seeds list for the other nodes. Should I remove it
>>>> from those, restart them one at a time, then restart it?
>>>>
>>>> /etc/cassandra # grep -i bootstrap *
>>>> doesn't show anything, so I don't think I have auto_bootstrap false.
>>>>
>>>> Thanks very much for the help.
>>>>
>>>> On Mon, Apr 3, 2023 at 7:01 AM Carlos Diaz <crdiaz...@gmail.com> wrote:
>>>>
>>>>> Just remove it from the seed list in the cassandra.yaml file and
>>>>> restart the node. Make sure that auto_bootstrap is set to true first
>>>>> though.
>>>>>
>>>>> On Sun, Apr 2, 2023 at 9:59 PM David Tinker <david.tin...@gmail.com> wrote:
>>>>>
>>>>>> So likely because I made it a seed node when I added it to the
>>>>>> cluster, it didn't do the bootstrap process. How can I recover from this?
>>>>>>
>>>>>> On Mon, Apr 3, 2023 at 6:41 AM David Tinker <david.tin...@gmail.com> wrote:
>>>>>>
>>>>>>> Yes, the replication factor is 3.
>>>>>>>
>>>>>>> I ran nodetool repair -pr on all the nodes (one at a time) and am
>>>>>>> still having issues getting data back from queries.
>>>>>>>
>>>>>>> I did make the new node a seed node.
>>>>>>>
>>>>>>> Re "rack4": I assumed that was just an indication of the physical
>>>>>>> location of the server for redundancy. This one is separate from the
>>>>>>> others, so I used rack4.
>>>>>>>
>>>>>>> On Mon, Apr 3, 2023 at 6:30 AM Carlos Diaz <crdiaz...@gmail.com> wrote:
>>>>>>>
>>>>>>>> I'm assuming that your replication factor is 3. If that's the
>>>>>>>> case, did you intentionally put this node in rack 4? Typically, you
>>>>>>>> want to add nodes in multiples of your replication factor in order to
>>>>>>>> keep the "racks" balanced. In other words, this node should have been
>>>>>>>> added to rack 1, 2 or 3.
>>>>>>>>
>>>>>>>> Having said that, you should be able to easily fix your problem by
>>>>>>>> running a nodetool repair -pr on the new node.
>>>>>>>>
>>>>>>>> On Sun, Apr 2, 2023 at 8:16 PM David Tinker <david.tin...@gmail.com> wrote:
>>>>>>>>
>>>>>>>>> Hi All
>>>>>>>>>
>>>>>>>>> I recently added a node to my 3-node Cassandra 4.0.5 cluster and
>>>>>>>>> now many reads are not returning rows! What do I need to do to fix
>>>>>>>>> this? There weren't any errors in the logs or other problems that I
>>>>>>>>> could see. I expected the cluster to balance itself but this hasn't
>>>>>>>>> happened (yet?). The nodes are similar, so I have num_tokens=256 for
>>>>>>>>> each. I am using the Murmur3Partitioner.
>>>>>>>>>
>>>>>>>>> # nodetool status
>>>>>>>>> Datacenter: dc1
>>>>>>>>> ===============
>>>>>>>>> Status=Up/Down
>>>>>>>>> |/ State=Normal/Leaving/Joining/Moving
>>>>>>>>> --  Address          Load       Tokens  Owns (effective)  Host ID                               Rack
>>>>>>>>> UN  xxx.xxx.xxx.105  2.65 TiB   256     72.9%             afd02287-3f88-4c6f-8b27-06f7a8192402  rack3
>>>>>>>>> UN  xxx.xxx.xxx.253  2.6 TiB    256     73.9%             e1af72be-e5df-4c6b-a124-c7bc48c6602a  rack2
>>>>>>>>> UN  xxx.xxx.xxx.24   93.82 KiB  256     80.0%             c4e8b4a0-f014-45e6-afb4-648aad4f8500  rack4
>>>>>>>>> UN  xxx.xxx.xxx.107  2.65 TiB   256     73.2%             ab72f017-be96-41d2-9bef-a551dec2c7b5  rack1
>>>>>>>>>
>>>>>>>>> # nodetool netstats
>>>>>>>>> Mode: NORMAL
>>>>>>>>> Not sending any streams.
>>>>>>>>> Read Repair Statistics:
>>>>>>>>> Attempted: 0
>>>>>>>>> Mismatch (Blocking): 0
>>>>>>>>> Mismatch (Background): 0
>>>>>>>>> Pool Name        Active  Pending  Completed  Dropped
>>>>>>>>> Large messages   n/a     0        71754      0
>>>>>>>>> Small messages   n/a     0        8398184    14
>>>>>>>>> Gossip messages  n/a     0        1303634    0
>>>>>>>>>
>>>>>>>>> # nodetool ring
>>>>>>>>> Datacenter: dc1
>>>>>>>>> ==========
>>>>>>>>> Address          Rack   Status  State   Load       Owns    Token
>>>>>>>>>                                                            9189523899826545641
>>>>>>>>> xxx.xxx.xxx.24   rack4  Up      Normal  93.82 KiB  79.95%  -9194674091837769168
>>>>>>>>> xxx.xxx.xxx.107  rack1  Up      Normal  2.65 TiB   73.25%  -9168781258594813088
>>>>>>>>> xxx.xxx.xxx.253  rack2  Up      Normal  2.6 TiB    73.92%  -9163037340977721917
>>>>>>>>> xxx.xxx.xxx.105  rack3  Up      Normal  2.65 TiB   72.88%  -9148860739730046229
>>>>>>>>> xxx.xxx.xxx.107  rack1  Up      Normal  2.65 TiB   73.25%  -9125240034139323535
>>>>>>>>> xxx.xxx.xxx.253  rack2  Up      Normal  2.6 TiB    73.92%  -9112518853051755414
>>>>>>>>> xxx.xxx.xxx.105  rack3  Up      Normal  2.65 TiB   72.88%  -9100516173422432134
>>>>>>>>> ...
>>>>>>>>>
>>>>>>>>> This is causing a serious production issue. Please help if you can.
>>>>>>>>>
>>>>>>>>> Thanks
>>>>>>>>> David
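[Editor's note: pulling the advice in this thread together, a rough sketch of how re-adding the node could go once it has been removed. The file paths, seed IPs, and rack file are assumptions based on the defaults and on the addresses in the status output above; adjust for the snitch actually configured.

# 1. Remove the mis-added node from the ring (run from any live node), as advised above:
nodetool assassinate xxx.xxx.xxx.24

# 2. On the new node, before restarting, edit cassandra.yaml so that its own IP
#    is NOT in its seeds list (a node that lists itself as a seed skips bootstrap)
#    and auto_bootstrap is not set to false. For example, using the existing
#    nodes as seeds:
#
#    seed_provider:
#        - class_name: org.apache.cassandra.locator.SimpleSeedProvider
#          parameters:
#              - seeds: "xxx.xxx.xxx.105,xxx.xxx.xxx.253,xxx.xxx.xxx.107"
#    auto_bootstrap: true

# 3. Put the node in one of the existing racks (rack1, rack2 or rack3), e.g. in
#    cassandra-rackdc.properties if GossipingPropertyFileSnitch is in use.

# 4. Clear the node's old data directories so it rejoins with a fresh identity,
#    start Cassandra, and watch the bootstrap streams:
nodetool netstats

The key point from the thread is step 2: because the node listed itself as a seed, it joined without bootstrapping, so it owns token ranges for which it holds no data, which is why reads are coming back empty.]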