Hi Fd,

I tried this on a 3-node cluster. I killed node2; both node1 and node3 reported node2 as DN. Then I killed node1 and node3 as well, and after restarting them node2 was reported like this:
[root@spark-master-1 /]# nodetool status
Datacenter: DC1
===============
Status=Up/Down
|/ State=Normal/Leaving/Joining/Moving
--  Address     Load        Tokens  Owns (effective)  Host ID                               Rack
DN  172.19.0.8  ?           256     64.0%             bd75a5e2-2890-44c5-8f7a-fca1b4ce94ab  r1

Datacenter: dc1
===============
Status=Up/Down
|/ State=Normal/Leaving/Joining/Moving
--  Address     Load        Tokens  Owns (effective)  Host ID                               Rack
UN  172.19.0.5  382.75 KiB  256     64.4%             2a062140-2428-4092-b48b-7495d083d7f9  rack1
UN  172.19.0.9  171.41 KiB  256     71.6%             9590b791-ad53-4b5a-b4c7-b00408ed02dd  rack3

Prior to killing node1 and node3, node2 was indeed marked as DN, but it was listed in the same "Datacenter: dc1" section as node1 and node3. After killing both node1 and node3 (so the cluster was completely down) and restarting them, node2 was reported as shown above. I do not understand what makes the difference here. Is gossip data stored somewhere on disk? I would say so, because otherwise there is no way node1 / node3 could report node2 as down at all; at the same time I do not see why node2 ends up "out of the list" that node1 and node3 are in (see the quick check at the bottom of this mail).

On Fri, 15 Mar 2019 at 02:42, Fd Habash <fmhab...@gmail.com> wrote:

> I can conclusively say, none of these commands were run. However, I think
> this is the likely scenario …
>
> If you have a cluster of three nodes 1, 2, 3 …
>
>    - If 3 shows as DN
>    - Restart C* on 1 & 2
>    - Nodetool status should NOT show node 3 IP at all.
>
> Restarting the cluster while a node is down resets gossip state.
>
> There is a good chance this is what happened.
>
> Plausible?
>
> ----------------
> Thank you
>
> *From: *Jeff Jirsa <jji...@gmail.com>
> *Sent: *Thursday, March 14, 2019 11:06 AM
> *To: *cassandra <user@cassandra.apache.org>
> *Subject: *Re: Cannot replace_address /10.xx.xx.xx because it doesn't
> exist in gossip
>
> Two things that wouldn't be a bug:
>
> You could have run removenode
>
> You could have run assassinate
>
> Also could be some new bug, but that's much less likely.
>
> On Thu, Mar 14, 2019 at 2:50 PM Fd Habash <fmhab...@gmail.com> wrote:
>
> I have a node which I know for certain was a cluster member last week. It
> showed in nodetool status as DN. When I attempted to replace it today, I
> got this message:
>
> ERROR [main] 2019-03-14 14:40:49,208 CassandraDaemon.java:654 - Exception
> encountered during startup
> java.lang.RuntimeException: Cannot replace_address /10.xx.xx.xxx.xx
> because it doesn't exist in gossip
>         at org.apache.cassandra.service.StorageService.prepareReplacementInfo(StorageService.java:449)
> ~[apache-cassandra-2.2.8.jar:2.2.8]
>
> DN  10.xx.xx.xx  388.43 KB  256  6.9%  bdbd632a-bf5d-44d4-b220-f17f258c4701  1e
>
> Under what conditions does this happen?
>
> ----------------
> Thank you

Stefan Miklosovic
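
P.S. To partly answer my own question above: my understanding is that each node persists what it has learned about its peers in its local system tables and reloads that state on startup, which would explain why node1/node3 still know about node2 after a full restart. A minimal way to check that, assuming cqlsh access to node1 or node3 (the column names below are from the 2.x system keyspace, so treat this as a sketch rather than a definitive recipe):

    -- Peer state persisted locally on this node; if node2's row (172.19.0.8)
    -- is still here after the full restart, its DN entry in nodetool status is
    -- coming from this saved state rather than from live gossip.
    SELECT peer, data_center, rack, host_id, tokens FROM system.peers;

If node2's row is still present, removenode / assassinate (as Jeff mentioned earlier in the thread) are the usual ways to drop it.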