That sounds like something caused by duplicated node IDs (the Host ID column in `nodetool status`). Did you by any chance copy the Cassandra data directory between nodes (e.g. by spinning up a new node from a VM snapshot that contains a non-empty data directory)?
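
One way to check for this is to collect the host ID that each node reports about itself and look for repeats. The sketch below assumes you have already gathered an address-to-host-ID mapping by hand, for example from the ID line of `nodetool info` run locally on each machine. The function name `find_duplicate_host_ids` and the address `10.1.146.250` are made up for illustration; the two UUIDs are taken from the output quoted below.

```python
from collections import defaultdict


def find_duplicate_host_ids(host_id_by_address):
    """Return {host_id: [addresses]} for every host ID claimed by more than one address."""
    addresses_by_host_id = defaultdict(list)
    for address, host_id in host_id_by_address.items():
        # Normalise so that stray whitespace or letter case does not hide a match.
        addresses_by_host_id[host_id.strip().lower()].append(address)
    return {
        host_id: sorted(addresses)
        for host_id, addresses in addresses_by_host_id.items()
        if len(addresses) > 1
    }


# Hypothetical input: 10.1.146.250 is an invented address standing in for a
# node that was cloned from 10.1.146.197 and so carries the same host ID.
collected = {
    "10.1.146.197": "26d5a89c-aa8f-4249-b2b5-82341cc214bc",
    "10.1.146.186": "29f20519-51f9-493c-b891-930762d82231",
    "10.1.146.250": "26d5a89c-aa8f-4249-b2b5-82341cc214bc",
}

duplicates = find_duplicate_host_ids(collected)
for host_id, addresses in duplicates.items():
    print(host_id, "is claimed by", ", ".join(addresses))
```

If the result is non-empty, the listed addresses are the ones worth inspecting first.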

On 03/06/2022 12:38, Marc Hoppins wrote:
Hi all,

Am new to Cassandra.  Just finished installing on 22 nodes across 2 datacentres.

If I run nodetool describecluster  I get

Stats for all nodes:
         Live: 22
         Joining: 0
         Moving: 0
         Leaving: 0
         Unreachable: 0

Data Centers:
         BA #Nodes: 9 #Down: 0
         DR1 #Nodes: 8 #Down: 0

There should be 12 in BA and 10 in DR1.  The service is running on these other 
nodes...yet nodetool status also only shows the above numbers.

Datacenter: BA
==============
Status=Up/Down
|/ State=Normal/Leaving/Joining/Moving
--  Address       Load        Tokens  Owns (effective)  Host ID                               Rack
UN  10.1.146.197  304.72 KiB  16      11.4%             26d5a89c-aa8f-4249-b2b5-82341cc214bc  SSW09
UN  10.1.146.186  245.02 KiB  16      9.0%              29f20519-51f9-493c-b891-930762d82231  SSW09
UN  10.1.146.20   129.53 KiB  16      12.5%             f90dd318-1357-46ca-9870-807d988658b3  SSW09
UN  10.1.146.200  150.31 KiB  16      11.1%             c544e85a-c2c5-4afd-aca8-1854a1723c2f  SSW09
UN  10.1.146.17   185.9 KiB   16      11.7%             db9d9856-3082-44a8-b292-156da1a17d0a  SSW09
UN  10.1.146.174  288.64 KiB  16      12.1%             03126eba-8b58-4a96-80ca-10cec2e18e69  SSW09
UN  10.1.146.199  146.71 KiB  16      13.7%             860d6549-94ab-4a07-b665-70ea7e53f41a  SSW09
UN  10.1.146.78   69.05 KiB   16      11.5%             7d9fdbab-40b0-4a9e-b0c9-4ffa822c42fd  SSW09
UN  10.1.146.67   304.5 KiB   16      13.6%             48e9eba2-9112-4d91-8f26-8272cb5ce7bc  SSW09

Datacenter: DR1
===============
Status=Up/Down
|/ State=Normal/Leaving/Joining/Moving
--  Address       Load        Tokens  Owns (effective)  Host ID                               Rack
UN  10.1.146.137  209.33 KiB  16      12.6%             f65c685f-048c-41de-85e4-308c4b84d047  SSW02
UN  10.1.146.141  237.21 KiB  16      9.8%              847ad921-fceb-4cef-acec-1c918d2a6517  SSW02
UN  10.1.146.131  311.05 KiB  16      11.7%             7263f6c6-c4d6-438e-8ee7-d07666242ba0  SSW02
UN  10.1.146.139  283.33 KiB  16      11.5%             264cbe47-acb4-49cc-97d0-6f9e2cee6844  SSW02
UN  10.1.146.140  258.46 KiB  16      11.6%             43dbbe91-5dac-4c3a-9df5-2f5ccf268eb6  SSW02
UN  10.1.146.132  157.03 KiB  16      12.3%             1c0cb23c-af78-4fa2-bd92-20fa7d39ec30  SSW02
UN  10.1.146.135  301.13 KiB  16      11.2%             26159fbe-cf78-4c94-88e0-54773bcf7bed  SSW02
UN  10.1.146.130  305.16 KiB  16      12.5%             d6d6c490-551d-4a97-a93c-3b772b750d7d  SSW02

So I restarted the service on one of the missing addresses. It appeared in the 
list but one other dropped off.  I tried this several times.  It seems I can 
only get 9 and 8 not 12 and 10.
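
To track this across restarts, the per-datacentre counts can be pulled out of saved `nodetool status` output and compared with what is expected. This is only a sketch: the helper names `count_nodes_per_dc` and `shortfall` are invented, the sample text is abbreviated from the output above, and the expected counts used with it are hypothetical values chosen to fit the shortened sample (the real targets would be 12 for BA and 10 for DR1).

```python
import re

# A node row starts with a two-letter status/state code (e.g. UN, DN, UJ)
# followed by the node address.
NODE_ROW = re.compile(r"^[UD][NLJM]\s+\S+\s")


def count_nodes_per_dc(status_text):
    """Count node rows under each 'Datacenter:' heading of nodetool status output."""
    counts = {}
    current_dc = None
    for line in status_text.splitlines():
        if line.startswith("Datacenter:"):
            current_dc = line.split(":", 1)[1].strip()
            counts[current_dc] = 0
        elif current_dc is not None and NODE_ROW.match(line):
            counts[current_dc] += 1
    return counts


def shortfall(counts, expected):
    """Return {datacentre: missing node count} wherever the count differs from expected."""
    return {
        dc: expected[dc] - counts.get(dc, 0)
        for dc in expected
        if counts.get(dc, 0) != expected[dc]
    }


# Abbreviated sample: only a few rows of the real output are reproduced here.
sample = "\n".join([
    "Datacenter: BA",
    "==============",
    "--  Address       Load        Tokens  Owns (effective)  Host ID  Rack",
    "UN  10.1.146.197  304.72 KiB  16      11.4%  26d5a89c-aa8f-4249-b2b5-82341cc214bc  SSW09",
    "UN  10.1.146.186  245.02 KiB  16      9.0%   29f20519-51f9-493c-b891-930762d82231  SSW09",
    "",
    "Datacenter: DR1",
    "===============",
    "UN  10.1.146.137  209.33 KiB  16      12.6%  f65c685f-048c-41de-85e4-308c4b84d047  SSW02",
])

counts = count_nodes_per_dc(sample)
# Hypothetical expectations sized for the abbreviated sample above.
missing = shortfall(counts, {"BA": 3, "DR1": 1})
print(counts, missing)
```

Running this after each restart and keeping the list of addresses seen would show which node drops out when another one joins.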

Anyone have an idea why this may be so?

Thanks

Marc
