Hi Yabin/Alain,

I changed the replication strategy for system_distributed, system_auth and system_traces to NetworkTopologyStrategy and repaired the affected keyspaces. The rebuild process now starts up without errors.
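For reference, the statements were along these lines (a sketch only -- the per-DC replication factors follow our user keyspaces, so the exact commands may differ for other setups):

alter keyspace system_distributed with replication = {'class': 'NetworkTopologyStrategy', 'DC1': '3', 'DC2': '3', 'DC3': '3'};
alter keyspace system_auth with replication = {'class': 'NetworkTopologyStrategy', 'DC1': '3', 'DC2': '3', 'DC3': '3'};
alter keyspace system_traces with replication = {'class': 'NetworkTopologyStrategy', 'DC1': '3', 'DC2': '3', 'DC3': '3'};

followed by a repair of those keyspaces on each node, e.g.:

nodetool repair system_distributed
nodetool repair system_auth
nodetool repair system_traces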
Thanks a lot for your help!

Best regards,
Timo

On 22 September 2016 at 21:16, Yabin Meng <yabinm...@gmail.com> wrote:

It is a Cassandra bug. The workaround is to change the system_distributed keyspace replication strategy to something like the following:

alter keyspace system_distributed with replication = {'class': 'NetworkTopologyStrategy', 'DC1': '3', 'DC2': '3', 'DC3': '3'};

You may see a similar problem for other system keyspaces. Do the same thing for those.

Cheers,

Yabin

On Thu, Sep 22, 2016 at 1:44 PM, Timo Ahokas <timo.aho...@gmail.com> wrote:

Hi Alain,

Our normal user keyspaces have RF 3 in all DCs, e.g.:

create keyspace reporting with replication = {'class': 'NetworkTopologyStrategy', 'DC1': '3', 'DC2': '3', 'DC3': '3'};

Any idea whether it would be safe to change the system_distributed keyspace to match this?

-Timo

On 22 September 2016 at 19:23, Timo Ahokas <timo.aho...@gmail.com> wrote:

Hi Alain,

Thanks a lot for helping out!

Some of the basic keyspace / cluster info you requested:

# echo "DESCRIBE KEYSPACE system_distributed;" | cqlsh

CREATE KEYSPACE system_distributed WITH replication = {'class': 'SimpleStrategy', 'replication_factor': '3'} AND durable_writes = true;

CREATE TABLE system_distributed.repair_history (
    keyspace_name text,
    columnfamily_name text,
    id timeuuid,
    coordinator inet,
    exception_message text,
    exception_stacktrace text,
    finished_at timestamp,
    parent_id timeuuid,
    participants set<inet>,
    range_begin text,
    range_end text,
    started_at timestamp,
    status text,
    PRIMARY KEY ((keyspace_name, columnfamily_name), id)
) WITH CLUSTERING ORDER BY (id ASC)
    AND bloom_filter_fp_chance = 0.01
    AND caching = {'keys': 'ALL', 'rows_per_partition': 'NONE'}
    AND comment = 'Repair history'
    AND compaction = {'class': 'org.apache.cassandra.db.compaction.SizeTieredCompactionStrategy', 'max_threshold': '32', 'min_threshold': '4'}
    AND compression = {'chunk_length_in_kb': '64', 'class': 'org.apache.cassandra.io.compress.LZ4Compressor'}
    AND crc_check_chance = 1.0
    AND dclocal_read_repair_chance = 0.0
    AND default_time_to_live = 0
    AND gc_grace_seconds = 0
    AND max_index_interval = 2048
    AND memtable_flush_period_in_ms = 3600000
    AND min_index_interval = 128
    AND read_repair_chance = 0.0
    AND speculative_retry = '99PERCENTILE';

CREATE TABLE system_distributed.parent_repair_history (
    parent_id timeuuid PRIMARY KEY,
    columnfamily_names set<text>,
    exception_message text,
    exception_stacktrace text,
    finished_at timestamp,
    keyspace_name text,
    requested_ranges set<text>,
    started_at timestamp,
    successful_ranges set<text>
) WITH bloom_filter_fp_chance = 0.01
    AND caching = {'keys': 'ALL', 'rows_per_partition': 'NONE'}
    AND comment = 'Repair history'
    AND compaction = {'class': 'org.apache.cassandra.db.compaction.SizeTieredCompactionStrategy', 'max_threshold': '32', 'min_threshold': '4'}
    AND compression = {'chunk_length_in_kb': '64', 'class': 'org.apache.cassandra.io.compress.LZ4Compressor'}
    AND crc_check_chance = 1.0
    AND dclocal_read_repair_chance = 0.0
    AND default_time_to_live = 0
    AND gc_grace_seconds = 0
    AND max_index_interval = 2048
    AND memtable_flush_period_in_ms = 3600000
    AND min_index_interval = 128
    AND read_repair_chance = 0.0
    AND speculative_retry = '99PERCENTILE';
# nodetool status

Datacenter: DC1
===============
Status=Up/Down
|/ State=Normal/Leaving/Joining/Moving
--  Address          Load       Tokens  Owns  Host ID                               Rack
UN  xxx.xxx.145.5    693,63 GB  256     ?     6f1a0fdd-e3f9-474d-9a49-7bfeeadb3f56  RAC1
UN  xxx.xxx.145.225  648,55 GB  256     ?     f900847a-63e4-44c5-b4d7-e439c7cb6a8e  RAC1
UN  xxx.xxx.145.160  608,31 GB  256     ?     d257e76d-9e40-4215-94c7-3076c8ff4b7f  RAC1
UN  xxx.xxx.145.67   552,93 GB  256     ?     1d47cbdd-cdf1-45b6-aa0e-0c6123899dca  RAC1
UN  xxx.xxx.145.227  636,68 GB  256     ?     47e5f207-f9fd-4a86-be8a-66e7630d1baa  RAC1
UN  xxx.xxx.146.105  610,9 GB   256     ?     8edf1aaa-49d1-4e4b-9f09-99c4ab6136c2  RAC1
UN  xxx.xxx.147.136  666,82 GB  256     ?     bafbf6a2-cff9-489f-a2dd-fc6e8cb08ff6  RAC1
UN  xxx.xxx.146.213  609,79 GB  256     ?     6416275c-7570-48a9-957f-2daca71d31aa  RAC1
UN  xxx.xxx.146.20   664,44 GB  256     ?     b016df7e-f694-4ef3-928c-8783853e9a07  RAC1
UN  xxx.xxx.146.209  615,44 GB  256     ?     898e6d98-1b92-4e86-b52c-f851fd4fda71  RAC1
UN  xxx.xxx.146.241  668,91 GB  256     ?     0b5d4c6c-4b7c-4265-92bc-ad74464d85cc  RAC1
UN  xxx.xxx.147.211  641,33 GB  256     ?     16cdc4a7-b694-4125-91d6-05b9099cb765  RAC1
UN  xxx.xxx.147.125  647,03 GB  256     ?     2e97ed0a-039c-413b-9693-a87fadf40f82  RAC1

Datacenter: DC2
===============
Status=Up/Down
|/ State=Normal/Leaving/Joining/Moving
--  Address        Load      Tokens  Owns  Host ID                               Rack
UN  xxx.xxx.7.99   18,76 MB  256     ?     d7b907ad-15f5-4c79-962c-c604a5723a7b  RAC1
UN  xxx.xxx.6.135  16,04 MB  256     ?     463f480a-baf3-4230-86b7-1106251ebfad  RAC1
UN  xxx.xxx.7.229  17,36 MB  256     ?     9487a975-6183-43b8-9208-cd8e09a0ae18  RAC1
UN  xxx.xxx.7.5    14,01 MB  256     ?     ae039e49-4d79-4e4e-87bd-921cd6b3291a  RAC1
UN  xxx.xxx.7.4    14,93 MB  256     ?     122a47fb-b5ca-46d1-aae9-e6993ab58b66  RAC1
UN  xxx.xxx.6.10   16,77 MB  256     ?     bbb66068-bf06-438d-81ee-965e201e8fff  RAC1
UN  xxx.xxx.6.15   14,95 MB  256     ?     668a864d-9fd3-41b7-88fb-824e75e71953  RAC1
UN  xxx.xxx.7.140  17,38 MB  256     ?     7b016c96-eaa1-4ee1-8657-f4260c70ed37  RAC1
UN  xxx.xxx.7.113  19,14 MB  256     ?     46c06c44-ce2f-4ab6-9597-a1314cecf9bc  RAC1
UN  xxx.xxx.6.118  16,7 MB   256     ?     9c3c3107-a1d3-4254-ad10-909713a38f8c  RAC1
UN  xxx.xxx.6.248  17,29 MB  256     ?     35ff4d3d-d993-468b-9a54-88b40ceec6d4  RAC1
UN  xxx.xxx.5.24   16,55 MB  256     ?     5f1f34bd-110f-4d60-9af5-a3abd01b55a5  RAC1
UN  xxx.xxx.7.189  16,63 MB  256     ?     be7cbf84-5838-487a-8bd4-b340a1c70fab  RAC1
UN  xxx.xxx.5.124  20,37 MB  256     ?     638f2656-fb92-4b70-ba2a-251a749c4c58  RAC1
UN  xxx.xxx.6.60   24,57 MB  256     ?     cf16209a-a9a0-4f27-9341-c76d47e50261  RAC1

Datacenter: DC3
===============
Status=Up/Down
|/ State=Normal/Leaving/Joining/Moving
--  Address          Load       Tokens  Owns  Host ID                               Rack
UN  xxx.xxx.151.102  389,41 GB  256     ?     1740a473-e304-467c-a682-d1b4b0595ffa  RAC1
UN  xxx.xxx.149.161  367,82 GB  256     ?     3a5322d4-e49f-45ed-85b5-fd658502859c  RAC1
UN  xxx.xxx.149.226  390,88 GB  256     ?     b8ca4576-2632-4198-ac87-10243c0c554e  RAC1
UN  xxx.xxx.151.162  408,35 GB  256     ?     54d3dd90-f9ab-47c2-ae31-5f3e87b91e2a  RAC1
UN  xxx.xxx.149.109  369,33 GB  256     ?     9172c7d8-0c55-4e8e-a17b-89fdb0dce878  RAC1
UN  xxx.xxx.150.172  362,32 GB  256     ?     ba394a29-1a0c-4f50-ab85-4db19011b190  RAC1
UN  xxx.xxx.149.238  388,98 GB  256     ?     a3d7228c-ccb4-4787-a4bb-f7720aeedc8e  RAC1
UN  xxx.xxx.151.232  435,31 GB  256     ?     500a43ab-ae77-4a07-876c-171cb34c549b  RAC1
UN  xxx.xxx.151.43   410,69 GB  256     ?     b8bc80e2-2107-447a-85e4-57a39dc9c595  RAC1
UN  xxx.xxx.151.139  407,47 GB  256     ?     ecfa4ba7-7783-47a4-8b17-aadc91a3e776  RAC1
UN  xxx.xxx.151.213  375,05 GB  256     ?     9bf53ee1-53d4-4d18-a58e-0b0a17e18a69  RAC1
UN  xxx.xxx.149.177  401,91 GB  256     ?     b903faf1-1ae9-45ad-bdce-3c9377458a03  RAC1
UN  xxx.xxx.150.145  388,76 GB  256     ?     1c4e4232-db27-4cc1-9985-9eb7f0b984d1  RAC1
UN  xxx.xxx.149.48   385,43 GB  256     ?     ad3ea388-203c-4b26-a368-934a6105cc6e  RAC1
UN  xxx.xxx.150.189  384,52 GB  256     ?     f361ebad-b0a6-47b7-a55c-245c98f84508  RAC1
UN  xxx.xxx.151.220  357,56 GB  256     ?     feb814e6-6d2f-4cef-ae3b-4924c1cbac60  RAC1
UN  xxx.xxx.149.121  355,64 GB  256     ?     47fbb104-6a5a-49c0-b086-3f14c853c83b  RAC1
UN  xxx.xxx.151.218  416,57 GB  256     ?     bbb21d16-da85-4cfd-87d4-2333c8b02dad  RAC1
UN  xxx.xxx.150.26   383,06 GB  256     ?     1ca0085d-93a5-4650-891a-b45f988150a4  RAC1

Note: Non-system keyspaces don't have the same replication settings, effective ownership information is meaningless

# nodetool status system_distributed

Datacenter: DC1
===============
Status=Up/Down
|/ State=Normal/Leaving/Joining/Moving
--  Address          Load       Tokens  Owns (effective)  Host ID                               Rack
UN  xxx.xxx.145.5    693,63 GB  256     6,2%              6f1a0fdd-e3f9-474d-9a49-7bfeeadb3f56  RAC1
UN  xxx.xxx.145.225  648,55 GB  256     6,8%              f900847a-63e4-44c5-b4d7-e439c7cb6a8e  RAC1
UN  xxx.xxx.145.160  608,31 GB  256     6,5%              d257e76d-9e40-4215-94c7-3076c8ff4b7f  RAC1
UN  xxx.xxx.145.67   552,93 GB  256     6,1%              1d47cbdd-cdf1-45b6-aa0e-0c6123899dca  RAC1
UN  xxx.xxx.145.227  636,68 GB  256     6,0%              47e5f207-f9fd-4a86-be8a-66e7630d1baa  RAC1
UN  xxx.xxx.146.105  610,9 GB   256     6,1%              8edf1aaa-49d1-4e4b-9f09-99c4ab6136c2  RAC1
UN  xxx.xxx.147.136  666,82 GB  256     6,3%              bafbf6a2-cff9-489f-a2dd-fc6e8cb08ff6  RAC1
UN  xxx.xxx.146.213  609,79 GB  256     6,0%              6416275c-7570-48a9-957f-2daca71d31aa  RAC1
UN  xxx.xxx.146.20   664,44 GB  256     7,0%              b016df7e-f694-4ef3-928c-8783853e9a07  RAC1
UN  xxx.xxx.146.209  615,44 GB  256     6,6%              898e6d98-1b92-4e86-b52c-f851fd4fda71  RAC1
UN  xxx.xxx.146.241  668,91 GB  256     6,2%              0b5d4c6c-4b7c-4265-92bc-ad74464d85cc  RAC1
UN  xxx.xxx.147.211  641,33 GB  256     6,5%              16cdc4a7-b694-4125-91d6-05b9099cb765  RAC1
UN  xxx.xxx.147.125  647,03 GB  256     6,3%              2e97ed0a-039c-413b-9693-a87fadf40f82  RAC1

Datacenter: DC2
===============
Status=Up/Down
|/ State=Normal/Leaving/Joining/Moving
--  Address        Load      Tokens  Owns (effective)  Host ID                               Rack
UN  xxx.xxx.7.99   18,76 MB  256     6,3%              d7b907ad-15f5-4c79-962c-c604a5723a7b  RAC1
UN  xxx.xxx.6.135  16,04 MB  256     6,1%              463f480a-baf3-4230-86b7-1106251ebfad  RAC1
UN  xxx.xxx.7.229  17,36 MB  256     5,9%              9487a975-6183-43b8-9208-cd8e09a0ae18  RAC1
UN  xxx.xxx.7.5    14,01 MB  256     6,2%              ae039e49-4d79-4e4e-87bd-921cd6b3291a  RAC1
UN  xxx.xxx.7.4    14,93 MB  256     6,4%              122a47fb-b5ca-46d1-aae9-e6993ab58b66  RAC1
UN  xxx.xxx.6.10   16,77 MB  256     6,4%              bbb66068-bf06-438d-81ee-965e201e8fff  RAC1
UN  xxx.xxx.6.15   14,95 MB  256     6,1%              668a864d-9fd3-41b7-88fb-824e75e71953  RAC1
UN  xxx.xxx.7.140  17,38 MB  256     6,7%              7b016c96-eaa1-4ee1-8657-f4260c70ed37  RAC1
UN  xxx.xxx.7.113  19,14 MB  256     6,8%              46c06c44-ce2f-4ab6-9597-a1314cecf9bc  RAC1
UN  xxx.xxx.6.118  16,7 MB   256     6,7%              9c3c3107-a1d3-4254-ad10-909713a38f8c  RAC1
UN  xxx.xxx.6.248  17,29 MB  256     6,9%              35ff4d3d-d993-468b-9a54-88b40ceec6d4  RAC1
UN  xxx.xxx.5.24   16,55 MB  256     6,8%              5f1f34bd-110f-4d60-9af5-a3abd01b55a5  RAC1
UN  xxx.xxx.7.189  16,63 MB  256     6,2%              be7cbf84-5838-487a-8bd4-b340a1c70fab  RAC1
UN  xxx.xxx.5.124  20,37 MB  256     6,3%              638f2656-fb92-4b70-ba2a-251a749c4c58  RAC1
UN  xxx.xxx.6.60   24,57 MB  256     6,4%              cf16209a-a9a0-4f27-9341-c76d47e50261  RAC1

Datacenter: DC3
===============
Status=Up/Down
|/ State=Normal/Leaving/Joining/Moving
--  Address          Load       Tokens  Owns (effective)  Host ID                               Rack
UN  xxx.xxx.151.102  389,41 GB  256     6,4%              1740a473-e304-467c-a682-d1b4b0595ffa  RAC1
UN  xxx.xxx.149.161  367,82 GB  256     6,3%              3a5322d4-e49f-45ed-85b5-fd658502859c  RAC1
UN  xxx.xxx.149.226  390,88 GB  256     6,2%              b8ca4576-2632-4198-ac87-10243c0c554e  RAC1
UN  xxx.xxx.151.162  408,35 GB  256     6,4%              54d3dd90-f9ab-47c2-ae31-5f3e87b91e2a  RAC1
UN  xxx.xxx.149.109  369,33 GB  256     6,2%              9172c7d8-0c55-4e8e-a17b-89fdb0dce878  RAC1
UN  xxx.xxx.150.172  362,32 GB  256     6,0%              ba394a29-1a0c-4f50-ab85-4db19011b190  RAC1
UN  xxx.xxx.149.238  388,98 GB  256     6,4%              a3d7228c-ccb4-4787-a4bb-f7720aeedc8e  RAC1
UN  xxx.xxx.151.232  435,31 GB  256     6,6%              500a43ab-ae77-4a07-876c-171cb34c549b  RAC1
UN  xxx.xxx.151.43   410,69 GB  256     6,2%              b8bc80e2-2107-447a-85e4-57a39dc9c595  RAC1
UN  xxx.xxx.151.139  407,47 GB  256     6,2%              ecfa4ba7-7783-47a4-8b17-aadc91a3e776  RAC1
UN  xxx.xxx.151.213  375,05 GB  256     6,5%              9bf53ee1-53d4-4d18-a58e-0b0a17e18a69  RAC1
UN  xxx.xxx.149.177  401,91 GB  256     6,6%              b903faf1-1ae9-45ad-bdce-3c9377458a03  RAC1
UN  xxx.xxx.150.145  388,76 GB  256     7,1%              1c4e4232-db27-4cc1-9985-9eb7f0b984d1  RAC1
UN  xxx.xxx.149.48   385,43 GB  256     6,2%              ad3ea388-203c-4b26-a368-934a6105cc6e  RAC1
UN  xxx.xxx.150.189  384,52 GB  256     6,4%              f361ebad-b0a6-47b7-a55c-245c98f84508  RAC1
UN  xxx.xxx.151.220  357,56 GB  256     6,1%              feb814e6-6d2f-4cef-ae3b-4924c1cbac60  RAC1
UN  xxx.xxx.149.121  355,64 GB  256     6,4%              47fbb104-6a5a-49c0-b086-3f14c853c83b  RAC1
UN  xxx.xxx.151.218  416,57 GB  256     6,3%              bbb21d16-da85-4cfd-87d4-2333c8b02dad  RAC1
UN  xxx.xxx.150.26   383,06 GB  256     6,7%              1ca0085d-93a5-4650-891a-b45f988150a4  RAC1

DC1 and DC3 are the old data centers. DC2 is the new one being added (as can be seen from the data loads).

For the snitch we are using GossipingPropertyFileSnitch and a cassandra-rackdc.properties with config such as:

dc=DC1
rack=RAC1

Just noticed that we also have cassandra-topology.properties present on the nodes, but it's up to date with all the nodes from the 3 data centers.

I was wondering whether the replication settings for the system_distributed keyspace might need a change, but didn't find any documentation pointing to that yet.

Best regards,
Timo

On 22 September 2016 at 18:00, Alain RODRIGUEZ <arodr...@gmail.com> wrote:

It could be a bug. Yet I am not very aware of this system_distributed keyspace, but from what I see, it is using a simple strategy:

root@tlp-cassandra-2:~# echo "DESCRIBE KEYSPACE system_distributed;" | cqlsh $(hostname -I | awk '{print $1}')

CREATE KEYSPACE system_distributed WITH replication = {'class': 'SimpleStrategy', 'replication_factor': '3'} AND durable_writes = true;

Let's first check some stuff. Could you share the output of:

- echo "DESCRIBE KEYSPACE system_distributed;" | cqlsh [ip_address_of_the_server]
- nodetool status
- nodetool status system_distributed
- Let us know about the snitch you are using and the corresponding configuration.

I am trying to make sure the command you used is expected to work, given your setup.

My guess is that you might need to alter this keyspace according to your cluster setup.

Just guessing, hope that helps.
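For instance, if your user keyspaces already use NetworkTopologyStrategy across your data centers, something along these lines is probably what is needed (just a sketch -- the DC names and replication factors below are placeholders to adapt to your cluster):

ALTER KEYSPACE system_distributed WITH replication = {'class': 'NetworkTopologyStrategy', '<dc1_name>': '3', '<dc2_name>': '3', '<dc3_name>': '3'};

followed by a repair of that keyspace so the existing data is redistributed to the new replicas.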
C*heers,
-----------------------
Alain Rodriguez - @arodream - al...@thelastpickle.com
France

The Last Pickle - Apache Cassandra Consulting
http://www.thelastpickle.com

2016-09-22 15:47 GMT+02:00 Timo Ahokas <timo.aho...@gmail.com>:

Hi,

We have a Cassandra 3.0.8 cluster (recently upgraded from 2.1.15) currently running in two data centers (13 and 19 nodes, RF 3 in both). We are adding a third data center before decommissioning one of the earlier ones. Installing Cassandra (3.0.8) goes fine and all the nodes join the cluster (not set to bootstrap, as documented in https://docs.datastax.com/en/cassandra/3.0/cassandra/operations/opsAddDCToCluster.html).

When trying to rebuild nodes in the new DC from a previous DC (nodetool rebuild -- DC1), we get the following error:

Unable to find sufficient sources for streaming range (597769692463489739,597931451954862346] in keyspace system_distributed

The same error occurs whichever of the two existing DCs we try to rebuild from.

We run primary-range repairs (nodetool repair -pr) on all nodes twice a week via cron.

Any advice on how to get the rebuild started?

Best regards,
Timo
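For reference, each new DC2 node was brought up roughly as follows before the rebuild attempt (paraphrased from the DataStax procedure linked above, not the exact configuration):

# cassandra.yaml on the new DC2 nodes: join the ring without streaming data
auto_bootstrap: false

# once all DC2 nodes are up and in the ring, on each DC2 node:
nodetool rebuild -- DC1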