Hi Yabin/Alain,

I changed the replication strategy for system_distributed, system_auth and system_traces to NetworkTopologyStrategy and repaired the affected keyspaces. The rebuild process now starts up without errors.
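For reference, the statements were along these lines (a sketch only -- the per-DC replication factors follow our user keyspaces, so the exact commands may differ for other setups):

alter keyspace system_distributed with replication = {'class': 'NetworkTopologyStrategy', 'DC1': '3', 'DC2': '3', 'DC3': '3'};
alter keyspace system_auth with replication = {'class': 'NetworkTopologyStrategy', 'DC1': '3', 'DC2': '3', 'DC3': '3'};
alter keyspace system_traces with replication = {'class': 'NetworkTopologyStrategy', 'DC1': '3', 'DC2': '3', 'DC3': '3'};

followed by a repair of those keyspaces on each node, e.g.:

nodetool repair system_distributed
nodetool repair system_auth
nodetool repair system_traces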
Thanks a lot for your help!

Best regards,
Timo

On 22 September 2016 at 21:16, Yabin Meng <yabinm...@gmail.com> wrote:

It is a Cassandra bug. The workaround is to change the system_distributed keyspace replication strategy to something like the following:

alter keyspace system_distributed with replication = {'class': 'NetworkTopologyStrategy', 'DC1': '3', 'DC2': '3', 'DC3': '3'};

You may see a similar problem for other system keyspaces. Do the same thing for those.

Cheers,

Yabin

On Thu, Sep 22, 2016 at 1:44 PM, Timo Ahokas <timo.aho...@gmail.com> wrote:

Hi Alain,

Our normal user keyspaces have RF 3 in all DCs, e.g.:

create keyspace reporting with replication = {'class': 'NetworkTopologyStrategy', 'DC1': '3', 'DC2': '3', 'DC3': '3'};

Any idea whether it would be safe to change the system_distributed keyspace to match this?

-Timo

On 22 September 2016 at 19:23, Timo Ahokas <timo.aho...@gmail.com> wrote:

Hi Alain,

Thanks a lot for helping out!

Some of the basic keyspace / cluster info you requested:

# echo "DESCRIBE KEYSPACE system_distributed;" | cqlsh

CREATE KEYSPACE system_distributed WITH replication = {'class': 'SimpleStrategy', 'replication_factor': '3'} AND durable_writes = true;

CREATE TABLE system_distributed.repair_history (
    keyspace_name text,
    columnfamily_name text,
    id timeuuid,
    coordinator inet,
    exception_message text,
    exception_stacktrace text,
    finished_at timestamp,
    parent_id timeuuid,
    participants set<inet>,
    range_begin text,
    range_end text,
    started_at timestamp,
    status text,
    PRIMARY KEY ((keyspace_name, columnfamily_name), id)
) WITH CLUSTERING ORDER BY (id ASC)
    AND bloom_filter_fp_chance = 0.01
    AND caching = {'keys': 'ALL', 'rows_per_partition': 'NONE'}
    AND comment = 'Repair history'
    AND compaction = {'class': 'org.apache.cassandra.db.compaction.SizeTieredCompactionStrategy', 'max_threshold': '32', 'min_threshold': '4'}
    AND compression = {'chunk_length_in_kb': '64', 'class': 'org.apache.cassandra.io.compress.LZ4Compressor'}
    AND crc_check_chance = 1.0
    AND dclocal_read_repair_chance = 0.0
    AND default_time_to_live = 0
    AND gc_grace_seconds = 0
    AND max_index_interval = 2048
    AND memtable_flush_period_in_ms = 3600000
    AND min_index_interval = 128
    AND read_repair_chance = 0.0
    AND speculative_retry = '99PERCENTILE';

CREATE TABLE system_distributed.parent_repair_history (
    parent_id timeuuid PRIMARY KEY,
    columnfamily_names set<text>,
    exception_message text,
    exception_stacktrace text,
    finished_at timestamp,
    keyspace_name text,
    requested_ranges set<text>,
    started_at timestamp,
    successful_ranges set<text>
) WITH bloom_filter_fp_chance = 0.01
    AND caching = {'keys': 'ALL', 'rows_per_partition': 'NONE'}
    AND comment = 'Repair history'
    AND compaction = {'class': 'org.apache.cassandra.db.compaction.SizeTieredCompactionStrategy', 'max_threshold': '32', 'min_threshold': '4'}
    AND compression = {'chunk_length_in_kb': '64', 'class': 'org.apache.cassandra.io.compress.LZ4Compressor'}
    AND crc_check_chance = 1.0
    AND dclocal_read_repair_chance = 0.0
    AND default_time_to_live = 0
    AND gc_grace_seconds = 0
    AND max_index_interval = 2048
    AND memtable_flush_period_in_ms = 3600000
    AND min_index_interval = 128
    AND read_repair_chance = 0.0
    AND speculative_retry = '99PERCENTILE';
# nodetool status

Datacenter: DC1
===============
Status=Up/Down
|/ State=Normal/Leaving/Joining/Moving
--  Address          Load       Tokens  Owns  Host ID                               Rack
UN  xxx.xxx.145.5    693,63 GB  256     ?     6f1a0fdd-e3f9-474d-9a49-7bfeeadb3f56  RAC1
UN  xxx.xxx.145.225  648,55 GB  256     ?     f900847a-63e4-44c5-b4d7-e439c7cb6a8e  RAC1
UN  xxx.xxx.145.160  608,31 GB  256     ?     d257e76d-9e40-4215-94c7-3076c8ff4b7f  RAC1
UN  xxx.xxx.145.67   552,93 GB  256     ?     1d47cbdd-cdf1-45b6-aa0e-0c6123899dca  RAC1
UN  xxx.xxx.145.227  636,68 GB  256     ?     47e5f207-f9fd-4a86-be8a-66e7630d1baa  RAC1
UN  xxx.xxx.146.105  610,9 GB   256     ?     8edf1aaa-49d1-4e4b-9f09-99c4ab6136c2  RAC1
UN  xxx.xxx.147.136  666,82 GB  256     ?     bafbf6a2-cff9-489f-a2dd-fc6e8cb08ff6  RAC1
UN  xxx.xxx.146.213  609,79 GB  256     ?     6416275c-7570-48a9-957f-2daca71d31aa  RAC1
UN  xxx.xxx.146.20   664,44 GB  256     ?     b016df7e-f694-4ef3-928c-8783853e9a07  RAC1
UN  xxx.xxx.146.209  615,44 GB  256     ?     898e6d98-1b92-4e86-b52c-f851fd4fda71  RAC1
UN  xxx.xxx.146.241  668,91 GB  256     ?     0b5d4c6c-4b7c-4265-92bc-ad74464d85cc  RAC1
UN  xxx.xxx.147.211  641,33 GB  256     ?     16cdc4a7-b694-4125-91d6-05b9099cb765  RAC1
UN  xxx.xxx.147.125  647,03 GB  256     ?     2e97ed0a-039c-413b-9693-a87fadf40f82  RAC1

Datacenter: DC2
===============
Status=Up/Down
|/ State=Normal/Leaving/Joining/Moving
--  Address        Load      Tokens  Owns  Host ID                               Rack
UN  xxx.xxx.7.99   18,76 MB  256     ?     d7b907ad-15f5-4c79-962c-c604a5723a7b  RAC1
UN  xxx.xxx.6.135  16,04 MB  256     ?     463f480a-baf3-4230-86b7-1106251ebfad  RAC1
UN  xxx.xxx.7.229  17,36 MB  256     ?     9487a975-6183-43b8-9208-cd8e09a0ae18  RAC1
UN  xxx.xxx.7.5    14,01 MB  256     ?     ae039e49-4d79-4e4e-87bd-921cd6b3291a  RAC1
UN  xxx.xxx.7.4    14,93 MB  256     ?     122a47fb-b5ca-46d1-aae9-e6993ab58b66  RAC1
UN  xxx.xxx.6.10   16,77 MB  256     ?     bbb66068-bf06-438d-81ee-965e201e8fff  RAC1
UN  xxx.xxx.6.15   14,95 MB  256     ?     668a864d-9fd3-41b7-88fb-824e75e71953  RAC1
UN  xxx.xxx.7.140  17,38 MB  256     ?     7b016c96-eaa1-4ee1-8657-f4260c70ed37  RAC1
UN  xxx.xxx.7.113  19,14 MB  256     ?     46c06c44-ce2f-4ab6-9597-a1314cecf9bc  RAC1
UN  xxx.xxx.6.118  16,7 MB   256     ?     9c3c3107-a1d3-4254-ad10-909713a38f8c  RAC1
UN  xxx.xxx.6.248  17,29 MB  256     ?     35ff4d3d-d993-468b-9a54-88b40ceec6d4  RAC1
UN  xxx.xxx.5.24   16,55 MB  256     ?     5f1f34bd-110f-4d60-9af5-a3abd01b55a5  RAC1
UN  xxx.xxx.7.189  16,63 MB  256     ?     be7cbf84-5838-487a-8bd4-b340a1c70fab  RAC1
UN  xxx.xxx.5.124  20,37 MB  256     ?     638f2656-fb92-4b70-ba2a-251a749c4c58  RAC1
UN  xxx.xxx.6.60   24,57 MB  256     ?     cf16209a-a9a0-4f27-9341-c76d47e50261  RAC1

Datacenter: DC3
===============
Status=Up/Down
|/ State=Normal/Leaving/Joining/Moving
--  Address          Load       Tokens  Owns  Host ID                               Rack
UN  xxx.xxx.151.102  389,41 GB  256     ?     1740a473-e304-467c-a682-d1b4b0595ffa  RAC1
UN  xxx.xxx.149.161  367,82 GB  256     ?     3a5322d4-e49f-45ed-85b5-fd658502859c  RAC1
UN  xxx.xxx.149.226  390,88 GB  256     ?     b8ca4576-2632-4198-ac87-10243c0c554e  RAC1
UN  xxx.xxx.151.162  408,35 GB  256     ?     54d3dd90-f9ab-47c2-ae31-5f3e87b91e2a  RAC1
UN  xxx.xxx.149.109  369,33 GB  256     ?     9172c7d8-0c55-4e8e-a17b-89fdb0dce878  RAC1
UN  xxx.xxx.150.172  362,32 GB  256     ?     ba394a29-1a0c-4f50-ab85-4db19011b190  RAC1
UN  xxx.xxx.149.238  388,98 GB  256     ?     a3d7228c-ccb4-4787-a4bb-f7720aeedc8e  RAC1
UN  xxx.xxx.151.232  435,31 GB  256     ?     500a43ab-ae77-4a07-876c-171cb34c549b  RAC1
UN  xxx.xxx.151.43   410,69 GB  256     ?     b8bc80e2-2107-447a-85e4-57a39dc9c595  RAC1
UN  xxx.xxx.151.139  407,47 GB  256     ?     ecfa4ba7-7783-47a4-8b17-aadc91a3e776  RAC1
UN  xxx.xxx.151.213  375,05 GB  256     ?     9bf53ee1-53d4-4d18-a58e-0b0a17e18a69  RAC1
UN  xxx.xxx.149.177  401,91 GB  256     ?     b903faf1-1ae9-45ad-bdce-3c9377458a03  RAC1
UN  xxx.xxx.150.145  388,76 GB  256     ?     1c4e4232-db27-4cc1-9985-9eb7f0b984d1  RAC1
UN  xxx.xxx.149.48   385,43 GB  256     ?     ad3ea388-203c-4b26-a368-934a6105cc6e  RAC1
UN  xxx.xxx.150.189  384,52 GB  256     ?     f361ebad-b0a6-47b7-a55c-245c98f84508  RAC1
UN  xxx.xxx.151.220  357,56 GB  256     ?     feb814e6-6d2f-4cef-ae3b-4924c1cbac60  RAC1
UN  xxx.xxx.149.121  355,64 GB  256     ?     47fbb104-6a5a-49c0-b086-3f14c853c83b  RAC1
UN  xxx.xxx.151.218  416,57 GB  256     ?     bbb21d16-da85-4cfd-87d4-2333c8b02dad  RAC1
UN  xxx.xxx.150.26   383,06 GB  256     ?     1ca0085d-93a5-4650-891a-b45f988150a4  RAC1

Note: Non-system keyspaces don't have the same replication settings, effective ownership information is meaningless

# nodetool status system_distributed

Datacenter: DC1
===============
Status=Up/Down
|/ State=Normal/Leaving/Joining/Moving
--  Address          Load       Tokens  Owns (effective)  Host ID                               Rack
UN  xxx.xxx.145.5    693,63 GB  256     6,2%              6f1a0fdd-e3f9-474d-9a49-7bfeeadb3f56  RAC1
UN  xxx.xxx.145.225  648,55 GB  256     6,8%              f900847a-63e4-44c5-b4d7-e439c7cb6a8e  RAC1
UN  xxx.xxx.145.160  608,31 GB  256     6,5%              d257e76d-9e40-4215-94c7-3076c8ff4b7f  RAC1
UN  xxx.xxx.145.67   552,93 GB  256     6,1%              1d47cbdd-cdf1-45b6-aa0e-0c6123899dca  RAC1
UN  xxx.xxx.145.227  636,68 GB  256     6,0%              47e5f207-f9fd-4a86-be8a-66e7630d1baa  RAC1
UN  xxx.xxx.146.105  610,9 GB   256     6,1%              8edf1aaa-49d1-4e4b-9f09-99c4ab6136c2  RAC1
UN  xxx.xxx.147.136  666,82 GB  256     6,3%              bafbf6a2-cff9-489f-a2dd-fc6e8cb08ff6  RAC1
UN  xxx.xxx.146.213  609,79 GB  256     6,0%              6416275c-7570-48a9-957f-2daca71d31aa  RAC1
UN  xxx.xxx.146.20   664,44 GB  256     7,0%              b016df7e-f694-4ef3-928c-8783853e9a07  RAC1
UN  xxx.xxx.146.209  615,44 GB  256     6,6%              898e6d98-1b92-4e86-b52c-f851fd4fda71  RAC1
UN  xxx.xxx.146.241  668,91 GB  256     6,2%              0b5d4c6c-4b7c-4265-92bc-ad74464d85cc  RAC1
UN  xxx.xxx.147.211  641,33 GB  256     6,5%              16cdc4a7-b694-4125-91d6-05b9099cb765  RAC1
UN  xxx.xxx.147.125  647,03 GB  256     6,3%              2e97ed0a-039c-413b-9693-a87fadf40f82  RAC1

Datacenter: DC2
===============
Status=Up/Down
|/ State=Normal/Leaving/Joining/Moving
--  Address        Load      Tokens  Owns (effective)  Host ID                               Rack
UN  xxx.xxx.7.99   18,76 MB  256     6,3%              d7b907ad-15f5-4c79-962c-c604a5723a7b  RAC1
UN  xxx.xxx.6.135  16,04 MB  256     6,1%              463f480a-baf3-4230-86b7-1106251ebfad  RAC1
UN  xxx.xxx.7.229  17,36 MB  256     5,9%              9487a975-6183-43b8-9208-cd8e09a0ae18  RAC1
UN  xxx.xxx.7.5    14,01 MB  256     6,2%              ae039e49-4d79-4e4e-87bd-921cd6b3291a  RAC1
UN  xxx.xxx.7.4    14,93 MB  256     6,4%              122a47fb-b5ca-46d1-aae9-e6993ab58b66  RAC1
UN  xxx.xxx.6.10   16,77 MB  256     6,4%              bbb66068-bf06-438d-81ee-965e201e8fff  RAC1
UN  xxx.xxx.6.15   14,95 MB  256     6,1%              668a864d-9fd3-41b7-88fb-824e75e71953  RAC1
UN  xxx.xxx.7.140  17,38 MB  256     6,7%              7b016c96-eaa1-4ee1-8657-f4260c70ed37  RAC1
UN  xxx.xxx.7.113  19,14 MB  256     6,8%              46c06c44-ce2f-4ab6-9597-a1314cecf9bc  RAC1
UN  xxx.xxx.6.118  16,7 MB   256     6,7%              9c3c3107-a1d3-4254-ad10-909713a38f8c  RAC1
UN  xxx.xxx.6.248  17,29 MB  256     6,9%              35ff4d3d-d993-468b-9a54-88b40ceec6d4  RAC1
UN  xxx.xxx.5.24   16,55 MB  256     6,8%              5f1f34bd-110f-4d60-9af5-a3abd01b55a5  RAC1
UN  xxx.xxx.7.189  16,63 MB  256     6,2%              be7cbf84-5838-487a-8bd4-b340a1c70fab  RAC1
UN  xxx.xxx.5.124  20,37 MB  256     6,3%              638f2656-fb92-4b70-ba2a-251a749c4c58  RAC1
UN  xxx.xxx.6.60   24,57 MB  256     6,4%              cf16209a-a9a0-4f27-9341-c76d47e50261  RAC1

Datacenter: DC3
===============
Status=Up/Down
|/ State=Normal/Leaving/Joining/Moving
--  Address          Load       Tokens  Owns (effective)  Host ID                               Rack
UN  xxx.xxx.151.102  389,41 GB  256     6,4%              1740a473-e304-467c-a682-d1b4b0595ffa  RAC1
UN  xxx.xxx.149.161  367,82 GB  256     6,3%              3a5322d4-e49f-45ed-85b5-fd658502859c  RAC1
UN  xxx.xxx.149.226  390,88 GB  256     6,2%              b8ca4576-2632-4198-ac87-10243c0c554e  RAC1
UN  xxx.xxx.151.162  408,35 GB  256     6,4%              54d3dd90-f9ab-47c2-ae31-5f3e87b91e2a  RAC1
UN  xxx.xxx.149.109  369,33 GB  256     6,2%              9172c7d8-0c55-4e8e-a17b-89fdb0dce878  RAC1
UN  xxx.xxx.150.172  362,32 GB  256     6,0%              ba394a29-1a0c-4f50-ab85-4db19011b190  RAC1
UN  xxx.xxx.149.238  388,98 GB  256     6,4%              a3d7228c-ccb4-4787-a4bb-f7720aeedc8e  RAC1
UN  xxx.xxx.151.232  435,31 GB  256     6,6%              500a43ab-ae77-4a07-876c-171cb34c549b  RAC1
UN  xxx.xxx.151.43   410,69 GB  256     6,2%              b8bc80e2-2107-447a-85e4-57a39dc9c595  RAC1
UN  xxx.xxx.151.139  407,47 GB  256     6,2%              ecfa4ba7-7783-47a4-8b17-aadc91a3e776  RAC1
UN  xxx.xxx.151.213  375,05 GB  256     6,5%              9bf53ee1-53d4-4d18-a58e-0b0a17e18a69  RAC1
UN  xxx.xxx.149.177  401,91 GB  256     6,6%              b903faf1-1ae9-45ad-bdce-3c9377458a03  RAC1
UN  xxx.xxx.150.145  388,76 GB  256     7,1%              1c4e4232-db27-4cc1-9985-9eb7f0b984d1  RAC1
UN  xxx.xxx.149.48   385,43 GB  256     6,2%              ad3ea388-203c-4b26-a368-934a6105cc6e  RAC1
UN  xxx.xxx.150.189  384,52 GB  256     6,4%              f361ebad-b0a6-47b7-a55c-245c98f84508  RAC1
UN  xxx.xxx.151.220  357,56 GB  256     6,1%              feb814e6-6d2f-4cef-ae3b-4924c1cbac60  RAC1
UN  xxx.xxx.149.121  355,64 GB  256     6,4%              47fbb104-6a5a-49c0-b086-3f14c853c83b  RAC1
UN  xxx.xxx.151.218  416,57 GB  256     6,3%              bbb21d16-da85-4cfd-87d4-2333c8b02dad  RAC1
UN  xxx.xxx.150.26   383,06 GB  256     6,7%              1ca0085d-93a5-4650-891a-b45f988150a4  RAC1

DC1 and DC3 are the old data centers. DC2 is the new one being added (as can be seen from the data loads).

For the snitch we are using GossipingPropertyFileSnitch and a cassandra-rackdc.properties with config such as:

dc=DC1
rack=RAC1

Just noticed that we also have cassandra-topology.properties present on the nodes, but it's up to date with all the nodes from the 3 data centers.

I was wondering whether the replication settings for the system_distributed keyspace might need a change, but didn't find any documentation pointing to that yet.

Best regards,
Timo

On 22 September 2016 at 18:00, Alain RODRIGUEZ <arodr...@gmail.com> wrote:

It could be a bug. Yet I am not very aware of this system_distributed keyspace, but from what I see, it is using a simple strategy:

root@tlp-cassandra-2:~# echo "DESCRIBE KEYSPACE system_distributed;" | cqlsh $(hostname -I | awk '{print $1}')

CREATE KEYSPACE system_distributed WITH replication = {'class': 'SimpleStrategy', 'replication_factor': '3'} AND durable_writes = true;

Let's first check some stuff. Could you share the output of:

- echo "DESCRIBE KEYSPACE system_distributed;" | cqlsh [ip_address_of_the_server]
- nodetool status
- nodetool status system_distributed
- Let us know about the snitch you are using and the corresponding configuration.

I am trying to make sure the command you used is expected to work, given your setup.

My guess is that you might need to alter this keyspace according to your cluster setup.

Just guessing, hope that helps.
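For instance, if your user keyspaces already use NetworkTopologyStrategy across your data centers, something along these lines is probably what is needed (just a sketch -- the DC names and replication factors below are placeholders to adapt to your cluster):

ALTER KEYSPACE system_distributed WITH replication = {'class': 'NetworkTopologyStrategy', '<dc1_name>': '3', '<dc2_name>': '3', '<dc3_name>': '3'};

followed by a repair of that keyspace so the existing data is redistributed to the new replicas.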
C*heers,
-----------------------
Alain Rodriguez - @arodream - al...@thelastpickle.com
France

The Last Pickle - Apache Cassandra Consulting
http://www.thelastpickle.com

2016-09-22 15:47 GMT+02:00 Timo Ahokas <timo.aho...@gmail.com>:

Hi,

We have a Cassandra 3.0.8 cluster (recently upgraded from 2.1.15) currently running in two data centers (13 and 19 nodes, RF 3 in both). We are adding a third data center before decommissioning one of the earlier ones. Installing Cassandra (3.0.8) goes fine and all the nodes join the cluster (not set to bootstrap, as documented in https://docs.datastax.com/en/cassandra/3.0/cassandra/operations/opsAddDCToCluster.html).

When trying to rebuild nodes in the new DC from a previous DC (nodetool rebuild -- DC1), we get the following error:

Unable to find sufficient sources for streaming range (597769692463489739,597931451954862346] in keyspace system_distributed

The same error occurs whichever of the two existing DCs we try to rebuild from.

We run primary-range repairs (nodetool repair -pr) on all nodes twice a week via cron.

Any advice on how to get the rebuild started?

Best regards,
Timo
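For reference, each new DC2 node was brought up roughly as follows before the rebuild attempt (paraphrased from the DataStax procedure linked above, not the exact configuration):

# cassandra.yaml on the new DC2 nodes: join the ring without streaming data
auto_bootstrap: false

# once all DC2 nodes are up and in the ring, on each DC2 node:
nodetool rebuild -- DC1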