It looks like you have a replication factor of 3 and a total data size of 1.43 GB per node. That's a very small amount of data. Assuming the bottleneck is the network, not CPU or disk, and your 50 Mbps bandwidth is available between each pair of servers across the two DCs (i.e. it is not the total bandwidth shared between the DCs), the streaming process itself should take only minutes.
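
As a rough back-of-the-envelope check of that estimate, here is a minimal sketch in Python. It assumes decimal units (1 GB = 10^9 bytes, 1 Mbps = 10^6 bits/s), that each new node has to receive roughly one node's worth of data (1.43 GB), and that the link is the only bottleneck. The function name and the `efficiency` factor are hypothetical, included only to show how protocol overhead or a shared link would stretch the figure; the number to trust is the streaming speed you actually measure.

```python
def estimate_stream_seconds(data_gb: float, link_mbps: float,
                            efficiency: float = 1.0) -> float:
    """Estimate the seconds needed to stream data_gb over a link_mbps link.

    efficiency is an assumed fraction (0 < efficiency <= 1) of the nominal
    bandwidth that streaming actually achieves; it is not a Cassandra setting.
    """
    if data_gb < 0:
        raise ValueError("data_gb must be non-negative")
    if link_mbps <= 0:
        raise ValueError("link_mbps must be positive")
    if not 0 < efficiency <= 1:
        raise ValueError("efficiency must be in (0, 1]")

    total_bits = data_gb * 1e9 * 8                    # GB -> bytes -> bits
    usable_bits_per_second = link_mbps * 1e6 * efficiency
    return total_bits / usable_bits_per_second


if __name__ == "__main__":
    # 1.43 GB per node and 50 Mbps are the figures quoted in this thread.
    for eff in (1.0, 0.5):
        seconds = estimate_stream_seconds(1.43, 50, eff)
        print(f"efficiency {eff:.0%}: {seconds:.0f} s (~{seconds / 60:.1f} min)")
```

Even if only half of the nominal bandwidth is achieved, the result stays in the range of minutes per node. If the 50 Mbps is instead shared by all three receiving nodes, divide it by three before plugging it in.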

On 26/09/2022 12:14, Kaushal Shriyan wrote:



On Fri, Sep 23, 2022 at 8:39 PM Bowen Song via user <user@cassandra.apache.org> wrote:

    What's your definition of "sync"? Streaming all the existing data
    to the new DC? Or the time lag between a write request completing
    in one DC and completing in the other DC?

    The former can be estimated based on a few facts about your setup
    (number of nodes, data size, etc.) and some measured data
    (streaming speed).

    The latter is usually just slightly above the network latency, but
    can spike if and when the network between the DCs suffers from
    temporary connectivity issues.


Hi Bowen,

Thanks for the quick response. I was referring to streaming all the existing data to the new DC (DC2). We have


    On 23/09/2022 15:58, Kaushal Shriyan wrote:
    Hi,

    Is there a way to measure cassandra nodes data sync time between
    DC1 and DC2? Currently DC1 is the prod datacenter. I am adding
    DC2 to the new data center by referring to
    https://docs.apigee.com/private-cloud/v4.51.00/adding-data-center?hl=en.

    https://docs.apigee.com/release/supported-software
    Cassandra version :- 2.1.22

    Is there a way to measure the time taken to sync the data between
    the current prod DC1 (Cassandra nodes 1, 2, 3) and the new DC2
    (Cassandra nodes 4, 5, 6)?

    Thanks in advance.

    Best Regards,

    Kaushal

Hi Bowen,

Thanks for the quick response. I meant streaming all the existing data from the current prod DC1 (Cassandra nodes 1, 2, 3) to the new DC2 (Cassandra nodes 4, 5, 6). The bandwidth between DC1 and DC2 is around 50 Mbps. Please let me know if you need any additional details. Thanks in advance.

/opt/apigee/apigee-cassandra/bin/nodetool status
Datacenter: dc-1
================
Status=Up/Down
|/ State=Normal/Leaving/Joining/Moving
--  Address         Load     Tokens  Owns (effective)  Host ID                               Rack
UN  192.198.11.4    1.43 GB  1       100.0%            dbfbd44f-kec5-4f91-bc7d-c31582aec35a  ra-1
UN  192.198.11.128  1.43 GB  1       100.0%            bc55019c-8ccb-4403-9dc4-481b90a262f6  ra-1
UN  192.198.11.3    1.43 GB  1       100.0%            4402901c-4562-4f0f-b14a-4eed40a9836c  ra-1

_On Node1_
du -ch /opt/apigee/data/apigee-cassandra/data
1.7G total

_On Node2_
du -ch /opt/apigee/data/apigee-cassandra/data

_On Node3_
du -ch /opt/apigee/data/apigee-cassandra/data

Best Regards,

Kaushal
