Doug Rohrer created CASSANALYTICS-40:
----------------------------------------

             Summary: Bandwidth reduction (especially cross-dc writes)
                 Key: CASSANALYTICS-40
                 URL: https://issues.apache.org/jira/browse/CASSANALYTICS-40
             Project: Apache Cassandra Analytics
          Issue Type: Task
            Reporter: Doug Rohrer


Today, the bulk writer (in direct mode) sends data from each Spark task 
directly to every replica, which can consume a lot of WAN bandwidth on 
Cassandra clusters that span multiple DCs. We should do something similar to 
the S3 path: send the SSTable(s) to a single replica per DC and then distribute 
the data from that replica to the other replicas inside the DC itself, or find 
some other, more efficient way to distribute the data.
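
To make the potential saving concrete, here is a rough sketch (not the bulk 
writer's actual code) that counts cross-DC uploads for the two approaches. 
The function names, the replica list, and the assumption that the Spark task 
runs in dc1 are all hypothetical and exist only for illustration.

```python
from collections import defaultdict


def group_by_dc(replicas):
    """Group (host, dc) pairs into {dc: [hosts]}, preserving input order."""
    by_dc = defaultdict(list)
    for host, dc in replicas:
        by_dc[dc].append(host)
    return dict(by_dc)


def plan_direct(replicas, source_dc):
    """Behaviour described above: the Spark task uploads to every replica."""
    return [
        {"to": host, "dc": dc, "cross_dc": dc != source_dc}
        for host, dc in replicas
    ]


def plan_relay(replicas, source_dc):
    """Proposed idea: one upload per DC; that replica forwards inside its DC."""
    plan = []
    for dc, hosts in group_by_dc(replicas).items():
        entry, rest = hosts[0], hosts[1:]  # picking the first host is arbitrary
        plan.append(
            {"to": entry, "dc": dc, "cross_dc": dc != source_dc, "forward_to": rest}
        )
    return plan


def cross_dc_uploads(plan):
    """Count uploads from the Spark task that have to cross a DC boundary."""
    return sum(1 for step in plan if step["cross_dc"])


# Hypothetical token range with three replicas in each of two DCs.
replicas = [
    ("a1", "dc1"), ("a2", "dc1"), ("a3", "dc1"),
    ("b1", "dc2"), ("b2", "dc2"), ("b3", "dc2"),
]
direct = plan_direct(replicas, source_dc="dc1")
relay = plan_relay(replicas, source_dc="dc1")
print(cross_dc_uploads(direct), cross_dc_uploads(relay))  # prints: 3 1
```

In this toy setup the relay plan sends each SSTable across the WAN once per 
remote DC instead of once per remote replica. How the receiving replica would 
actually forward the data within its DC is left open, as it is in the 
description above.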



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
