Doug Rohrer created CASSANALYTICS-40:
----------------------------------------
Summary: Bandwidth reduction (especially cross-dc writes)
Key: CASSANALYTICS-40
URL: https://issues.apache.org/jira/browse/CASSANALYTICS-40
Project: Apache Cassandra Analytics
Issue Type: Task
Reporter: Doug Rohrer

Today, the bulk writer (in direct mode) sends data directly to every replica from each Spark task, which can consume a lot of WAN bandwidth on cross-DC Cassandra clusters. We should do something similar to the S3 path: send the SSTable(s) to a single replica per DC and then distribute the data from that replica to the other replicas inside the DC, or find some other, more efficient way to distribute the data.
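To make the bandwidth argument concrete, here is a minimal back-of-the-envelope sketch in Python. It only counts how many SSTable copies cross a DC boundary for a single token range under the two approaches. The function name `cross_dc_copies`, the `relay_per_dc` flag, and the two-DC topology with three replicas each are illustrative assumptions, not part of the Cassandra Analytics code base or of this ticket.

```python
def cross_dc_copies(replicas_by_dc, source_dc, relay_per_dc):
    """Count SSTable copies that cross a DC boundary for one token range.

    replicas_by_dc: hypothetical mapping of DC name -> number of replicas.
    source_dc:      DC in which the Spark task is assumed to run.
    relay_per_dc:   False models direct mode as described in the ticket
                    (one copy per replica); True models the proposal
                    (one copy per remote DC, then intra-DC fan-out).
    """
    remote = {dc: n for dc, n in replicas_by_dc.items() if dc != source_dc}
    if relay_per_dc:
        # One WAN transfer per remote DC that actually holds replicas.
        return sum(1 for n in remote.values() if n > 0)
    # One WAN transfer per remote replica.
    return sum(remote.values())


# Assumed example topology: two DCs, three replicas in each.
replicas = {"dc1": 3, "dc2": 3}
direct = cross_dc_copies(replicas, "dc1", relay_per_dc=False)
relayed = cross_dc_copies(replicas, "dc1", relay_per_dc=True)
print(direct, relayed)
```

Under these assumptions the WAN traffic per remote DC scales with the replication factor in direct mode, but stays at one copy per DC when a single replica relays the data locally. The sketch ignores the cost of the intra-DC fan-out itself, which the ticket leaves open.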