Doug Rohrer created CASSANALYTICS-36:
----------------------------------------

             Summary: Bulk Reader should dynamically size the Spark job based on estimated table size
                 Key: CASSANALYTICS-36
                 URL: https://issues.apache.org/jira/browse/CASSANALYTICS-36
             Project: Apache Cassandra Analytics
          Issue Type: New Feature
          Components: Reader
            Reporter: Doug Rohrer


When reading a smaller dataset, leveraging a large number of Spark cores is actually less efficient than using a smaller number. By using estimated table size provided by Cassandra (similar to the data provided by `nodetool tablestats`) we can do a better job of limiting resource utilization and decreasing job runtime.
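
As a rough illustration of the sizing idea described in this issue, the sketch below maps an estimated table size to a core count. It is not the Bulk Reader's implementation: the class `JobSizeEstimator`, the method `coresFor`, the per-core data target, and the core cap are all hypothetical names and parameters chosen for the example, and how the size estimate is fetched from Cassandra is deliberately left out.

```java
/**
 * Hypothetical helper (not part of Cassandra Analytics): picks a core count
 * for a read job from an estimated table size.
 */
public final class JobSizeEstimator {

    private JobSizeEstimator() {
    }

    /**
     * @param estimatedTableBytes estimated on-disk size of the table (assumed input)
     * @param targetBytesPerCore  assumed amount of data one core should process
     * @param maxCores            upper bound on cores the job may use
     * @return a core count between 1 and maxCores
     */
    public static int coresFor(long estimatedTableBytes, long targetBytesPerCore, int maxCores) {
        if (targetBytesPerCore <= 0 || maxCores <= 0) {
            throw new IllegalArgumentException("targetBytesPerCore and maxCores must be positive");
        }
        if (estimatedTableBytes <= 0) {
            // Empty or unknown estimate: fall back to a single core.
            return 1;
        }
        // Ceiling division written to avoid overflow for large sizes.
        long needed = (estimatedTableBytes - 1) / targetBytesPerCore + 1;
        return (int) Math.min(needed, (long) maxCores);
    }

    public static void main(String[] args) {
        long oneGiB = 1L << 30;
        // A 3 GiB table with a 1 GiB-per-core target and a cap of 128 cores.
        System.out.println(coresFor(3 * oneGiB, oneGiB, 128));
    }
}
```

With these assumed parameters, a small table gets only a few cores while a very large one is capped at `maxCores`, which is the behaviour the issue asks for; suitable values for the target and cap would need to be determined by measurement.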