frankgh commented on code in PR #93:
URL: https://github.com/apache/cassandra-analytics/pull/93#discussion_r2078271482


##########
cassandra-analytics-integration-framework/src/main/java/org/apache/cassandra/distributed/impl/CassandraCluster.java:
##########
@@ -101,20 +101,24 @@ public AbstractCluster<I> initializeCluster(String versionString,
                 .withTokenCount(configuration.tokenCount)
                 .withDataDirCount(configuration.numDataDirsPerInstance);
+        if (configuration.tokenCount > 1) {

Review Comment:
   NIT style:
   ```suggestion
           if (configuration.tokenCount > 1) {
   ```



##########
cassandra-analytics-core/src/main/java/org/apache/cassandra/spark/data/CassandraDataLayer.java:
##########
@@ -339,11 +339,13 @@ else if (clearSnapshotStrategy.hasTTL())
     private CompletionStage<Map<String, AvailabilityHint>> createSnapshot(ClientConfig options, RingResponse ring)
     {
         Map<String, PartitionedDataLayer.AvailabilityHint> availabilityHints = new ConcurrentHashMap<>(ring.size());
+        Map<String, Boolean> distinctInstances = new HashMap<>();
         // Fire off create snapshot request across the entire cluster
         List<CompletableFuture<Void>> futures = ring.stream()
             .filter(ringEntry -> datacenter == null || datacenter.equals(ringEntry.datacenter()))
+            .filter(ringEntry -> !options.createSnapshotFilterDistinctInstances() || distinctInstances.putIfAbsent(ringEntry.fqdn(), true) == null)

Review Comment:
   NIT, we can alternatively use a HashSet
   ```suggestion
               .filter(ringEntry -> !options.createSnapshotFilterDistinctInstances() || distinctInstances.add(ringEntry.fqdn()))
   ```



##########
cassandra-analytics-core/src/main/java/org/apache/cassandra/spark/data/ClientConfig.java:
##########
@@ -53,6 +53,11 @@ public class ClientConfig
     public static final String SNAPSHOT_NAME_KEY = "snapshotName";
     public static final String DC_KEY = "dc";
     public static final String CREATE_SNAPSHOT_KEY = "createSnapshot";
+    /**
+     * Option to filter distinct instances before creating snapshots. This is only applicable when
+     * using vnodes where the token ring will contain multiple entries per instance.
+     */
+    public static final String CREATE_SNAPSHOT_FILTER_DISTINCT_INSTANCES_KEY = "createSnapshotFilterDistinctInstances";

Review Comment:
   We could leverage [NodeSettings#tokens](https://github.com/apache/cassandra-sidecar/blob/trunk/client-common/src/main/java/org/apache/cassandra/sidecar/common/response/NodeSettings.java#L48) to make this determination automatically.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org
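
For illustration only, here is a minimal sketch of the HashSet suggestion made for `CassandraDataLayer.java`. `Set#add` returns `true` only when the element was not already in the set, so it can be used directly as the filter predicate in place of `putIfAbsent(...) == null`. The `RingEntry` class, the `firstEntryPerInstance` helper, and the host names are hypothetical stand-ins, not the project's actual types.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;
import java.util.stream.Collectors;

final class DistinctInstanceFilter
{
    // Hypothetical stand-in for a token ring entry; with vnodes, one
    // instance (fqdn) appears once per token it owns.
    static final class RingEntry
    {
        final String fqdn;
        final String token;

        RingEntry(String fqdn, String token)
        {
            this.fqdn = fqdn;
            this.token = token;
        }
    }

    // Keeps the first ring entry seen for each fqdn when filterDistinct is true;
    // returns every entry unchanged when it is false.
    static List<RingEntry> firstEntryPerInstance(List<RingEntry> ring, boolean filterDistinct)
    {
        Set<String> seen = new HashSet<>();
        return ring.stream()
                   // Set#add is false for an fqdn that was already recorded
                   .filter(entry -> !filterDistinct || seen.add(entry.fqdn))
                   .collect(Collectors.toList());
    }

    public static void main(String[] args)
    {
        List<RingEntry> ring = Arrays.asList(new RingEntry("node1.example.com", "-100"),
                                             new RingEntry("node2.example.com", "0"),
                                             new RingEntry("node1.example.com", "100"));
        System.out.println(firstEntryPerInstance(ring, true).size());  // prints 2
        System.out.println(firstEntryPerInstance(ring, false).size()); // prints 3
    }
}
```

The sketch assumes a sequential stream, as in the diff under review. A plain `HashSet` is not thread-safe, so a parallel stream would need a concurrent set such as `ConcurrentHashMap.newKeySet()`.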