yifan-c commented on code in PR #93:
URL: 
https://github.com/apache/cassandra-analytics/pull/93#discussion_r2078084519


##########
cassandra-analytics-integration-tests/src/test/java/org/apache/cassandra/analytics/testcontainer/BulkWriteS3CompatModeSimpleMultipleTokensTestImpl.java:
##########
@@ -0,0 +1,12 @@
+package org.apache.cassandra.analytics.testcontainer;
+
+import org.apache.cassandra.testing.ClusterBuilderConfiguration;
+
+public class BulkWriteS3CompatModeSimpleMultipleTokensTestImpl extends 
BulkWriteS3CompatModeSimpleTest {

Review Comment:
   maybe rename to `BulkWriteS3CompatModeSimpleMultipleTokensTest`? The ending 
`Impl` is probably added by mistake. 



##########
cassandra-analytics-integration-tests/src/test/java/org/apache/cassandra/analytics/CassandraAnalyticsSimpleTest.java:
##########
@@ -19,29 +19,25 @@
 
 package org.apache.cassandra.analytics;
 
-import java.util.Arrays;
-import java.util.HashMap;
-import java.util.List;
-import java.util.Map;
-import java.util.function.Function;
-import java.util.stream.Stream;
-
-import org.junit.jupiter.api.Timeout;
-import org.junit.jupiter.params.ParameterizedTest;
-import org.junit.jupiter.params.provider.Arguments;
-import org.junit.jupiter.params.provider.MethodSource;
-
 import org.apache.cassandra.sidecar.testing.QualifiedName;
 import org.apache.cassandra.spark.bulkwriter.WriterOptions;
 import org.apache.cassandra.testing.ClusterBuilderConfiguration;
 import org.apache.spark.sql.Dataset;
 import org.apache.spark.sql.Row;
 import org.apache.spark.sql.SparkSession;
+import org.junit.jupiter.api.Timeout;
+import org.junit.jupiter.params.ParameterizedTest;
+import org.junit.jupiter.params.provider.Arguments;
+import org.junit.jupiter.params.provider.MethodSource;
+
+import java.util.Arrays;
+import java.util.HashMap;
+import java.util.List;
+import java.util.Map;
+import java.util.function.Function;
+import java.util.stream.Stream;
 
-import static 
org.apache.cassandra.testing.TestUtils.CREATE_TEST_TABLE_STATEMENT;
-import static org.apache.cassandra.testing.TestUtils.DC1_RF3;
-import static org.apache.cassandra.testing.TestUtils.ROW_COUNT;
-import static org.apache.cassandra.testing.TestUtils.TEST_KEYSPACE;
+import static org.apache.cassandra.testing.TestUtils.*;

Review Comment:
   codestyle nit: the project prefers explicit imports over wildcard import. 



##########
cassandra-analytics-integration-tests/src/test/java/org/apache/cassandra/analytics/BulkWriteDownSidecarMultipleTokensTest.java:
##########
@@ -0,0 +1,12 @@
+package org.apache.cassandra.analytics;
+
+import org.apache.cassandra.testing.ClusterBuilderConfiguration;
+
+public class BulkWriteDownSidecarMultipleTokensTest extends 
BulkWriteDownSidecarTest {
+    @Override
+    protected ClusterBuilderConfiguration testClusterConfiguration()
+    {
+        return super.testClusterConfiguration()
+                .tokenCount(4);

Review Comment:
   nit: align with `.`
   
   ```suggestion
           return super.testClusterConfiguration()
                       .tokenCount(4);
   ```



##########
cassandra-analytics-core/src/main/java/org/apache/cassandra/spark/data/ClientConfig.java:
##########
@@ -53,6 +53,11 @@ public class ClientConfig
     public static final String SNAPSHOT_NAME_KEY = "snapshotName";
     public static final String DC_KEY = "dc";
     public static final String CREATE_SNAPSHOT_KEY = "createSnapshot";
+    /**
+     * Option to filter distinct instances before creating snapshots. This is 
only applicable when
+     * using vnodes where the token ring will contain multiple entries per 
instance.
+     */
+    public static final String CREATE_SNAPSHOT_FILTER_DISTINCT_INSTANCES_KEY = 
"createSnapshotFilterDistinctInstances";

Review Comment:
   IMO, a flag is wanted, when (1) there is nothing in the system to derive a 
condition (that alters behavior) and it requires an external signal, or (2) 
explicitness is _required_. I think in this case, if we can reliably derive the 
condition from the ring response, it would void the need for the flag. Beside, 
it might cause confusion when cluster is running vnode but 
`createSnapshotFilterDistinctInstances` is set to `false`. I think the 
expectation is to only create snapshot once on each physical host always. If 
true, the flag is also unnecessary. 



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org

Reply via email to