aweisberg commented on code in PR #62: URL: https://github.com/apache/cassandra-analytics/pull/62#discussion_r1672788863
########## cassandra-analytics-common/src/main/java/org/apache/cassandra/spark/stats/IStats.java: ########## @@ -0,0 +1,59 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one + * or more contributor license agreements. See the NOTICE file + * distributed with this work for additional information + * regarding copyright ownership. The ASF licenses this file + * to you under the Apache License, Version 2.0 (the + * "License"); you may not use this file except in compliance + * with the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, + * software distributed under the License is distributed on an + * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY + * KIND, either express or implied. See the License for the + * specific language governing permissions and limitations + * under the License. + */ + +package org.apache.cassandra.spark.stats; + +import org.apache.cassandra.spark.utils.streaming.CassandraFile; +import org.apache.cassandra.spark.utils.streaming.CassandraFileSource; + +/** + * Generic Stats interface that works across all CassandraFile FileTypes. + * + * @param <FileType> + */ +public interface IStats<FileType extends CassandraFile> +{ + default void inputStreamEnd(CassandraFileSource<FileType> source, long runTimeNanos, long totalNanosBlocked) Review Comment: If these are all going to be empty they could be single line with no line breaks between. Not really that important. ########## cassandra-analytics-common/src/main/java/org/apache/cassandra/spark/reader/StreamScanner.java: ########## @@ -49,30 +49,31 @@ * <p> * Upon return from the next() call the current values of the scanner can be obtained by calling * the methods in Rid, getPartitionKey(), getColumnName(), getValue(). + * * @param <Type> type of object returned by rid() method. */ @SuppressWarnings("unused") public interface StreamScanner<Type> extends Closeable Review Comment: Why doesn't this implement iterator so it can be used with all the iterator related things? Not really in scope for this refactor just curious. ########## cassandra-analytics-core/src/main/java/org/apache/cassandra/spark/data/LocalDataLayer.java: ########## @@ -174,7 +174,7 @@ public static LocalDataLayer from(Map<String, String> options) .filter(StringUtils::isNotEmpty) .collect(Collectors.toSet()), SchemaFeatureSet.initializeFromOptions(options), - getBoolean(options, lowerCaseKey("useSSTableInputStream"), false), + getBoolean(options, lowerCaseKey("useBufferingInputStream"), false), Review Comment: Is anything going to break because the name of this option changed? Should there be compatibility with the old name? ########## cassandra-analytics-common/src/main/java/org/apache/cassandra/spark/utils/ByteBufferUtils.java: ########## @@ -222,4 +212,80 @@ public static void skipFully(InputStream is, long length) throws IOException throw new EOFException("EOF after " + skipped + " bytes out of " + length); } } + + // Extract component position from buffer; return null if there are not enough components + public static ByteBuffer extractComponent(ByteBuffer buffer, int position) Review Comment: These methods look like they are here to avoid depending on Spark. It would be helpful to comment where they originally came from. ########## cassandra-analytics-common/src/main/java/org/apache/cassandra/spark/utils/streaming/CassandraFile.java: ########## @@ -0,0 +1,24 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one + * or more contributor license agreements. See the NOTICE file + * distributed with this work for additional information + * regarding copyright ownership. The ASF licenses this file + * to you under the Apache License, Version 2.0 (the + * "License"); you may not use this file except in compliance + * with the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, + * software distributed under the License is distributed on an + * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY + * KIND, either express or implied. See the License for the + * specific language governing permissions and limitations + * under the License. + */ + +package org.apache.cassandra.spark.utils.streaming; + +public interface CassandraFile Review Comment: Are there really no common methods? All the usages I can see are in `common` so not sure who would use this interface? Seems like in practice the usage is always something concrete that extends `CassandraFile` but at that point the concrete class is doing all the heavy lifting. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: [email protected] For queries about this service, please contact Infrastructure at: [email protected] --------------------------------------------------------------------- To unsubscribe, e-mail: [email protected] For additional commands, e-mail: [email protected]
