xushiyan commented on a change in pull request #3413:
URL: https://github.com/apache/hudi/pull/3413#discussion_r717267707



##########
File path: 
hudi-utilities/src/test/java/org/apache/hudi/utilities/testutils/UtilitiesTestBase.java
##########
@@ -314,6 +320,28 @@ public static void saveParquetToDFS(List<GenericRecord> 
records, Path targetFile
       }
     }
 
+    public static void saveORCToDFS(List<GenericRecord> records, Path 
targetFile) throws IOException {
+      TypeDescription orcSchema = 
AvroOrcUtils.createOrcSchema(HoodieTestDataGenerator.AVRO_SCHEMA);

Review comment:
       ditto

##########
File path: 
hudi-utilities/src/test/java/org/apache/hudi/utilities/functional/TestHoodieDeltaStreamerBase.java
##########
@@ -247,4 +254,27 @@ protected static void prepareParquetDFSFiles(int 
numRecords, String baseParquetP
           dataGenerator.generateInserts("000", numRecords)), new Path(path));
     }
   }
+
+  protected static void prepareORCDFSFiles(int numRecords) throws IOException {
+    prepareORCDFSFiles(numRecords, ORC_SOURCE_ROOT);
+  }
+
+  protected static void prepareORCDFSFiles(int numRecords, String baseORCPath) 
throws IOException {
+    prepareORCDFSFiles(numRecords, baseORCPath, FIRST_ORC_FILE_NAME, false, 
null, null);
+  }
+
+  protected static void prepareORCDFSFiles(int numRecords, String baseORCPath, 
String fileName, boolean useCustomSchema,
+                                               String schemaStr, Schema 
schema) throws IOException {
+    String path = baseORCPath + "/" + fileName;
+    HoodieTestDataGenerator dataGenerator = new HoodieTestDataGenerator();
+    if (useCustomSchema) {
+      Helpers.saveORCToDFS(Helpers.toGenericRecords(
+              dataGenerator.generateInsertsAsPerSchema("000", numRecords, 
schemaStr),
+              schema), new Path(path), 
AvroOrcUtils.createOrcSchema(HoodieTestDataGenerator.AVRO_TRIP_SCHEMA));

Review comment:
       better if add a `HoodieTestDataGenerator.ORC_TRIP_SCHEMA` in the class 
and convert this inside?

##########
File path: 
hudi-utilities/src/test/java/org/apache/hudi/utilities/functional/TestHoodieDeltaStreamer.java
##########
@@ -1622,6 +1651,16 @@ public void 
testParquetDFSSourceWithSchemaFilesAndTransformer() throws Exception
     testParquetDFSSource(true, 
Collections.singletonList(TripsWithDistanceTransformer.class.getName()));
   }
 
+  @Test
+  public void testORCDFSSourceWithoutSchemaProviderAndNoTransformer() throws 
Exception {
+    testORCDFSSource(false, null);
+  }
+
+  @Test
+  public void testORCDFSSourceWithSchemaFilesAndTransformer() throws Exception 
{
+    testORCDFSSource(true, 
Collections.singletonList(TripsWithDistanceTransformer.class.getName()));
+  }

Review comment:
       can we use `@ParameterizedTest` here? with `@MethodSource` returning 
`Stream<Arguments>` to make it cleaner
   

##########
File path: 
hudi-utilities/src/test/java/org/apache/hudi/utilities/functional/TestHoodieDeltaStreamer.java
##########
@@ -1936,4 +1972,12 @@ public Schema getTargetSchema() {
     }
   }
 
+  private static Stream<Arguments> testArguments() {

Review comment:
       `testArguments` is not specific enough. use the same method name then 
`@MethodSource` does not need to have its argument. it'll look for the method 
with the same name. so this argument provider method will only serve its own 
test case method
   
   ```suggestion
     private static Stream<Arguments> testORCDFSSource() {
   ```

##########
File path: 
hudi-utilities/src/test/java/org/apache/hudi/utilities/functional/TestHoodieDeltaStreamer.java
##########
@@ -1622,6 +1652,12 @@ public void 
testParquetDFSSourceWithSchemaFilesAndTransformer() throws Exception
     testParquetDFSSource(true, 
Collections.singletonList(TripsWithDistanceTransformer.class.getName()));
   }
 
+  @ParameterizedTest
+  @MethodSource("testArguments")

Review comment:
       these 2 should be annotated on `testORCDFSSource()` method itself then 
this method not needed

##########
File path: 
hudi-common/src/test/java/org/apache/hudi/common/testutils/HoodieTestDataGenerator.java
##########
@@ -129,10 +129,12 @@
 
 
   public static final Schema AVRO_SCHEMA = new 
Schema.Parser().parse(TRIP_EXAMPLE_SCHEMA);
+  public static final Schema ORC_SCHEMA = new 
Schema.Parser().parse(TRIP_EXAMPLE_SCHEMA);
   public static final Schema AVRO_SCHEMA_WITH_METADATA_FIELDS =
       HoodieAvroUtils.addMetadataFields(AVRO_SCHEMA);
   public static final Schema AVRO_SHORT_TRIP_SCHEMA = new 
Schema.Parser().parse(SHORT_TRIP_SCHEMA);
   public static final Schema AVRO_TRIP_SCHEMA = new 
Schema.Parser().parse(TRIP_SCHEMA);
+  public static final Schema ORC_TRIP_SCHEMA = new 
Schema.Parser().parse(TRIP_SCHEMA);

Review comment:
       actually what i meant is to do `AvroOrcUtils.createOrcSchema()` here for 
this constant so callers can use it directly




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]


Reply via email to