yihua commented on code in PR #13010: URL: https://github.com/apache/hudi/pull/13010#discussion_r2014670266
##########
hudi-spark-datasource/hudi-spark/src/test/scala/org/apache/hudi/common/table/read/TestHoodieFileGroupReaderOnSpark.scala:
##########
@@ -176,4 +184,25 @@ class TestHoodieFileGroupReaderOnSpark extends TestHoodieFileGroupReaderBase[Int
     assertEquals(expectedOrderingValue, metadataMap.get(HoodieReaderContext.INTERNAL_META_ORDERING_FIELD))
   }
+
+  @ParameterizedTest
+  @EnumSource(classOf[RecordMergeMode])
+  @throws[Exception]
+  def testReadFileGroupInflightData(recordMergeMode: RecordMergeMode): Unit = {
+    val writeConfigs = new util.HashMap[String, String](getCommonConfigs(recordMergeMode))
+    writeConfigs.put(DataSourceWriteOptions.TABLE_TYPE.key(), DataSourceWriteOptions.MOR_TABLE_TYPE_OPT_VAL)
+    try {
+      val dataGen = new HoodieTestDataGenerator(0xDEEF)
+      try {
+        // One commit; reading one file group containing a base file only
+        commitToTable(dataGen.generateInserts("001", 100), INSERT.value, writeConfigs)
+        validateOutputFromFileGroupReader(getStorageConf, getBasePath, dataGen.getPartitionPaths, true, 0, recordMergeMode)
+
+        commitToTable(dataGen.generateUniqueUpdates("003", 100), UPSERT.value, writeConfigs)

Review Comment:
Could you prepare the table directly instead of using transactions? Also, could this validation be folded into the existing tests, i.e., by adding a new step that removes the completed deltacommit from the timeline, so that the data files become inflight and should not be read? That way we don't need this new test, and we get more coverage across different permutations.
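To illustrate the kind of step meant here, below is a rough sketch only, not Hudi API. `InflightTimelineSketch` and `removeCompletedDeltaCommit` are hypothetical names, and the sketch assumes the timeline is a plain directory in which a completed deltacommit is a file whose name starts with the instant time and ends with `.deltacommit`, next to `.deltacommit.requested` and `.deltacommit.inflight` files. The actual directory location and file naming depend on the Hudi version, so an existing test utility should be preferred if one fits.

```scala
import java.io.File
import java.nio.file.{Files, Path}

// Hypothetical helper for illustration; not part of Hudi.
object InflightTimelineSketch {

  /**
   * Deletes the completed deltacommit file(s) for `instantTime` in `timelineDir`
   * and returns the names of the deleted files. Files ending in
   * `.deltacommit.requested` or `.deltacommit.inflight` are left in place, so
   * the instant (and the data files it wrote) looks inflight afterwards.
   *
   * Assumption: a completed deltacommit is a regular file whose name starts
   * with the instant time and ends with ".deltacommit".
   */
  def removeCompletedDeltaCommit(timelineDir: Path, instantTime: String): Seq[String] = {
    // listFiles() returns null when the path is not a directory.
    val files = Option(timelineDir.toFile.listFiles()).getOrElse(Array.empty[File])
    val completed = files.filter { f =>
      f.isFile && f.getName.startsWith(instantTime) && f.getName.endsWith(".deltacommit")
    }
    completed.foreach(f => Files.delete(f.toPath))
    completed.map(_.getName).toSeq
  }
}
```

An existing test could call something like this after its last `commitToTable(...)` and then validate that the file group reader output does not include records from the removed deltacommit.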