linliu-code commented on code in PR #13950:
URL: https://github.com/apache/hudi/pull/13950#discussion_r2384180369


##########
hudi-spark-datasource/hudi-spark/src/test/scala/org/apache/hudi/TestHoodieSparkSqlWriter.scala:
##########
@@ -618,10 +618,20 @@ def testBulkInsertForDropPartitionColumn(): Unit = {
       
       .setPartitionFields(fooTableParams(DataSourceWriteOptions.PARTITIONPATH_FIELD.key))
       .setKeyGeneratorClassProp(fooTableParams.getOrElse(DataSourceWriteOptions.KEYGENERATOR_CLASS_NAME.key,
         DataSourceWriteOptions.KEYGENERATOR_CLASS_NAME.defaultValue()))
-      if(addBootstrapPath) {
-        tableMetaClientBuilder
-          .setBootstrapBasePath(fooTableParams(HoodieBootstrapConfig.BASE_PATH.key))
-      }
+    if (fooTableParams.contains(HoodieWriteConfig.WRITE_PAYLOAD_CLASS_NAME.key())) {

Review Comment:
   I removed the settings and compared the behavior of master with that of this branch. The findings confirmed my guess:

   1. With no merge mode configs set, the default merge mode is commit time ordering. This is the case on the master branch here, so the table config uses the `commit_time_ordering` merge mode and strategy id.
   2. Some test cases set the `DefaultHoodieRecordPayload` payload class in the configs. That leaves an inconsistent merge configuration: `Triple(commit_time merge mode, DefaultHoodieRecordPayload, commit_time id)`.
   3. During the write on the master branch, `HoodieSparkSqlWriter.mergeParamsAndGetHoodieConfig` did not trigger merge config inference, so the write succeeded.
   
   On this branch, now that `strategy id` is allowed to be null, the inconsistent triple becomes `Triple(commit_time merge mode, DefaultHoodieRecordPayload, null)`. This triggers the inference in `HoodieSparkSqlWriter.mergeParamsAndGetHoodieConfig`, which throws errors in some test cases.
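
To make the difference between the two triples concrete, here is a minimal, self-contained sketch. It is not Hudi code: `MergeConfigs`, `needsInference`, and the rule that inference runs whenever the strategy id is missing are assumptions for illustration, based only on the behavior described above.

```scala
object MergeTripleSketch {
  // Hypothetical model of the (merge mode, payload class, strategy id) triple.
  // None stands in for a null strategy id.
  final case class MergeConfigs(
      mergeMode: String,
      payloadClass: String,
      strategyId: Option[String])

  // Assumption: inference is triggered only when the strategy id is missing.
  // The real condition in HoodieSparkSqlWriter may be broader.
  def needsInference(c: MergeConfigs): Boolean = c.strategyId.isEmpty

  // Triple observed on master: inconsistent, but every field is set.
  val onMaster: MergeConfigs =
    MergeConfigs("COMMIT_TIME_ORDERING", "DefaultHoodieRecordPayload", Some("commit_time id"))

  // Triple observed on this branch: the strategy id is now null.
  val onBranch: MergeConfigs = onMaster.copy(strategyId = None)

  def main(args: Array[String]): Unit = {
    println(needsInference(onMaster)) // false: inference is skipped, the write succeeds
    println(needsInference(onBranch)) // true: inference runs against inconsistent configs
  }
}
```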
   
   Therefore, we need to pass these configs to the metaclient to avoid such inconsistency in the first place. We probably need to fix the `HoodieSparkSqlWriter.mergeParamsAndGetHoodieConfig` logic eventually. I did not touch it in this PR to avoid adding more complexity.
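
The idea of forwarding only the merge-related options that a test actually sets can be sketched as follows. This is a standalone illustration, not the test's code: `TableBuilder`, `MergeKeys`, and the key strings are hypothetical placeholders rather than Hudi's real builder or config keys.

```scala
object ForwardMergeConfigsSketch {
  // Hypothetical stand-in for a table meta client builder.
  final case class TableBuilder(props: Map[String, String] = Map.empty) {
    def set(key: String, value: String): TableBuilder =
      copy(props = props + (key -> value))
  }

  // Placeholder key names, chosen for illustration only.
  val MergeKeys: Seq[String] =
    Seq("merge.mode", "payload.class", "merge.strategy.id")

  // Copy each merge-related option into the builder only if the caller set it,
  // so the table config and the write config start from the same values.
  def forward(params: Map[String, String], builder: TableBuilder): TableBuilder =
    MergeKeys.foldLeft(builder) { (b, key) =>
      params.get(key).fold(b)(value => b.set(key, value))
    }
}
```

For example, `forward(Map("payload.class" -> "DefaultHoodieRecordPayload", "other" -> "x"), TableBuilder())` yields a builder holding only the `payload.class` entry; unrelated options and unset merge options are left alone.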



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

