yihua commented on code in PR #13650:
URL: https://github.com/apache/hudi/pull/13650#discussion_r2299453965


##########
hudi-spark-datasource/hudi-spark-common/src/main/scala/org/apache/hudi/SparkBaseIndexSupport.scala:
##########
@@ -173,18 +188,14 @@ abstract class SparkBaseIndexSupport(spark: SparkSession,
       var recordKeyQueries: List[Expression] = List.empty
       var compositeRecordKeys: List[String] = List.empty
       val recordKeyOpt = getRecordKeyConfig
-      
       val isComplexRecordKey = {
+        val keyGeneratorClassName = metaClient.getTableConfig.getKeyGeneratorClassName
         val fieldCount = recordKeyOpt.map(recordKeys => recordKeys.length).getOrElse(0)
-        val encodeFieldNameConfig = metaClient.getTableConfig.getProps.getProperty(
-          org.apache.hudi.config.HoodieWriteConfig.COMPLEX_KEYGEN_ENCODE_SINGLE_RECORD_KEY_FIELD_NAME.key(),
-          org.apache.hudi.config.HoodieWriteConfig.COMPLEX_KEYGEN_ENCODE_SINGLE_RECORD_KEY_FIELD_NAME.defaultValue().toString
-        ).toBoolean
-        
+        val isUsingComplexKeyGen = isComplexKeyGenerator(keyGeneratorClassName)
         // Consider as complex if:
         // 1. Multiple fields (> 1), OR
-        // 2. Single field with complex keygen encoding enabled
-        (fieldCount > 1) || (fieldCount == 1 && encodeFieldNameConfig)
+        // 2. Using complex key generator with single field
+        fieldCount > 1 || (isUsingComplexKeyGen && fieldCount == 1)

Review Comment:
   @danny0405 on the Spark/Java side, a user can specify the Complex key 
generator with a single record key field and a single partition path field. 
The writer still writes the data successfully, and the table ends up with the 
following table configs:
   ```
   hoodie.table.keygenerator.type=COMPLEX
   hoodie.table.partition.fields=partition
   hoodie.table.recordkey.fields=_row_key
   ```
   This can happen if a user has a centralized config system that always 
applies the Complex key generator.  The record key is then encoded as 
`_row_key:76fa9f9c-a3f5-4d8a-851b-82f07a7ffab1`.
   
   Though such a configuration setup is not recommended, it can technically 
happen, so we cannot check the number of partition fields.  The correct check 
is thus `(isUsingComplexKeyGen && fieldCount == 1)`.
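
   To make the intended behavior concrete, here is a minimal standalone 
sketch of the check, not the actual code in `SparkBaseIndexSupport`.  The 
object name `ComplexKeyCheck` and both method signatures are hypothetical, 
and `encodeRecordKey` only mirrors the `field:value` layout shown above, 
assuming multiple fields are joined with commas.
   ```scala
   object ComplexKeyCheck {
     // Hypothetical helper: the key is "complex" when it has multiple fields,
     // or when the complex key generator is used with a single field.
     // The number of partition fields is deliberately not consulted.
     def isComplexRecordKey(fieldCount: Int, isUsingComplexKeyGen: Boolean): Boolean =
       fieldCount > 1 || (isUsingComplexKeyGen && fieldCount == 1)

     // Hypothetical illustration of the encoded key a lookup has to match:
     // complex keys carry "field:value" pairs, plain keys carry the raw value.
     def encodeRecordKey(fields: Seq[(String, String)], isUsingComplexKeyGen: Boolean): String =
       if (isComplexRecordKey(fields.length, isUsingComplexKeyGen)) {
         fields.map { case (name, value) => s"$name:$value" }.mkString(",")
       } else {
         fields.map(_._2).mkString(",")
       }
   }
   ```
   With `hoodie.table.keygenerator.type=COMPLEX` and a single `_row_key` 
field, `isComplexRecordKey(1, isUsingComplexKeyGen = true)` is true, so a 
lookup literal has to be built as `_row_key:<value>` rather than the bare 
value.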


