empcl commented on code in PR #12614:
URL: https://github.com/apache/hudi/pull/12614#discussion_r2017819313
##########
hudi-spark-datasource/hudi-spark-common/src/main/scala/org/apache/hudi/HoodieFileIndex.scala:
##########
@@ -534,6 +535,15 @@ object HoodieFileIndex extends Logging {
     properties.setProperty(RECORDKEY_FIELD.key, tableConfig.getRecordKeyFields.orElse(Array.empty).mkString(","))
     properties.setProperty(PRECOMBINE_FIELD.key, Option(tableConfig.getPreCombineField).getOrElse(""))
     properties.setProperty(PARTITIONPATH_FIELD.key, HoodieTableConfig.getPartitionFieldPropForKeyGenerator(tableConfig).orElse(""))
+
+    // for simple bucket index, we need to set the INDEX_TYPE, BUCKET_INDEX_HASH_FIELD, BUCKET_INDEX_NUM_BUCKETS
+    val dataBase = Some(tableConfig.getDatabaseName)

Review Comment:
   @danny0405 hello, isn't it user-unfriendly to require users to explicitly specify the number of buckets when reading? Could we persist the bucket count when writing data, for example into the table config, for the case where the bucket count is specified via a SQL hint but not at table creation? That way, downstream users would not need to explicitly specify the bucket count when reading the Hudi table, since they may not care about the table's bucket count.

--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use
the URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscr...@hudi.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org
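
For illustration only, here is a minimal Scala sketch of the reader-side fallback the review comment above is asking about; it is not the actual Hudi implementation. The table-config key `hoodie.table.bucket.index.num.buckets` and the helper `resolveBucketIndexProps` are hypothetical names invented for this sketch; the option keys mirror `BUCKET_INDEX_NUM_BUCKETS` and `INDEX_TYPE` only by convention.

```scala
import java.util.Properties

object BucketIndexPropsSketch {

  // Hypothetical key under which the writer would persist the bucket count
  // when it was supplied through a SQL hint rather than at CREATE TABLE time.
  val TABLE_CONFIG_NUM_BUCKETS = "hoodie.table.bucket.index.num.buckets"

  // Resolve bucket-index options for the reader: prefer what the user passed in,
  // otherwise fall back to the value persisted in the table config, if any.
  def resolveBucketIndexProps(userOptions: Map[String, String],
                              tableConfigProps: Properties,
                              properties: Properties): Unit = {
    val numBuckets = userOptions.get("hoodie.bucket.index.num.buckets")
      .orElse(Option(tableConfigProps.getProperty(TABLE_CONFIG_NUM_BUCKETS)))

    numBuckets.foreach { n =>
      properties.setProperty("hoodie.index.type", "BUCKET")
      properties.setProperty("hoodie.bucket.index.num.buckets", n)
    }
    // If neither source provides a bucket count, leave the properties unset and
    // let the existing read path behave as it does today.
  }
}
```

Under this scheme, a reader that never sets the bucket count would still pick up the value recorded by the writer, which is the user-friendliness the comment is after.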