boneanxs commented on pull request #4905: URL: https://github.com/apache/hudi/pull/4905#issuecomment-1050599665
> @boneanxs : Can I get some more clarity around the issue. Do you mean to say, if you set "hoodie.datasource.clustering.async.enable=true" with spark datasource writes, clustering gets executed inline? or do you mean to say, clustering does not get executed only ?

Thanks for your reply. Since this [PR](https://github.com/apache/hudi/pull/4828) synced the datasource clustering configuration with `HoodieClusteringConfig`, enabling async clustering as follows:

```scala
df.write.option("hoodie.datasource.clustering.async.enable", true)
```

does not actually enable async clustering. The reason is this condition:

https://github.com/apache/hudi/pull/4905/files#diff-8bda4b2174721fd642a5435282834e5d796a320c1d9e1366b27be86bd548d48aL729

```scala
asyncClusteringTriggerFnDefined && client.getConfig.isAsyncClusteringEnabled &&
  parameters.get(ASYNC_CLUSTERING_ENABLE.key).exists(r => r.toBoolean)
```

The last clause looks up `ASYNC_CLUSTERING_ENABLE.key`, which is `hoodie.clustering.async.enabled`, in the raw parameters, so a value set under `hoodie.datasource.clustering.async.enable` is never matched. I think this clause can be removed, so that both `hoodie.clustering.async.enabled` and `hoodie.datasource.clustering.async.enable` can be used to enable the async clustering service.

Also, should we sync the compaction configurations in the same way as clustering? I found that `HoodieCompactionConfig` only uses `hoodie.compact.inline` to trigger synchronous compaction, while `DataSourceWriteOptions` introduces `ASYNC_COMPACT_ENABLE` to enable async compaction. I wonder whether we should move `ASYNC_COMPACT_ENABLE` to `HoodieCompactionConfig`.
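To make the failure mode concrete, here is a minimal, self-contained sketch of the condition described above. It does not use any Hudi classes: `AsyncClusteringCheck`, `clientAsyncEnabled`, `currentCheck`, and `proposedCheck` are hypothetical names, and `clientAsyncEnabled` only models the assumption that, after the config sync, the write config treats either key as enabling async clustering.

```scala
// Hypothetical stand-in for the check discussed above; not Hudi code.
object AsyncClusteringCheck {
  // Key names as quoted in the comment.
  val WriteConfigKey = "hoodie.clustering.async.enabled"
  val DatasourceKey = "hoodie.datasource.clustering.async.enable"

  // Assumption: models client.getConfig.isAsyncClusteringEnabled after the
  // config sync, where either key turns async clustering on.
  def clientAsyncEnabled(params: Map[String, String]): Boolean =
    Seq(WriteConfigKey, DatasourceKey).exists(k => params.get(k).exists(_.toBoolean))

  // Shape of the current condition: the last clause re-checks the raw
  // parameters under the write-config key only.
  def currentCheck(triggerFnDefined: Boolean, params: Map[String, String]): Boolean =
    triggerFnDefined && clientAsyncEnabled(params) &&
      params.get(WriteConfigKey).exists(_.toBoolean)

  // Shape of the proposed condition: the raw-parameter clause is dropped.
  def proposedCheck(triggerFnDefined: Boolean, params: Map[String, String]): Boolean =
    triggerFnDefined && clientAsyncEnabled(params)
}
```

With only the datasource key set, `currentCheck` evaluates to false because the raw-parameter lookup misses, while `proposedCheck` relies on the resolved config alone and evaluates to true.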
