boneanxs commented on pull request #4905:
URL: https://github.com/apache/hudi/pull/4905#issuecomment-1050599665


   > @boneanxs : Can I get some more clarity around the issue. Do you mean to 
say, if you set "hoodie.datasource.clustering.async.enable=true" with spark 
datasource writes, clustering gets executed inline? or do you mean to say, 
clustering does not get executed only ?
   
   Thanks for your reply. Since the datasource clustering configs were synced 
with `HoodieClusteringConfig` by this [pr](https://github.com/apache/hudi/pull/4828), 
if we now enable async clustering as follows:
   
   ```scala
   df.option("hoodie.datasource.clustering.async.enable", true)
   ```
   
   async clustering is not actually enabled. This is because of the check at: 
https://github.com/apache/hudi/pull/4905/files#diff-8bda4b2174721fd642a5435282834e5d796a320c1d9e1366b27be86bd548d48aL729
   
   ```scala
   asyncClusteringTriggerFnDefined && client.getConfig.isAsyncClusteringEnabled &&
     parameters.get(ASYNC_CLUSTERING_ENABLE.key).exists(r => r.toBoolean)
   ```
   The check looks up `ASYNC_CLUSTERING_ENABLE.key`, which is 
`hoodie.clustering.async.enabled`, so setting only 
`hoodie.datasource.clustering.async.enable` does not pass it. I think this 
condition can be removed, so that either `hoodie.clustering.async.enabled` or 
`hoodie.datasource.clustering.async.enable` enables the async clustering service.
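   
   To make the mismatch concrete, here is a minimal standalone sketch of the 
lookup, using a plain `Map` instead of the real writer. The object name and the 
two boolean stand-ins are hypothetical, and it assumes that 
`client.getConfig.isAsyncClusteringEnabled` already returns true once the 
configs are synced:
   
   ```scala
   object AsyncClusteringKeyCheck {
     // Key names as quoted in this comment.
     val DatasourceKey = "hoodie.datasource.clustering.async.enable"
     val ClusteringConfigKey = "hoodie.clustering.async.enabled"

     // Options as the user passed them: only the datasource key is set.
     val parameters: Map[String, String] = Map(DatasourceKey -> "true")

     // Hypothetical stand-ins for the first two conditions (assumed true here).
     val asyncClusteringTriggerFnDefined = true
     val isAsyncClusteringEnabled = true

     // Same shape as the current check: the third condition looks up the other key.
     val currentCheck: Boolean =
       asyncClusteringTriggerFnDefined && isAsyncClusteringEnabled &&
         parameters.get(ClusteringConfigKey).exists(r => r.toBoolean)

     // Proposed shape: drop the raw-parameter lookup and rely on the synced config.
     val proposedCheck: Boolean =
       asyncClusteringTriggerFnDefined && isAsyncClusteringEnabled

     def main(args: Array[String]): Unit = {
       println(s"current check:  $currentCheck")
       println(s"proposed check: $proposedCheck")
     }
   }
   ```
   
   In this sketch the current form evaluates to false because the map has no 
entry for `hoodie.clustering.async.enabled`, while the proposed form depends 
only on the synced config value.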
   
   Also, should we sync the compaction configurations the same way as 
clustering? I found that `HoodieCompactionConfig` only uses 
`hoodie.compact.inline` to trigger inline (synchronous) compaction, while 
`DataSourceWriteOptions` introduces `ASYNC_COMPACT_ENABLE` to enable async 
compaction. I wonder if we should move `ASYNC_COMPACT_ENABLE` to 
`HoodieCompactionConfig`.



