nsivabalan commented on a change in pull request #4420:
URL: https://github.com/apache/hudi/pull/4420#discussion_r784786957



##########
File path: 
hudi-client/hudi-client-common/src/main/java/org/apache/hudi/config/HoodieClusteringConfig.java
##########
@@ -143,6 +143,12 @@
       .withDocumentation("Determines how to handle updates, deletes to file 
groups that are under clustering."
           + " Default strategy just rejects the update");
 
+  public static final ConfigProperty<String> ASYNC_CLUSTERING_SCHEDULE = 
ConfigProperty
+      .key("hoodie.clustering.schedule.async")
+      .defaultValue("false")
+      .withDocumentation("When set to true, clustering service will be 
attempted for scheduling after each write. Users have to ensure "
+          + "they have a separate job to run async clustering(execution) for 
the one scheduled by this writer");

Review comment:
       I have fixed the config to `hoodie.clustering.schedule.inline` and var 
name to SCHEDULE_INLINE_CLUSTERING since scheduling is done INLINE with this 
config and only execution is async. 

##########
File path: 
hudi-client/hudi-client-common/src/main/java/org/apache/hudi/config/HoodieCompactionConfig.java
##########
@@ -90,6 +90,12 @@
       .withDocumentation("When set to true, compaction service is triggered 
after each write. While being "
           + " simpler operationally, this adds extra latency on the write 
path.");
 
+  public static final ConfigProperty<String> SCHEDULE_ASYNC_COMPACT = 
ConfigProperty
+      .key("hoodie.compact.schedule.async")
+      .defaultValue("false")

Review comment:
       have responded else where. prefer to keep it as is to be in sync w/ 
existing inline compaction config.

##########
File path: 
hudi-client/hudi-client-common/src/main/java/org/apache/hudi/config/HoodieClusteringConfig.java
##########
@@ -177,6 +178,16 @@
       .withDocumentation("Determines how to handle updates, deletes to file 
groups that are under clustering."
           + " Default strategy just rejects the update");
 
+  public static final ConfigProperty<String> SCHEDULE_INLINE_CLUSTERING = 
ConfigProperty
+      .key("hoodie.clustering.schedule.inline")

Review comment:
       responded else where. I wanted to keep it in sync w/ existing inline 
clustering config. 
   "hoodie.clustering.inline", and the variable is named as INLINE_CLUSTERING. 
Guess it goes well w/ how the getter method is named (inlineClusteringEnabled). 
So, prefer to leave it as is. 

##########
File path: 
hudi-client/hudi-client-common/src/main/java/org/apache/hudi/config/HoodieCompactionConfig.java
##########
@@ -90,6 +90,12 @@
       .withDocumentation("When set to true, compaction service is triggered 
after each write. While being "
           + " simpler operationally, this adds extra latency on the write 
path.");
 
+  public static final ConfigProperty<String> SCHEDULE_ASYNC_COMPACT = 
ConfigProperty
+      .key("hoodie.compact.schedule.async")
+      .defaultValue("false")

Review comment:
       same here. have fixed the naming.

##########
File path: 
hudi-client/hudi-client-common/src/main/java/org/apache/hudi/client/BaseHoodieWriteClient.java
##########
@@ -1116,13 +1135,16 @@ protected boolean scheduleCleaningAtInstant(String 
instantTime, Option<Map<Strin
 
   /**
    * Executes a clustering plan on a table, serially before or after an 
insert/upsert action.
+   * Schedules clustering inline and may be optionally execute.
    */
-  protected Option<String> inlineCluster(Option<Map<String, String>> 
extraMetadata) {
+  protected Option<String> 
inlineScheduleClusterAndOptionallyExecute(Option<Map<String, String>> 
extraMetadata, boolean executeInline) {
     Option<String> clusteringInstantOpt = scheduleClustering(extraMetadata);
-    clusteringInstantOpt.ifPresent(clusteringInstant -> {
-      // inline cluster should auto commit as the user is never given control
-      cluster(clusteringInstant, true);
-    });
+    if (executeInline) {
+      clusteringInstantOpt.ifPresent(clusteringInstant -> {

Review comment:
       same response as above. 

##########
File path: 
hudi-client/hudi-client-common/src/main/java/org/apache/hudi/client/BaseHoodieWriteClient.java
##########
@@ -464,27 +464,43 @@ protected void postCommit(HoodieTable<T, I, K, O> table, 
HoodieCommitMetadata me
   }
 
   protected void runTableServicesInline(HoodieTable<T, I, K, O> table, 
HoodieCommitMetadata metadata, Option<Map<String, String>> extraMetadata) {
-    if (config.areAnyTableServicesInline()) {
+    if (config.areAnyTableServicesInline() || 
config.scheduleInlineTableServices()) {
       if (config.isMetadataTableEnabled()) {

Review comment:
       have renamed the methods to be explicit.
   areAnyTableServicesInline -> areAnyTableServicesExecutedInline
   scheduleInlineTableServices -> areAnyTableServicesScheduledInline

##########
File path: 
hudi-client/hudi-client-common/src/main/java/org/apache/hudi/client/BaseHoodieWriteClient.java
##########
@@ -464,27 +464,43 @@ protected void postCommit(HoodieTable<T, I, K, O> table, 
HoodieCommitMetadata me
   }
 
   protected void runTableServicesInline(HoodieTable<T, I, K, O> table, 
HoodieCommitMetadata metadata, Option<Map<String, String>> extraMetadata) {
-    if (config.areAnyTableServicesInline()) {
+    if (config.areAnyTableServicesInline() || 
config.scheduleInlineTableServices()) {
       if (config.isMetadataTableEnabled()) {
         table.getHoodieView().sync();
       }
       // Do an inline compaction if enabled
       if (config.inlineCompactionEnabled()) {
         runAnyPendingCompactions(table);
         metadata.addMetadata(HoodieCompactionConfig.INLINE_COMPACT.key(), 
"true");
-        inlineCompact(extraMetadata);
+        inlineScheduleCompactAndOptionallyExecute(extraMetadata, 
!config.scheduleInlineCompaction());

Review comment:
       just scheduling is only one line. 
   ```
   Option<String> compactionInstantTimeOpt = scheduleCompaction(extraMetadata);
   ```
   thats the reason did not find a need to create a new method just for 
scheduling inline. Infact I did had it that way, on my first attempt. but did 
not feel the need and later removed it. 
   
   With new naming, I guess its clear that execution is optional. So, I feel we 
should be ok here. 

##########
File path: 
hudi-client/hudi-client-common/src/main/java/org/apache/hudi/client/BaseHoodieWriteClient.java
##########
@@ -1005,13 +1021,16 @@ protected void rollbackFailedWrites(Map<String, 
Option<HoodiePendingRollbackInfo
 
   /**
    * Performs a compaction operation on a table, serially before or after an 
insert/upsert action.
+   * Scheduling is done inline, and optionally execution as well.
    */
-  protected Option<String> inlineCompact(Option<Map<String, String>> 
extraMetadata) {
+  protected Option<String> 
inlineScheduleCompactAndOptionallyExecute(Option<Map<String, String>> 
extraMetadata, boolean executeInline) {
     Option<String> compactionInstantTimeOpt = 
scheduleCompaction(extraMetadata);

Review comment:
       have responded else where. scheduling is just oneline. does not warrant 
a new method. after renaming the method, I feel we are ok. 

##########
File path: 
hudi-client/hudi-client-common/src/main/java/org/apache/hudi/client/BaseHoodieWriteClient.java
##########
@@ -464,27 +464,43 @@ protected void postCommit(HoodieTable<T, I, K, O> table, 
HoodieCommitMetadata me
   }
 
   protected void runTableServicesInline(HoodieTable<T, I, K, O> table, 
HoodieCommitMetadata metadata, Option<Map<String, String>> extraMetadata) {
-    if (config.areAnyTableServicesInline()) {
+    if (config.areAnyTableServicesInline() || 
config.scheduleInlineTableServices()) {
       if (config.isMetadataTableEnabled()) {
         table.getHoodieView().sync();
       }
       // Do an inline compaction if enabled
       if (config.inlineCompactionEnabled()) {
         runAnyPendingCompactions(table);
         metadata.addMetadata(HoodieCompactionConfig.INLINE_COMPACT.key(), 
"true");
-        inlineCompact(extraMetadata);
+        inlineScheduleCompactAndOptionallyExecute(extraMetadata, 
!config.scheduleInlineCompaction());
       } else {
         metadata.addMetadata(HoodieCompactionConfig.INLINE_COMPACT.key(), 
"false");
       }
 
+      // if just inline schedule is enabled
+      if (!config.inlineCompactionEnabled() && 
config.scheduleInlineCompaction()
+          && 
!table.getActiveTimeline().getWriteTimeline().filterPendingCompactionTimeline().getInstants().findAny().isPresent())
 {
+        // proceed only if there are no pending compactions

Review comment:
       1. Usually we try to have methods in HoodieActivetimeline that are used 
by many callers so that we can avoid duplication or errors. For one off cases 
like this, we keep it away from HoodieActivetimeline. 
   2. somehow I feel we are over populating HoodieWriteConfig. This is a very 
specific use-case and probably we can leave it here. condition here is very 
specific in the sense that, we need to check if inline is not enabled, but 
schedule inline is enabled. and I don't see this being used anywhere else. 




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscr...@hudi.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Reply via email to