zhangyue19921010 commented on code in PR #13409:
URL: https://github.com/apache/hudi/pull/13409#discussion_r2289958825


##########
hudi-flink-datasource/hudi-flink/src/main/java/org/apache/hudi/configuration/FlinkOptions.java:
##########
@@ -622,6 +622,27 @@ private FlinkOptions() {
      .withDescription("Maximum memory in MB for a write task, when the threshold hits,\n"
          + "it flushes the max size data bucket to avoid OOM, default 1GB");
 
+  @AdvancedConfig
+  public static final ConfigOption<Boolean> WRITE_BUFFER_SORT_ENABLED = ConfigOptions
+      .key("write.partial.sort.enabled")
+      .booleanType()
+      .defaultValue(false) // default no sort
+      .withDescription("Whether to enable partial buffer sort within append write function.");

Review Comment:
   Please add more details about the behavior of partial sort, e.g., records are sorted within a buffer, but the ordering of the entire Parquet file is not guaranteed.
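
   A minimal, self-contained sketch (illustrative only, not Hudi code) of the semantics the description should spell out: records are sorted inside each fixed-size buffer before flushing, so the concatenated output is not globally sorted.

   ```java
   import java.util.ArrayList;
   import java.util.Collections;
   import java.util.List;

   // Hypothetical illustration of "partial buffer sort": sort applies only
   // within each buffer, and flushed buffers are appended in arrival order,
   // so the whole output (e.g., the Parquet file) is NOT globally sorted.
   public class PartialSortSketch {
     public static List<Integer> partialSort(List<Integer> records, int bufferSize) {
       List<Integer> out = new ArrayList<>();
       for (int i = 0; i < records.size(); i += bufferSize) {
         List<Integer> buffer =
             new ArrayList<>(records.subList(i, Math.min(i + bufferSize, records.size())));
         Collections.sort(buffer); // order is guaranteed only within this buffer
         out.addAll(buffer);
       }
       return out;
     }

     public static void main(String[] args) {
       // With bufferSize = 3: [5, 1, 4 | 2, 9, 3] -> [1, 4, 5, 2, 3, 9]
       System.out.println(partialSort(List.of(5, 1, 4, 2, 9, 3), 3));
     }
   }
   ```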



##########
hudi-flink-datasource/hudi-flink/src/main/java/org/apache/hudi/configuration/FlinkOptions.java:
##########
@@ -622,6 +622,27 @@ private FlinkOptions() {
      .withDescription("Maximum memory in MB for a write task, when the threshold hits,\n"
          + "it flushes the max size data bucket to avoid OOM, default 1GB");
 
+  @AdvancedConfig
+  public static final ConfigOption<Boolean> WRITE_BUFFER_SORT_ENABLED = ConfigOptions

Review Comment:
   BUFFER_SORT? We need to unify the naming: use either "buffer sort" or "partial sort" consistently.



##########
hudi-flink-datasource/hudi-flink/src/main/java/org/apache/hudi/configuration/FlinkOptions.java:
##########
@@ -622,6 +622,27 @@ private FlinkOptions() {
      .withDescription("Maximum memory in MB for a write task, when the threshold hits,\n"
          + "it flushes the max size data bucket to avoid OOM, default 1GB");
 
+  @AdvancedConfig
+  public static final ConfigOption<Boolean> WRITE_BUFFER_SORT_ENABLED = ConfigOptions
+      .key("write.partial.sort.enabled")
+      .booleanType()
+      .defaultValue(false) // default no sort
+      .withDescription("Whether to enable partial buffer sort within append write function.");
+
+  @AdvancedConfig
+  public static final ConfigOption<String> WRITE_BUFFER_SORT_KEYS = ConfigOptions
+      .key("write.partial.sort.keys")
+      .stringType()
+      .noDefaultValue() // default no sort key
+      .withDescription("Sort keys concatenated by comma for partial buffer sort in append write function.");
+
+  @AdvancedConfig
+  public static final ConfigOption<Long> WRITE_BUFFER_SIZE = ConfigOptions
+      .key("write.partial.buffer.size")
+      .longType()
+      .defaultValue(1000L) // 1000 records
+      .withDescription("Buffer size of each partition key for partial sort in append write function.");

Review Comment:
   Same as mentioned above. We need to take care of:
   1. the config name
   2. the variable name
   3. the docs description
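
   One way to address all three points at once is to use "buffer sort" uniformly in the key, the variable name, and the description. A hedged sketch (the key names below are illustrative suggestions, not the PR's final choice):

   ```java
   @AdvancedConfig
   public static final ConfigOption<Boolean> WRITE_BUFFER_SORT_ENABLED = ConfigOptions
       .key("write.buffer.sort.enabled")   // key now matches the variable name
       .booleanType()
       .defaultValue(false)
       .withDescription("Whether to enable buffer sort in the append write function. "
           + "Records are sorted within each in-memory buffer before flushing; "
           + "the ordering of the whole Parquet file is not guaranteed.");

   @AdvancedConfig
   public static final ConfigOption<String> WRITE_BUFFER_SORT_KEYS = ConfigOptions
       .key("write.buffer.sort.keys")
       .stringType()
       .noDefaultValue()
       .withDescription("Sort keys, concatenated by commas, for buffer sort in the append write function.");

   @AdvancedConfig
   public static final ConfigOption<Long> WRITE_BUFFER_SORT_SIZE = ConfigOptions
       .key("write.buffer.sort.size")
       .longType()
       .defaultValue(1000L)
       .withDescription("Number of records buffered per partition key for buffer sort in the append write function.");
   ```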



##########
hudi-flink-datasource/hudi-flink/src/main/java/org/apache/hudi/configuration/FlinkOptions.java:
##########
@@ -622,6 +622,27 @@ private FlinkOptions() {
      .withDescription("Maximum memory in MB for a write task, when the threshold hits,\n"
          + "it flushes the max size data bucket to avoid OOM, default 1GB");
 
+  @AdvancedConfig
+  public static final ConfigOption<Boolean> WRITE_BUFFER_SORT_ENABLED = ConfigOptions
+      .key("write.partial.sort.enabled")
+      .booleanType()
+      .defaultValue(false) // default no sort
+      .withDescription("Whether to enable partial buffer sort within append write function.");
+
+  @AdvancedConfig
+  public static final ConfigOption<String> WRITE_BUFFER_SORT_KEYS = ConfigOptions

Review Comment:
   Like mentioned above: the config key says "partial sort" but the variable name says "buffer sort"; we need to unify them. By the way, either "partial sort" or "buffer sort" is fine with me :)



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
