jingz-db commented on code in PR #49560:
URL: https://github.com/apache/spark/pull/49560#discussion_r1963043578


##########
sql/connect/common/src/main/protobuf/spark/connect/relations.proto:
##########
@@ -1031,6 +1031,26 @@ message GroupMap {
 
   // (Optional) The schema for the grouped state.
   optional DataType state_schema = 10;
+
+  // Below fields are used by TransformWithState and TransformWithStateInPandas
+  // (Optional) TransformWithState related parameters.
+  optional TransformWithStateInfo transform_with_state_info = 11;
+}
+
+// Additional input parameters used for TransformWithState operator.
+message TransformWithStateInfo {

Review Comment:
   In the `GroupMap` proto schema here, TransformWithState and 
FlatMapGroupsWithState share the following 6 fields:
   ```
   // (Required) Input relation for Group Map API: apply, applyInPandas.
     Relation input = 1;
   
     // (Required) Expressions for grouping keys.
     repeated Expression grouping_expressions = 2;
   
     // (Required) Input user-defined function.
     CommonInlineUserDefinedFunction func = 3;
   
     // (Optional) Expressions for sorting. Only used by Scala Sorted Group Map API.
     repeated Expression sorting_expressions = 4;
   
     // Below fields are only used by (Flat)MapGroupsWithState
     // (Optional) Input relation for initial State.
     Relation initial_input = 5;
   
     // (Optional) Expressions for grouping keys of the initial state input relation.
     repeated Expression initial_grouping_expressions = 6;
   ```
   TransformWithState also has three additional fields, defined in 
`TransformWithStateInfo`, that are not used by FlatMapGroupsWithState (FMGWS). 
FMGWS in turn has the following 3 fields that are not used by TransformWithState:
   ```
   // (Optional) True if MapGroupsWithState, false if FlatMapGroupsWithState.
     optional bool is_map_groups_with_state = 7;
   
   // (Optional) Timeout configuration for groups that do not receive data for a while.
     optional string timeout_conf = 9;
   
     // (Optional) The schema for the grouped state.
     optional DataType state_schema = 10;
   ```
   
   So keeping TransformWithState inside `GroupMap` is still slightly more 
beneficial: the two operators share more fields than either leaves unused.
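
To make the tally concrete, here is a rough sketch in Python (not part of the PR). The field names are copied from the two snippets quoted above. The names of the three `TransformWithStateInfo` fields are not quoted in this comment, so only their count is modeled, and `summarize` is a hypothetical helper used purely for illustration.

```python
# Field names copied from the GroupMap snippets quoted in this comment.
SHARED = [
    "input",
    "grouping_expressions",
    "func",
    "sorting_expressions",
    "initial_input",
    "initial_grouping_expressions",
]

# Fields quoted above as used by FMGWS but not by TransformWithState.
FMGWS_ONLY = ["is_map_groups_with_state", "timeout_conf", "state_schema"]

# The comment says TransformWithStateInfo carries three fields that FMGWS
# does not use. Their names are not quoted here, so only the count is modeled.
TWS_ONLY_COUNT = 3


def summarize():
    """Return the field counts behind the keep-it-in-GroupMap argument."""
    return {
        "shared": len(SHARED),
        "fmgws_only": len(FMGWS_ONLY),
        "tws_only": TWS_ONLY_COUNT,
    }


if __name__ == "__main__":
    counts = summarize()
    print(counts)
    # The shared set is larger than either operator-specific set.
    print(counts["shared"] > max(counts["fmgws_only"], counts["tws_only"]))
```

Under these counts, splitting TransformWithState into its own message would duplicate six field definitions in order to avoid three unused ones on each side.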



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org

