jingz-db commented on code in PR #49488:
URL: https://github.com/apache/spark/pull/49488#discussion_r1950010855


##########
sql/connect/common/src/main/scala/org/apache/spark/sql/connect/KeyValueGroupedDataset.scala:
##########
@@ -526,6 +553,71 @@ private class KeyValueGroupedDatasetImpl[K, V, IK, IV](
     }
   }
 
+  override protected[sql] def transformWithStateHelper[U: Encoder, S: Encoder](
+      statefulProcessor: StatefulProcessor[K, V, U],
+      timeMode: TimeMode,
+      outputMode: OutputMode,
+      initialState: Option[sql.KeyValueGroupedDataset[K, S]] = None,
+      eventTimeColumnName: String = ""): Dataset[U] = {
+    val outputEncoder = agnosticEncoderFor[U]
+    val stateEncoder = agnosticEncoderFor[S]
+
+    val inputEncoders: Seq[AgnosticEncoder[_]] = Seq(kEncoder, stateEncoder, ivEncoder)
+    val dummyGroupingFunc = SparkUserDefinedFunction(
+      function = UdfUtils.noOp[K, U](),
+      inputEncoders = inputEncoders,
+      outputEncoder = outputEncoder)
+    val udf = toExpr(
+      dummyGroupingFunc.apply(
+        inputEncoders.map(_ => col("*")): _*)).getCommonInlineUserDefinedFunction
+
+    val initialStateImpl = if (initialState.isDefined) {
+      assert(initialState.get.isInstanceOf[KeyValueGroupedDatasetImpl[K, S, _, _]])

Review Comment:
   > I'd argue that the ClassCastException provides the user with more 
information than an assert that fails without an explanation.
   
   > That's why I'm asking to see whether this can be triggered by Spark's bug 
or users' bug. If this is former, this is not a huge problem as long as we 
don't lose debuggability on this (I don't expect users to debug on their own). 
If this is latter, this should be definitely classified as error class.
   
   It looks like the ClassCastException provides enough information in this
case, so there is no need for the assert to swallow useful hints. I am removing
the assert here and letting the cast throw ClassCastException. This is a Spark
bug, not a user bug. I am also going to remove the assertion inside
`FlatMapGroupsWithState`, which does a similar check.
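
To illustrate the difference, here is a minimal standalone sketch, not the Spark code itself. `GroupedApi`, `ConnectImpl`, `OtherImpl`, and the `AssertVsCast` helpers are hypothetical stand-ins invented for this example. It assumes assertions are not elided at compile time.

```scala
// Hypothetical stand-ins for the API type and its implementations; not Spark classes.
trait GroupedApi
final class ConnectImpl extends GroupedApi
final class OtherImpl extends GroupedApi

object AssertVsCast {
  // Bare assert: a failure surfaces as AssertionError("assertion failed"),
  // which names neither the expected nor the actual class.
  def viaAssert(ds: GroupedApi): ConnectImpl = {
    assert(ds.isInstanceOf[ConnectImpl])
    ds.asInstanceOf[ConnectImpl]
  }

  // Plain cast: a failure surfaces as ClassCastException, whose message
  // names both the actual and the target class.
  def viaCast(ds: GroupedApi): ConnectImpl = ds.asInstanceOf[ConnectImpl]

  // Runs `body` and reports the exception type and message, if any.
  def messageOf(body: => Any): String =
    try { body; "no error" }
    catch { case t: Throwable => s"${t.getClass.getSimpleName}: ${t.getMessage}" }
}
```

Calling `AssertVsCast.messageOf(AssertVsCast.viaAssert(new OtherImpl))` reports only that an assertion failed, while the `viaCast` variant reports a message that includes the class names involved.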



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org

