HeartSaVioR commented on code in PR #49488: URL: https://github.com/apache/spark/pull/49488#discussion_r1948008795
##########
sql/connect/common/src/main/scala/org/apache/spark/sql/connect/KeyValueGroupedDataset.scala:
##########
@@ -526,6 +553,71 @@ private class KeyValueGroupedDatasetImpl[K, V, IK, IV](
     }
   }
 
+  override protected[sql] def transformWithStateHelper[U: Encoder, S: Encoder](
+      statefulProcessor: StatefulProcessor[K, V, U],
+      timeMode: TimeMode,
+      outputMode: OutputMode,
+      initialState: Option[sql.KeyValueGroupedDataset[K, S]] = None,
+      eventTimeColumnName: String = ""): Dataset[U] = {
+    val outputEncoder = agnosticEncoderFor[U]
+    val stateEncoder = agnosticEncoderFor[S]
+
+    val inputEncoders: Seq[AgnosticEncoder[_]] = Seq(kEncoder, stateEncoder, ivEncoder)
+    val dummyGroupingFunc = SparkUserDefinedFunction(
+      function = UdfUtils.noOp[K, U](),
+      inputEncoders = inputEncoders,
+      outputEncoder = outputEncoder)
+    val udf = toExpr(
+      dummyGroupingFunc.apply(
+        inputEncoders.map(_ => col("*")): _*)).getCommonInlineUserDefinedFunction
+
+    val initialStateImpl = if (initialState.isDefined) {
+      assert(initialState.get.isInstanceOf[KeyValueGroupedDatasetImpl[K, S, _, _]])

Review Comment:
   That's why I'm asking whether this assertion can be triggered by a bug in Spark or by a bug in user code. If it is the former, this is not a huge problem as long as we don't lose debuggability here (I don't expect users to debug this on their own). If it is the latter, this should definitely be classified under an error class.
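
The distinction raised in the review comment (an internal invariant versus a condition a user can trigger) could be sketched as follows. This is a minimal, self-contained illustration and not Spark code: `GroupedData`, `ConnectGrouped`, `ClassicGrouped`, `ClassifiedException`, and the error class name `HYPOTHETICAL_UNSUPPORTED_INITIAL_STATE` are all hypothetical stand-ins chosen for the example, and Spark's actual error-class machinery is not used here.

```scala
// Hypothetical stand-ins for the dataset implementations; not Spark types.
sealed trait GroupedData { def describe: String }
final case class ConnectGrouped(describe: String) extends GroupedData
final case class ClassicGrouped(describe: String) extends GroupedData

// Hypothetical user-facing exception that carries a stable error identifier,
// in the spirit of an "error class". The name and shape are assumptions.
final class ClassifiedException(val errorClass: String, detail: String)
  extends IllegalArgumentException(s"[$errorClass] $detail")

object InitialStateChecks {
  // Variant 1: only a bug inside the library could violate this invariant.
  // An assert is acceptable, but it carries a message so the failure
  // remains debuggable for maintainers.
  def internalInvariant(state: GroupedData): ConnectGrouped = {
    assert(
      state.isInstanceOf[ConnectGrouped],
      s"Expected ConnectGrouped but got ${state.getClass.getName}")
    state.asInstanceOf[ConnectGrouped]
  }

  // Variant 2: a user can reach this path by passing the wrong kind of
  // dataset, so the failure is reported as a classified, user-facing error.
  def userFacing(state: GroupedData): ConnectGrouped = state match {
    case c: ConnectGrouped => c
    case other =>
      throw new ClassifiedException(
        "HYPOTHETICAL_UNSUPPORTED_INITIAL_STATE",
        s"initialState must be a ConnectGrouped, got ${other.getClass.getSimpleName}")
  }
}
```

Which variant applies depends on the answer to the reviewer's question: if user code can supply an `initialState` of the wrong implementation type, the second variant is the closer fit.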