AnishMahto commented on code in PR #50875: URL: https://github.com/apache/spark/pull/50875#discussion_r2096465030
########## sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/Analyzer.scala: ########## @@ -2105,6 +2105,12 @@ class Analyzer(override val catalogManager: CatalogManager) extends RuleExecutor catalog, "table-valued functions") } } + if (u.isStreaming && !resolvedFunc.isStreaming) { Review Comment: These are great questions. First of all, the intention of the STREAM keyword is to convert a batch relation into a streaming relation, with the primary use case being that it can be used to populate downstream streaming tables in pipelines using a batch source. If a relation is _already_ streaming, the STREAM keyword should just be no-op. Second, I _think_ the TVF is responsible for defining whether it's streaming or not on its own. The `TableFunctionRegistry` trait at least is a registry of function names -> `LogicalPlan`, and the `LogicalPlan` backing the TVF can define whether its streaming or not (ex. `Range` is [not streaming](https://github.com/apache/spark/blob/6ad0bd448b6e9d65cdc3bea9145e8c0bea8a4454/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/plans/logical/basicLogicalOperators.scala#L998)). And it looks like the resolution of a TVF is simply replacing it with the corresponding LogicalPlan mapped to that TVF in the registry, which then gets analyzed. With my changes as-is, for TVFs I'm not converting the LogicalPlan (that backs the TVF, as per the function registry) to be streaming - I am just validating that when STREAM is used on a TVF, the resolved LogicalPlan for the TVF is indeed streaming - but I don't even know if any default/builtin functions in the registry are defined by a streaming logical plan anyway. When I think about it, I don't actually think the pipelines module will need the ability to convert TVFs to streaming in the first place. In theory users can always define a table that is populated by a batch TVF, and then STREAM from that table. Hence I decided to just remove `STREAM <tvf>` support for now, we can revisit adding it back later if a concrete use-case comes up. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org --------------------------------------------------------------------- To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org