chenkovsky commented on code in PR #15603: URL: https://github.com/apache/datafusion/pull/15603#discussion_r2030547375
##########
datafusion/physical-plan/src/stream.rs:
##########
@@ -362,6 +362,8 @@ pin_project! {
         #[pin]
         stream: S,
+
+        transform_schema: bool,

Review Comment:
   I wanted to see how Spark handles this. On the logical plan side, I haven't found any logic that deals with this problem:
   https://github.com/apache/spark/blob/75d80c7795ca71d24229010ab04ae740473126aa/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/plans/logical/basicLogicalOperators.scala#L475

   On the physical plan side, Spark has it much easier: its InternalRow is schemaless, so it just uses the physical plan's schema by default, whereas an Arrow RecordBatch carries its own schema:
   https://github.com/apache/spark/blob/75d80c7795ca71d24229010ab04ae740473126aa/sql/core/src/main/scala/org/apache/spark/sql/execution/basicPhysicalOperators.scala#L688

   I'm not 100% sure, but I think the current logical plan and physical plan schemas are correct. The root cause is that the RecordBatch's schema doesn't match the physical plan's, so adding an adapter is the proper fix.
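
   A minimal sketch of the adapter idea (not the PR's actual code): re-stamp each batch with the plan's schema using arrow's `RecordBatch::with_schema`. The `adapt_batch` helper and the schemas below are illustrative, assuming the plan schema is field-compatible with what the stream produces.

   ```rust
   use std::sync::Arc;

   use arrow::array::Int64Array;
   use arrow::datatypes::{DataType, Field, Schema, SchemaRef};
   use arrow::error::ArrowError;
   use arrow::record_batch::RecordBatch;

   /// Illustrative helper: rebuild `batch` against `plan_schema`, keeping the same columns.
   /// `with_schema` only succeeds when the new schema is compatible with the existing
   /// columns, so mismatches surface as errors here instead of as downstream failures.
   fn adapt_batch(batch: RecordBatch, plan_schema: SchemaRef) -> Result<RecordBatch, ArrowError> {
       batch.with_schema(plan_schema)
   }

   fn main() -> Result<(), ArrowError> {
       // Batch produced by the stream, with its own schema instance.
       let batch_schema = Arc::new(Schema::new(vec![Field::new("a", DataType::Int64, true)]));
       let batch = RecordBatch::try_new(
           batch_schema,
           vec![Arc::new(Int64Array::from(vec![1, 2, 3]))],
       )?;

       // Schema the physical plan reports (same fields here; in practice it may carry
       // extra metadata that the batch's schema lacks).
       let plan_schema: SchemaRef =
           Arc::new(Schema::new(vec![Field::new("a", DataType::Int64, true)]));

       let adapted = adapt_batch(batch, plan_schema.clone())?;
       assert_eq!(adapted.schema(), plan_schema);
       Ok(())
   }
   ```

   A stream-level adapter would just apply the same per-batch transformation to every item of the underlying `RecordBatch` stream, so downstream operators always see metadata that matches the plan.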