BOOTMGR commented on code in PR #49678: URL: https://github.com/apache/spark/pull/49678#discussion_r1966485874
##########
sql/core/src/test/scala/org/apache/spark/sql/DataFrameSuite.scala:
##########
@@ -2721,6 +2721,25 @@ class DataFrameSuite extends QueryTest
       parameters = Map("name" -> ".whatever")
     )
   }
+
+  test("SPARK-50994: RDD conversion is performed with execution context") {
+    withSQLConf(SQLConf.CASE_SENSITIVE.key -> "true") {

Review Comment:
   @cloud-fan I took a close look at https://github.com/apache/spark/pull/48325 and I see that it takes a stab at a bigger problem: `SQLConf` values are not propagated when the RDD is actually executed (when the iterator is consumed), because that is triggered on demand by the user. This PR only ensures that the RDD gets the correct `SQLConf` when it is computed, not during iterator traversal.

   I followed the conversation there, and I agree with you that all `SQLConf` accesses should be done during RDD computation (by storing the configs locally), not when the iterator is called. I also agree with @bersprockets's view that fixing it everywhere would be troublesome and that there is no guarantee for future additions. I believe that change needs broader consideration, such as how we see interoperability between Dataset and RDD. I am ready to volunteer there.

   However, I feel this change should ship independently because:
   1. We need the correct configs to be set when RDD computation happens. This is needed regardless of https://github.com/apache/spark/pull/48325. That PR can follow later.
   2. We need tracking on the Spark UI for stages submitted during RDD computation. For example, Snowflake's official Spark connector internally converts a DataFrame to an RDD to serialise it into CSV format. Because of this, none of the dependent stages are shown on the Spark UI.

   Let me know what you think.

--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the URL above to go to the specific comment.
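To illustrate the distinction drawn in the comment between reading a config when the RDD is computed and reading it during iterator traversal, here is a minimal sketch in plain Scala. It does not use Spark: `FakeConf`, `withConf`, `lazyRead`, and `capturedRead` are hypothetical names invented for this example, and the thread-local map is only an assumed stand-in for a session-scoped conf such as `SQLConf`.

```scala
// Hypothetical stand-in for a scoped, thread-local configuration.
// This is NOT Spark's SQLConf; it only models "a conf visible while a block runs".
object FakeConf {
  private val current = new ThreadLocal[Map[String, String]] {
    override def initialValue(): Map[String, String] =
      Map("caseSensitive" -> "false")
  }

  def get(key: String): String = current.get()(key)

  // Run `body` with `key` overridden, then restore the previous settings.
  def withConf[T](key: String, value: String)(body: => T): T = {
    val saved = current.get()
    current.set(saved + (key -> value))
    try body finally current.set(saved)
  }
}

object ConfCaptureSketch {
  // Looks the conf up per element, i.e. when the iterator is traversed.
  def lazyRead(rows: Seq[String]): Iterator[String] =
    rows.iterator.map(r => s"$r:${FakeConf.get("caseSensitive")}")

  // Looks the conf up once, when the iterator is built, and closes over
  // the local copy ("storing configs locally").
  def capturedRead(rows: Seq[String]): Iterator[String] = {
    val caseSensitive = FakeConf.get("caseSensitive")
    rows.iterator.map(r => s"$r:$caseSensitive")
  }

  def main(args: Array[String]): Unit = {
    // Both iterators are created inside the conf scope...
    val (lazyIt, capturedIt) = FakeConf.withConf("caseSensitive", "true") {
      (lazyRead(Seq("a")), capturedRead(Seq("a")))
    }
    // ...but traversed after the scope has ended.
    println(lazyIt.toList)     // List(a:false)
    println(capturedIt.toList) // List(a:true)
  }
}
```

In this model the lazily read value reverts to the default once the scope ends, while the captured value keeps the setting that was active when the iterator was built. That is the behaviour the comment argues for when it says conf accesses should happen during RDD computation rather than when the iterator is called.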
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For queries about this service, please contact Infrastructure at: us...@infra.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org