szehon-ho commented on code in PR #50571:
URL: https://github.com/apache/spark/pull/50571#discussion_r2040394255


##########
sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/DataSourceStrategy.scala:
##########
@@ -257,12 +257,14 @@ class FindDataSourceTable(sparkSession: SparkSession) 
extends Rule[LogicalPlan]
       QualifiedTableName(table.identifier.catalog.get, table.database, 
table.identifier.table)
     val catalog = sparkSession.sessionState.catalog
     val dsOptions = DataSourceUtils.generateDatasourceOptions(extraOptions, 
table)
+    val readDataSourceCacheIgnoreOptions: Boolean =
+      SQLConf.get.getConf(SQLConf.READ_DATA_SOURCE_CACHE_IGNORE_OPTIONS)
     catalog.getCachedTable(qualifiedTableName) match {
       case null =>
         val dataSource =
           DataSource(
             sparkSession,
-            // In older version(prior to 2.1) of Spark, the table schema can 
be empty and should be
+            // In older version(prior to 2.1) of Spark,the table schema can be 
empty and should be

Review Comment:
   The space can be put back between the words?  To me, its harder to read 
without it?



##########
sql/catalyst/src/main/scala/org/apache/spark/sql/internal/SQLConf.scala:
##########
@@ -5260,6 +5260,15 @@ object SQLConf {
     .booleanConf
     .createWithDefault(false)
 
+  val READ_DATA_SOURCE_CACHE_IGNORE_OPTIONS =
+    buildConf("spark.sql.legacy.readDataSourceCacheIgnoreOptions")
+      .internal()
+      .doc("When set to true, reading from file source table caches the first 
query plan and " +
+        "ignores subsequent changes in query options, e.g. `delimiters` option 
for csv tables.")

Review Comment:
   optional: not sure if the example adds much value.  We ignore any subsequent 
change to query option right?  I guess example can be more useful where we 
document the options itself.



##########
sql/core/src/test/scala/org/apache/spark/sql/execution/command/DDLSuite.scala:
##########
@@ -1405,6 +1405,31 @@ abstract class DDLSuite extends QueryTest with 
DDLSuiteBase {
     }
   }
 
+  test("SPARK-51747: When legacy conf enabled data source read ignores option 
change") {

Review Comment:
   to me , it would be clearer to combine the test:
   ```
    Seq("true", "false").foreach { ignoreCache =>
       withSQLConf(SQLConf.READ_DATA_SOURCE_CACHE_IGNORE_OPTIONS.key -> 
ignoreCache) {
   
          if (ignoreCache)
              checkAnswer(...)
          else
              checkAnswer(...)
       }
   ```
   Then the test can just show the two versions of the output.
   
   I see its in many other suite and think its a nice way to reduce the code 
and side-by-side see the config behavior change.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org

Reply via email to