vladimirg-db opened a new pull request, #49835:
URL: https://github.com/apache/spark/pull/49835

   ### What changes were proposed in this pull request?
   
   Fix a correctness issue with UNION/EXCEPT/INTERSECT inside a view.
   
   In the following examples the SQL parser treats the UNION/EXCEPT/INTERSECT keyword as an alias and drops the rest of the query:
   
   ```
   spark.sql("CREATE OR REPLACE VIEW v1 AS SELECT 1 AS col1 UNION SELECT 2 UNION SELECT 3 UNION SELECT 4")
   spark.sql("SELECT * FROM v1").show()
   spark.sql("SELECT * FROM v1").queryExecution.analyzed
   
   spark.sql("CREATE OR REPLACE VIEW v1 AS SELECT 1 AS col1 EXCEPT SELECT 2 EXCEPT SELECT 1 EXCEPT SELECT 2")
   spark.sql("SELECT * FROM v1").show()
   spark.sql("SELECT * FROM v1").queryExecution.analyzed
   
   spark.sql("CREATE OR REPLACE VIEW v1 AS SELECT 1 AS col1 INTERSECT SELECT 1 INTERSECT SELECT 2 INTERSECT SELECT 2")
   spark.sql("SELECT * FROM v1").show()
   spark.sql("SELECT * FROM v1").queryExecution.analyzed
   ```
   
   
![image](https://github.com/user-attachments/assets/ef726178-2375-4ebc-a7e3-88f1991d1016)
   
![image](https://github.com/user-attachments/assets/50b4b7ba-bc7d-4fc1-a921-f4cbfcab79a3)
   
![image](https://github.com/user-attachments/assets/85b65325-5dd9-4d74-b46d-8ea203ce1039)
   
   Regular queries (without a view) are not affected. This appears to be because we use `ParserInterface.parsePlan` (the `singleStatement` rule in the Spark SQL grammar) for [regular queries](https://github.com/apache/spark/blob/b968ce1d3ac1b72019b30bf3d4e11d9574ba1205/sql/core/src/main/scala/org/apache/spark/sql/classic/SparkSession.scala#L490) and `ParserInterface.parseQuery` (the `query` rule in the Spark SQL grammar) for [view bodies](https://github.com/apache/spark/blob/b968ce1d3ac1b72019b30bf3d4e11d9574ba1205/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/catalog/SessionCatalog.scala#L986). The difference is that `singleStatement` [ends in EOF](https://github.com/apache/spark/blob/b968ce1d3ac1b72019b30bf3d4e11d9574ba1205/sql/api/src/main/antlr4/org/apache/spark/sql/catalyst/parser/SqlBaseParser.g4#L144), so the parser has to consume the whole input, whereas `query` has no such requirement and parsing can stop right after the misread alias.
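
The two entry points named above can be compared directly, without creating a view. The following is a minimal sketch, not code from this patch: it assumes `spark-catalyst` is on the classpath and calls the parser through the `CatalystSqlParser` object, and `ParseEntryPoints` and `plans` are hypothetical names used only for illustration. On a build that has the issue, the second tree would be expected to lack the set operations; no output is shown here because it depends on the Spark version.

```scala
import org.apache.spark.sql.catalyst.parser.CatalystSqlParser
import org.apache.spark.sql.catalyst.plans.logical.LogicalPlan

// Hypothetical helper: parses the same text through both entry points.
object ParseEntryPoints {
  def plans(sql: String): (LogicalPlan, LogicalPlan) = {
    // `singleStatement` rule: used for regular queries, ends in EOF.
    val viaStatement = CatalystSqlParser.parsePlan(sql)
    // `query` rule: used for view bodies.
    val viaQuery = CatalystSqlParser.parseQuery(sql)
    (viaStatement, viaQuery)
  }

  def main(args: Array[String]): Unit = {
    val sql = "SELECT 1 AS col1 UNION SELECT 2 UNION SELECT 3 UNION SELECT 4"
    val (viaStatement, viaQuery) = plans(sql)
    // Print both unresolved plans for a side-by-side visual comparison.
    println("parsePlan:\n" + viaStatement.treeString)
    println("parseQuery:\n" + viaQuery.treeString)
  }
}
```

The trees are printed rather than compared with `==`, because each parse assigns fresh expression IDs to aliases and the two plans would differ on that alone.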
   
   ### Why are the changes needed?
   
   Correctness issue fix.
   
   ### Does this PR introduce _any_ user-facing change?
   
   Yes, queries on top of the aforementioned views will now return correct results.
   
   ### How was this patch tested?
   
   New `view-correctness` suite.
   
   ### Was this patch authored or co-authored using generative AI tooling?
   
   No.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org

