Pajaraja opened a new pull request, #50546:
URL: https://github.com/apache/spark/pull/50546

   ### What changes were proposed in this pull request?
   
   Allow a CTE to reference the recursive CTE (rCTE) that contains it. The
   change is made in the CTESubstitution file and consists of two main parts:
   - When traverseAndSubstituteCTE is called from resolveCTERelations to
   resolve the CTEs that a recursive CTE references, we remember this ancestor
   rCTE in case any of the child CTEs want to reference it. If we encounter
   another rCTE inside the rCTE (which is only allowed in the anchor), it
   becomes the new ancestor rCTE.
   - The first part is enough to resolve these CTEs, but a new problem arises
   when identifying whether a CTE is recursive: if CTE0 is recursive and CTE1
   is a CTE inside CTE0 that references CTE0, the self-reference is visible
   only inside CTE1, so the only way to tell that CTE0 is recursive is to look
   inside CTE1. For this reason we inline all non-recursive CTEs inside a
   recursive CTE so that CTE0 can see its self-reference.
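
The second point can be illustrated with a toy model. This is only a sketch of the idea, not the code in CTESubstitution: the `Cte` class, `has_direct_self_reference`, and `references_after_inlining` are hypothetical names, and a CTE is reduced to the set of relation names its query reads plus the CTEs defined in its nested WITH clauses.

```python
from dataclasses import dataclass, field


@dataclass
class Cte:
    """Toy CTE (hypothetical model, not a Spark class)."""
    name: str
    references: set  # relation names read directly by this CTE's query
    inner: list = field(default_factory=list)  # CTEs from nested WITH clauses


def has_direct_self_reference(cte):
    # Looks only at the CTE's own query, not inside nested CTEs.
    return cte.name in cte.references


def references_after_inlining(cte):
    # Replace every reference to an inner CTE with whatever that inner CTE
    # reads, so references hidden inside nested CTEs become visible.
    inner_by_name = {c.name: c for c in cte.inner}
    resolved = set()
    for ref in cte.references:
        if ref in inner_by_name:
            resolved |= references_after_inlining(inner_by_name[ref])
        else:
            resolved.add(ref)
    return resolved


# CTE1 is defined inside CTE0 and reads CTE0; CTE0's own query reads only CTE1.
cte1 = Cte("CTE1", {"CTE0"})
cte0 = Cte("CTE0", {"CTE1"}, inner=[cte1])

seen_before_inlining = has_direct_self_reference(cte0)
seen_after_inlining = cte0.name in references_after_inlining(cte0)
```

In this model the self-reference of CTE0 is not visible before inlining (`seen_before_inlining` is `False`) and becomes visible afterwards (`seen_after_inlining` is `True`). The sketch ignores details such as inner CTEs that reference each other.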
   
   ### Why are the changes needed?
   
   To make queries work in which a CTE nested inside a recursive CTE
   references that recursive CTE. An example of such a query is:
   ```
   WITH RECURSIVE t1 AS (
     SELECT 1 AS n
     UNION ALL
     WITH t2 AS (SELECT n + 1 FROM t1 WHERE n < 5)
     SELECT * FROM t2
   ) SELECT * FROM t1;
   ```
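
For reference, the result one would expect from the query above can be worked out with a small simulation of the usual iterative evaluation of a recursive CTE. This is a sketch under that assumption and does not use Spark; `evaluate_recursive_cte` and `t2_step` are hypothetical helper names.

```python
def evaluate_recursive_cte(anchor_rows, recursive_step, max_iterations=100):
    """Evaluate anchor UNION ALL recursive_step until no new rows appear."""
    result = list(anchor_rows)
    frontier = list(anchor_rows)  # rows produced by the previous iteration
    for _ in range(max_iterations):
        frontier = recursive_step(frontier)
        if not frontier:
            break
        result.extend(frontier)
    return result


def t2_step(t1_rows):
    # Mirrors: WITH t2 AS (SELECT n + 1 FROM t1 WHERE n < 5) SELECT * FROM t2
    return [n + 1 for n in t1_rows if n < 5]


rows = evaluate_recursive_cte([1], t2_step)
print(rows)  # [1, 2, 3, 4, 5]
```

The anchor contributes the row 1, and each iteration feeds the previous iteration's rows through t2 until the `n < 5` filter leaves nothing to add.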
   ### Does this PR introduce _any_ user-facing change?
   
   No.
   
   ### How was this patch tested?
   
   Using existing CTE queries of this kind that did not work before this change.
   
   ### Was this patch authored or co-authored using generative AI tooling?
   No.



