featzhang opened a new pull request, #27750:
URL: https://github.com/apache/flink/pull/27750

   Implemented common sub-expression elimination (CSE) for Flink SQL code 
generation. Key changes:
   
   **New files:**
   - `CseUtils.scala` - Utility to scan RexNode trees and identify duplicate 
sub-expressions
   - `CseITCase.scala` - Integration tests for CSE (correctness, call count, 
null handling, nested)
   - `CseTestFunctions.java` - Test UDFs with AtomicInteger call counters
   - `CseUtilsTest.scala` - Unit tests for the CSE analyzer
   
   **Modified files:**
   - `ExprCodeGenerator.scala` - Added `cseEnabled`, `cseExprCache`, 
`cseCandidates` fields and CSE logic in `generateExpression()`. Also changed 
operand visiting in `visitCall` to go through `generateExpression` (not 
`accept(this)`) so nested sub-expressions are also cached.
   - `CalcCodeGenerator.scala` - Added CSE analysis before projection code 
generation using `CseUtils.findDuplicateSubExpressions()`
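
   As a rough illustration of the analysis step, the sketch below counts call digests across a list of projections and reports those seen more than once. It is a toy model, not the code in this PR: `CseDigestSketch`, `Expr`, and `findDuplicates` are hypothetical names, `Expr` stands in for Calcite's `RexNode`, and the actual signature of `CseUtils.findDuplicateSubExpressions()` is not shown in this description.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Toy stand-in for a RexNode tree; NOT the Calcite or Flink API.
final class CseDigestSketch {

    /** A node without operands models an input ref or literal; otherwise a call. */
    static final class Expr {
        final String op;
        final List<Expr> operands;

        Expr(String op, Expr... operands) {
            this.op = op;
            this.operands = List.of(operands);
        }

        boolean isCall() {
            return !operands.isEmpty();
        }

        /** Canonical string form, playing the role of the RexNode digest. */
        String digest() {
            if (!isCall()) {
                return op;
            }
            List<String> parts = new ArrayList<>();
            for (Expr e : operands) {
                parts.add(e.digest());
            }
            return op + "(" + String.join(", ", parts) + ")";
        }
    }

    /** Returns digests of call nodes that occur more than once across all projections. */
    static Set<String> findDuplicates(List<Expr> projections) {
        Map<String, Integer> counts = new LinkedHashMap<>();
        for (Expr p : projections) {
            count(p, counts);
        }
        Set<String> duplicates = new LinkedHashSet<>();
        for (Map.Entry<String, Integer> e : counts.entrySet()) {
            if (e.getValue() > 1) {
                duplicates.add(e.getKey());
            }
        }
        return duplicates;
    }

    private static void count(Expr e, Map<String, Integer> counts) {
        if (!e.isCall()) {
            return; // only calls are candidates, mirroring the RexCall-only rule
        }
        counts.merge(e.digest(), 1, Integer::sum);
        for (Expr operand : e.operands) {
            count(operand, counts);
        }
    }
}
```

   For example, with projections `CONCAT(UPPER($0), 'x')` and `LENGTH(UPPER($0))`, only `UPPER($0)` is reported, since the two outer calls each occur once.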
   
   **Approach:** The RexNode digest string serves as the expression identity 
key. Only `RexCall` nodes that appear more than once are cached. The first 
encounter generates the code and stores the result in a local variable. 
Subsequent encounters emit no new code (`NO_CODE`) and return a reference to 
the cached variable.
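
   The caching behaviour described in the approach can be sketched as follows. This is a simplified model under stated assumptions, not the `ExprCodeGenerator` implementation: `CseCodegenSketch`, `Expr`, `generate`, and the `v0`, `v1`, ... variable names are all hypothetical, and emitted "code" is just a list of strings.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Toy code generator; names are illustrative, not Flink's ExprCodeGenerator API.
final class CseCodegenSketch {

    /** Minimal expression node: a leaf when it has no operands, otherwise a call. */
    static final class Expr {
        final String op;
        final List<Expr> operands;

        Expr(String op, Expr... operands) {
            this.op = op;
            this.operands = List.of(operands);
        }

        /** Canonical string form used as the cache key. */
        String digest() {
            if (operands.isEmpty()) {
                return op;
            }
            List<String> parts = new ArrayList<>();
            for (Expr e : operands) {
                parts.add(e.digest());
            }
            return op + "(" + String.join(", ", parts) + ")";
        }
    }

    private final Set<String> candidates;                      // digests seen more than once
    private final Map<String, String> cache = new HashMap<>(); // digest -> local variable name
    final List<String> statements = new ArrayList<>();         // emitted code, in order
    private int nextVar = 0;

    CseCodegenSketch(Set<String> candidates) {
        this.candidates = candidates;
    }

    /** Returns the term holding the value of e, emitting statements as needed. */
    String generate(Expr e) {
        if (e.operands.isEmpty()) {
            return e.op; // leaf: refer to the input field or literal directly
        }
        String key = e.digest();
        String cached = cache.get(key);
        if (cached != null) {
            return cached; // later encounter: no new code, just the cached variable
        }
        // Operands go through generate() too, so nested duplicates are cached as well.
        List<String> args = new ArrayList<>();
        for (Expr operand : e.operands) {
            args.add(generate(operand));
        }
        String var = "v" + nextVar++;
        statements.add("Object " + var + " = " + e.op + "(" + String.join(", ", args) + ");");
        if (candidates.contains(key)) {
            cache.put(key, var); // first encounter of a duplicate: remember its variable
        }
        return var;
    }
}
```

   Generating `G(F(a))` and then `H(F(a))` with `F(a)` as the only candidate emits three statements; the call to `F` appears once and both outer calls read the same variable.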
   
   **Test results (all passing):**
   - `CseJavaITCase`: 5/5 ✅ (testCseCorrectness, testCseCallCount, 
testNoCseCandidates, testCseWithNullValues, 
testCseWithNestedCommonSubExpressions)
   - `CseUtilsTest`: 6/6 ✅
   
   > Replaces #27747 (closed due to force-push after author correction)

