featzhang opened a new pull request, #27747:
URL: https://github.com/apache/flink/pull/27747

   Implements common sub-expression elimination (CSE) for Flink SQL code generation. Key changes:
   
   **New files:**
   - `CseUtils.scala` - Utility to scan RexNode trees and identify duplicate 
sub-expressions
   - `CseITCase.scala` - Integration tests for CSE (correctness, call count, null handling, nested sub-expressions)
   - `CseTestFunctions.java` - Test UDFs with AtomicInteger call counters
   - `CseUtilsTest.scala` - Unit tests for the CSE analyzer
   
   **Modified files:**
   - `ExprCodeGenerator.scala` - Added `cseEnabled`, `cseExprCache`, 
`cseCandidates` fields and CSE logic in `generateExpression()`. Also changed 
operand visiting in `visitCall` to go through `generateExpression` (not 
`accept(this)`) so nested sub-expressions are also cached.
   - `CalcCodeGenerator.scala` - Added CSE analysis before projection code 
generation using `CseUtils.findDuplicateSubExpressions()`
   
   **Approach:** The RexNode digest string serves as the expression identity key. Only `RexCall` nodes that appear more than once are cached. The first encounter generates the code and stores the result in a local variable. Subsequent encounters return a `NO_CODE` reference to the cached variable.
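
A minimal, self-contained sketch of the digest-keyed caching idea described above. It is not the code from this PR: `Expr`, `analyze`, `generate`, and the `cseN` variable names are hypothetical stand-ins for `RexNode`, the duplicate analysis, and the code generator, and the emitted "code" is just strings.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class CseSketch {

    /** Hypothetical stand-in for a RexNode: a leaf (no operands) or a call. */
    static final class Expr {
        final String op;
        final List<Expr> operands;

        Expr(String op, Expr... operands) {
            this.op = op;
            this.operands = List.of(operands);
        }

        boolean isCall() {
            return !operands.isEmpty();
        }

        /** Structural string used as the identity key, e.g. "UPPER(a)". */
        String digest() {
            if (!isCall()) {
                return op;
            }
            List<String> parts = new ArrayList<>();
            for (Expr operand : operands) {
                parts.add(operand.digest());
            }
            return op + "(" + String.join(", ", parts) + ")";
        }
    }

    final Map<String, Integer> counts = new LinkedHashMap<>();
    final Map<String, String> cache = new LinkedHashMap<>();
    final List<String> statements = new ArrayList<>();

    /** Pass 1: count how often each call digest occurs across all projections. */
    void analyze(Expr e) {
        if (!e.isCall()) {
            return;
        }
        counts.merge(e.digest(), 1, Integer::sum);
        for (Expr operand : e.operands) {
            analyze(operand);
        }
    }

    /** Pass 2: returns the term that refers to the value of the expression. */
    String generate(Expr e) {
        if (!e.isCall()) {
            return e.op;
        }
        String key = e.digest();
        String cached = cache.get(key);
        if (cached != null) {
            return cached; // later encounter: reuse the variable, emit no new code
        }
        List<String> args = new ArrayList<>();
        for (Expr operand : e.operands) {
            args.add(generate(operand)); // operands go through the same cache
        }
        String code = e.op + "(" + String.join(", ", args) + ")";
        if (counts.getOrDefault(key, 0) < 2) {
            return code; // appears once: not a candidate, keep it inline
        }
        String var = "cse" + cache.size();
        statements.add("Object " + var + " = " + code + ";");
        cache.put(key, var);
        return var;
    }

    public static void main(String[] args) {
        Expr upper = new Expr("UPPER", new Expr("a"));
        Expr p1 = new Expr("LEN", upper);
        Expr p2 = new Expr("CONCAT", upper, new Expr("b"));

        CseSketch cse = new CseSketch();
        cse.analyze(p1);
        cse.analyze(p2);
        String t1 = cse.generate(p1);
        String t2 = cse.generate(p2);

        cse.statements.forEach(System.out::println); // Object cse0 = UPPER(a);
        System.out.println(t1); // LEN(cse0)
        System.out.println(t2); // CONCAT(cse0, b)
    }
}
```

Routing operands through `generate` rather than building them directly is what lets a nested call such as `UPPER(a)` be shared even when its parents differ, which mirrors the motivation given for the `visitCall` change.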
   
   ### Build & Test Status (continued)
   
   - Added `CseJavaITCase.java` - Pure Java integration test (avoids Scala test 
compilation dependency chain issues with TableTestBase etc.)
   - Fixed `flink-rpc-core` jar corruption: `RemoteHandshakeMessage.class` was an ECJ stub caused by `@Nonnull` on a primitive `int`. Fix: recompiled the class manually with `javac`, then ran `mvn jar:jar install:install`.
   - Fixed 539 stub classes in `flink-table-planner/target/classes` by manually 
recompiling all 837 Java main sources with `javac`.
   - Fixed SPI service files in `target/test-classes/META-INF/services` by 
removing references to uncompiled Scala test classes.
   
   **Test results (all passing):**
   - `CseJavaITCase`: 5/5 ✅ (testCseCorrectness, testCseCallCount, 
testNoCseCandidates, testCseWithNullValues, 
testCseWithNestedCommonSubExpressions)
   - `CseUtilsTest`: 6/6 ✅
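
For illustration only, a call-count check of the kind `testCseCallCount` suggests could look like the sketch below. It is not the actual test from this PR: `CountingUpper`, `projectNaive`, and `projectWithCse` are hypothetical, and a real Flink UDF would extend `org.apache.flink.table.functions.ScalarFunction` instead of being a plain static method.

```java
import java.util.Locale;
import java.util.concurrent.atomic.AtomicInteger;

public class CallCountSketch {

    /** Hypothetical test function that records how often it is evaluated. */
    static final class CountingUpper {
        static final AtomicInteger CALLS = new AtomicInteger();

        static String eval(String s) {
            CALLS.incrementAndGet();
            return s.toUpperCase(Locale.ROOT);
        }
    }

    /** Two projections over the same call, each evaluated separately. */
    static String[] projectNaive(String s) {
        return new String[] {CountingUpper.eval(s), CountingUpper.eval(s) + "!"};
    }

    /** Same projections, with the shared call evaluated once and reused. */
    static String[] projectWithCse(String s) {
        String shared = CountingUpper.eval(s);
        return new String[] {shared, shared + "!"};
    }

    public static void main(String[] args) {
        CountingUpper.CALLS.set(0);
        projectNaive("flink");
        System.out.println("naive calls: " + CountingUpper.CALLS.get()); // naive calls: 2

        CountingUpper.CALLS.set(0);
        projectWithCse("flink");
        System.out.println("cse calls: " + CountingUpper.CALLS.get()); // cse calls: 1
    }
}
```

Both variants return the same values, so a counter is the only way for a test to see whether the shared call was actually evaluated once.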
   
   **Known issue:** Scala test compilation (945 files) fails with 261 pre-existing errors in project test code (`FlinkRelMdColumnIntervalTest`, `WatermarkGeneratorCodeGenTest`, etc.), which prevents a full `mvn test-compile`. These errors are not caused by the CSE changes in this PR.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
