featzhang opened a new pull request, #27747: URL: https://github.com/apache/flink/pull/27747
Implemented CSE (common sub-expression elimination) optimization for Flink SQL code generation.

Key changes:

**New files:**
- `CseUtils.scala` - Utility that scans RexNode trees and identifies duplicate sub-expressions
- `CseITCase.scala` - Integration tests for CSE (correctness, call count, null handling, nested)
- `CseTestFunctions.java` - Test UDFs with AtomicInteger call counters
- `CseUtilsTest.scala` - Unit tests for the CSE analyzer

**Modified files:**
- `ExprCodeGenerator.scala` - Added `cseEnabled`, `cseExprCache`, `cseCandidates` fields and CSE logic in `generateExpression()`. Also changed operand visiting in `visitCall` to go through `generateExpression` (not `accept(this)`) so that nested sub-expressions are also cached.
- `CalcCodeGenerator.scala` - Added CSE analysis before projection code generation using `CseUtils.findDuplicateSubExpressions()`

**Approach:** The RexNode digest string is used as the expression identity key. Only RexCall nodes that appear more than once are cached. The first encounter generates the code and stores the result in a local variable. Subsequent encounters emit no new code (`NO_CODE`) and return a reference to the cached variable.

### Build & Test Status (continued)
- Added `CseJavaITCase.java` - Pure Java integration test (avoids Scala test compilation dependency chain issues with TableTestBase etc.)
- Fixed `flink-rpc-core` jar corruption: `RemoteHandshakeMessage.class` was an ECJ stub due to `@Nonnull` on primitive `int`. Fix: recompile with `javac` manually, then run `mvn jar:jar install:install`.
- Fixed 539 stub classes in `flink-table-planner/target/classes` by manually recompiling all 837 Java main sources with `javac`.
- Fixed SPI service files in `target/test-classes/META-INF/services` by removing references to uncompiled Scala test classes.
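For readers unfamiliar with the approach described above (digest string as identity key, code generated on first encounter, cached variable reused afterwards), here is a minimal sketch in plain Java with no Calcite or Flink dependency. It is not the code in this PR: `CseSketch`, `Expr`, `findDuplicates`, and `generate` are hypothetical stand-ins for `RexNode`/`RexCall`, `CseUtils.findDuplicateSubExpressions()`, and `generateExpression()`, and the emitted "code" is just strings. The sketch also ignores non-deterministic functions and conditionally evaluated operands, which a real implementation would need to consider.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

final class CseSketch {

    /** Hypothetical stand-in for a RexNode: a leaf (no operands) or a call. */
    static final class Expr {
        final String op;           // operator name, or the leaf text
        final List<Expr> operands; // empty for leaves

        Expr(String op, Expr... operands) {
            this.op = op;
            this.operands = List.of(operands);
        }

        // Simplification: a node without operands is treated as a leaf.
        boolean isCall() {
            return !operands.isEmpty();
        }

        /** Structural string that plays the role of the RexNode digest. */
        String digest() {
            if (!isCall()) {
                return op;
            }
            List<String> parts = new ArrayList<>();
            for (Expr o : operands) {
                parts.add(o.digest());
            }
            return op + "(" + String.join(", ", parts) + ")";
        }
    }

    /** Returns the digests of call nodes that occur more than once. */
    static Set<String> findDuplicates(List<Expr> projections) {
        Map<String, Integer> counts = new LinkedHashMap<>();
        for (Expr e : projections) {
            count(e, counts);
        }
        Set<String> duplicates = new LinkedHashSet<>();
        counts.forEach((digest, n) -> {
            if (n > 1) {
                duplicates.add(digest);
            }
        });
        return duplicates;
    }

    private static void count(Expr e, Map<String, Integer> counts) {
        if (!e.isCall()) {
            return;
        }
        counts.merge(e.digest(), 1, Integer::sum);
        for (Expr o : e.operands) {
            count(o, counts);
        }
    }

    /**
     * Returns the term for e. A candidate gets a local variable on first
     * encounter (one line appended to code); later encounters return the
     * cached variable name and append nothing.
     */
    static String generate(Expr e, Set<String> candidates,
                           Map<String, String> cache, List<String> code) {
        if (!e.isCall()) {
            return e.op;
        }
        String key = e.digest();
        String cached = cache.get(key);
        if (cached != null) {
            return cached;
        }
        List<String> args = new ArrayList<>();
        for (Expr o : e.operands) {
            args.add(generate(o, candidates, cache, code));
        }
        String term = e.op + "(" + String.join(", ", args) + ")";
        if (!candidates.contains(key)) {
            return term; // not duplicated: keep it inline
        }
        String variable = "cse_" + cache.size();
        code.add(variable + " = " + term);
        cache.put(key, variable);
        return variable;
    }

    public static void main(String[] args) {
        Expr upper = new Expr("UPPER", new Expr("a"));
        List<Expr> projections = List.of(
                new Expr("CONCAT", upper, new Expr("'x'")),
                new Expr("LENGTH", new Expr("UPPER", new Expr("a"))));
        Set<String> candidates = findDuplicates(projections);
        Map<String, String> cache = new LinkedHashMap<>();
        List<String> code = new ArrayList<>();
        for (Expr p : projections) {
            System.out.println(generate(p, candidates, cache, code));
        }
        System.out.println(code);
    }
}
```

Operands are generated through the same `generate` entry point as top-level expressions, which mirrors the reason given above for routing `visitCall` operands through `generateExpression`: a nested duplicate is only reused if the recursion consults the cache.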
**Test results (all passing):**
- `CseJavaITCase`: 5/5 ✅ (`testCseCorrectness`, `testCseCallCount`, `testNoCseCandidates`, `testCseWithNullValues`, `testCseWithNestedCommonSubExpressions`)
- `CseUtilsTest`: 6/6 ✅

**Known issue:** Scala test compilation (945 files) has 261 pre-existing errors in project test code (`FlinkRelMdColumnIntervalTest`, `WatermarkGeneratorCodeGenTest`, etc.), preventing full `mvn test-compile`. This is NOT caused by our CSE changes.
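As an illustration of what a call-count check with an `AtomicInteger` counter can verify, here is a small plain-Java sketch. It does not reproduce `CseTestFunctions.java` or `testCseCallCount`; `CallCountSketch`, `countingUpper`, `evalWithoutCse`, and `evalWithCse` are hypothetical names, and the two "projections" are hand-written stand-ins for generated code.

```java
import java.util.Locale;
import java.util.concurrent.atomic.AtomicInteger;

final class CallCountSketch {

    /** Incremented on every invocation of the hypothetical UDF body. */
    static final AtomicInteger CALLS = new AtomicInteger();

    /** Hypothetical null-safe UDF: upper-cases its input and counts the call. */
    static String countingUpper(String s) {
        CALLS.incrementAndGet();
        return s == null ? null : s.toUpperCase(Locale.ROOT);
    }

    /** Projections UPPER(a) and LENGTH(UPPER(a)) without reuse: two UDF calls per row. */
    static Object[] evalWithoutCse(String a) {
        String p0 = countingUpper(a);
        String tmp = countingUpper(a);
        Integer p1 = tmp == null ? null : tmp.length();
        return new Object[] {p0, p1};
    }

    /** Same projections with the shared call kept in a local variable: one UDF call per row. */
    static Object[] evalWithCse(String a) {
        String cached = countingUpper(a);
        Integer p1 = cached == null ? null : cached.length();
        return new Object[] {cached, p1};
    }

    public static void main(String[] args) {
        String[] rows = {"ab", null, "xyz"};

        CALLS.set(0);
        for (String row : rows) {
            evalWithoutCse(row);
        }
        System.out.println("calls without reuse: " + CALLS.get());

        CALLS.set(0);
        for (String row : rows) {
            evalWithCse(row);
        }
        System.out.println("calls with reuse: " + CALLS.get());
    }
}
```

Comparing the counter across the two variants, together with an equality check on the returned rows (including the `null` row), covers the same concerns the test names above suggest: fewer invocations, unchanged results.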
