[ https://issues.apache.org/jira/browse/FLINK-11421?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Flink Jira Bot updated FLINK-11421: ----------------------------------- Labels: auto-deprioritized-major auto-unassigned pull-request-available (was: auto-unassigned pull-request-available stale-major) Priority: Minor (was: Major) This issue was labeled "stale-major" 7 ago and has not received any updates so it is being deprioritized. If this ticket is actually Major, please raise the priority and ask a committer to assign you the issue or revive the public discussion. > Add compilation options to allow compiling generated code with JDK compiler > ---------------------------------------------------------------------------- > > Key: FLINK-11421 > URL: https://issues.apache.org/jira/browse/FLINK-11421 > Project: Flink > Issue Type: New Feature > Components: Table SQL / Runtime > Reporter: Liya Fan > Priority: Minor > Labels: auto-deprioritized-major, auto-unassigned, > pull-request-available > Original Estimate: 240h > Time Spent: 40m > Remaining Estimate: 239h 20m > > Flink supports some operators (like Calc, Hash Agg, Hash Join, etc.) by code > generation. That is, Flink generates their source code dynamically, and then > compile it into Java Byte Code, which is load and executed at runtime. > > By default, Flink compiles the generated source code by Janino. This is fast, > as the compilation often finishes in hundreds of milliseconds. The generated > Java Byte Code, however, is of poor quality. To illustrate, we use Java > Compiler API (JCA) to compile the generated code. Experiments on TPC-H (1 TB) > queries show that the E2E time can be more than 10% shorter, when operators > are compiled by JCA, despite that it takes more time (a few seconds) to > compile with JCA. > > Therefore, we believe it is beneficial to compile generated code by JCA in > the following scenarios: 1) For batch jobs, the E2E time is relatively long, > so it is worth of spending more time compiling and generating high quality > Java Byte Code. 2) For repeated stream jobs, the generated code will be > compiled once and run many times. Therefore, it pays to spend more time > compiling for the first time, and enjoy the high byte code qualities for > later runs. > > According to the above observations, we want to provide a compilation option > (Janino, JCA, or dynamic) for Flink, so that the user can choose the one > suitable for their specific scenario and obtain better performance whenever > possible. -- This message was sent by Atlassian Jira (v8.3.4#803005)