As there was already a discussion in the doc, I'll just summarize my opinions here on the proposed execution of this FLIP.
I think we should avoid exposing internal details, of which I consider Calcite to be part, and instead reuse what we already have for defining an AST from the Table API, which I'll refer to in this mail as the Operation tree.

First of all, the reason I think this FLIP is not a good idea is that it proposes to expose types out of our control, i.e. an API we cannot control and may realistically never be able to stabilize. A Calcite bump in the table project is already pretty hard today, as shown by tasks like https://github.com/apache/flink/pull/13577, and this would make it even harder. Essentially, it would couple us to Calcite even more and replace the maintenance/complexity burden this FLIP is trying to get rid of with a different but equally big one.

There are also some technical aspects that seem a bit overlooked here:

* What about Scala? Is flink-table-planner-spi going to be a Scala module with the related suffix? I see you want to expose a couple of types that are implemented in Scala right now, and making this module Scala-dependent makes shipping both the modules that use it and flink-table-planner-loader even more complicated.
* Are you sure exposing the Calcite interfaces is going to be enough? Don't you also require some instance-specific methods, e.g. FlinkTypeFactory#toLogicalType? What if at some point you need to expose something like FlinkTypeFactory itself? How do you plan to support and stabilize it in the long term?

Now let me talk a bit about the Operation tree. For those who don't know what it is: it's the pure Flink AST for defining DML, used to convert the Table API DSL into an AST the planner can manipulate. Essentially, it's our own version of the RelNode/RexNode tree.

This Operation tree can already be used by Hive, without any API changes on the Table side. You just need to downcast TableEnvironmentImpl to use getPlanner() and then Planner#translate, or alternatively you can add getPlanner to TableEnvironmentInternal directly (see the sketch at the end of this mail). From what I've seen of your use case, and please correct me if I'm wrong, you can implement your SQL -> Operation tree layer without substantial changes on either side.

The reasons why I think this is a better idea than exposing Calcite and RelNodes directly:

* The aforementioned downsides of exposing Calcite
* It doesn't require a new API to get started
* It doesn't add complexity on the planner side, it just removes the existing coupling with Hive
* Letting another project use the Operation tree will harden it, make it more stable and eventually lead it to become public

The last point in particular is extremely interesting for the future of the project: having a stable public Operation tree will allow people to implement other relational APIs on top of Flink SQL, manipulate the AST to define new semantics, or do even crazier things we can't think of right now, leading to a broader and more diverse ecosystem. At the end of the day, that is exactly what Hive is doing right now: defining a new relational API on top of the Flink Table planner functionalities.
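To make the downcast approach above a bit more concrete, here is a rough sketch, not a definitive implementation. It assumes the current internal entry points TableEnvironmentImpl#getPlanner() and Planner#translate(List<ModifyOperation>) stay as they are today; parseHiveDml is a purely hypothetical placeholder for the Hive SQL -> Operation tree layer:

import java.util.Collections;
import java.util.List;

import org.apache.flink.api.dag.Transformation;
import org.apache.flink.table.api.TableEnvironment;
import org.apache.flink.table.api.internal.TableEnvironmentImpl;
import org.apache.flink.table.delegation.Planner;
import org.apache.flink.table.operations.ModifyOperation;

public class HiveDialectSketch {

    // Translates a Hive DML statement by going through the Operation tree,
    // without touching any Calcite types.
    public static List<Transformation<?>> translateHiveDml(
            TableEnvironment tEnv, String hiveStatement) {
        // Downcast to the internal implementation to reach the planner;
        // alternatively, getPlanner() could be added to TableEnvironmentInternal.
        TableEnvironmentImpl tEnvImpl = (TableEnvironmentImpl) tEnv;
        Planner planner = tEnvImpl.getPlanner();

        // Hypothetical Hive front end: parse the statement into the Flink
        // Operation tree (a single ModifyOperation for a DML statement).
        ModifyOperation operation = parseHiveDml(hiveStatement);

        // Hand the Operation tree to the planner, which turns it into Flink
        // Transformations exactly as it does for the Table API / default SQL.
        return planner.translate(Collections.singletonList(operation));
    }

    // Placeholder for the Hive SQL -> Operation tree layer discussed above.
    private static ModifyOperation parseHiveDml(String hiveStatement) {
        throw new UnsupportedOperationException("to be implemented by the Hive connector");
    }
}

The exact entry point can of course differ, but the point is that nothing beyond the Operation tree and the Planner interface would be needed on the connector side.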
On Mon, Mar 28, 2022 at 10:57 AM 罗宇侠(莫辞)
<luoyuxia.luoyu...@alibaba-inc.com.invalid> wrote:

> Sorry for this email, it seems there's some format issue in my email client.
> Just ignore it, as it's a duplicate of [DISCUSS] FLIP-216 Decouple Hive
> connector with Flink planner [1]
>
> [1] https://lists.apache.org/thread/6xg33nxrnow5zy7xwqk5nwp00h9gcsbc
>
> Best regards,
> Yuxia
> ------------------------------------------------------------------
> From: 罗宇侠 <yuxia...@qq.com.INVALID>
> Date: 2022-03-25 20:19:37
> To: dev <dev@flink.apache.org>
> Subject: FLIP-216 Decouple Hive connector with Flink planner
>
> Hi, everyone
>
> I would like to open a discussion about decoupling the Hive connector from the Flink table planner.
> It's a follow-up discussion after the Hive syntax discussion [1], but it only focuses on how to decouple the Hive connector. The original doc is here [2], from which you can see the detailed work and the heated discussion about whether to expose the Calcite API or reuse the Operation tree to decouple.
> I have created FLIP-216: Decouple Hive connector with Flink planner [3] for it.
>
> Thanks and looking forward to a lively discussion!
>
> [1] https://lists.apache.org/thread/2w046dwl46tf2wy750gzmt0qrcz17z8t
> [2] https://docs.google.com/document/d/1LMQ_mWfB_mkYkEBCUa2DgCO2YdtiZV7YRs2mpXyjdP4/edit?usp=sharing
> [3] https://cwiki.apache.org/confluence/display/FLINK/FLIP-216%3A+Decouple+Hive+connector+with+Flink+planner
>
> Best regards,
> Yuxia