Hi Hequn,

I am very glad to hear that you are interested in this work. As we all know, this migration involves a lot of work, and it has already begun. I started with the Kafka connector's dependency on flink-table and moved the related dependencies to flink-table-common. This work is tracked by FLINK-9461 [1]. I don't know whether it will conflict with what you plan to do, but from the impact I have observed so far, it will involve many classes that are currently in flink-table.

*Just a statement to prevent unnecessary conflicts.*

Thanks, vino.

[1]: https://issues.apache.org/jira/browse/FLINK-9461

On Sat, Nov 24, 2018 at 7:20 PM Hequn Cheng <chenghe...@gmail.com> wrote:

> Hi Timo,
>
> Thanks for the effort and for writing up this document. I like the idea of
> making flink-table Scala-free, so +1 for the proposal!
>
> It's good to make Java a first-class citizen. For a long time we have
> neglected Java, so many Table features are missing from the Java test
> cases, such as this one [1] I found recently. I think we may also need
> to migrate our test cases, i.e., add Java tests.
>
> This is definitely a big change and will break API compatibility. To keep
> the impact on users small, I think we should move fast when we migrate
> the user-facing APIs. It would be better to introduce the user-visible
> changes within a single release. However, that may not be easy. I can
> help to contribute.
>
> Separation of interface and implementation is a good idea. It may
> introduce a minimum of dependencies or even no dependencies. I saw your
> reply in the Google doc. Java 8 already supports static methods in
> interfaces; I think we can make use of that?
>
> Best,
> Hequn
>
> [1] https://issues.apache.org/jira/browse/FLINK-11001
>
>
> On Fri, Nov 23, 2018 at 5:36 PM Timo Walther <twal...@apache.org> wrote:
>
> > Hi everyone,
> >
> > thanks for the great feedback so far. I updated the document with the
> > input I got so far.
> >
> > @Fabian: I moved the porting of flink-table-runtime classes up in the
> > list.
> >
> > @Xiaowei: Could you elaborate what "interface only" means to you? Do you
> > mean a module containing pure Java `interface`s? Or is the validation
> > logic also part of the API module? Are 50+ expression classes part of
> > the API interface or already too implementation-specific?
> >
> > @Xuefu: I extended the document by almost a page to clarify when we
> > should develop in Scala and when in Java.
> > As Piotr said, every new Scala line is instant technical debt.
> >
> > Thanks,
> > Timo
> >
> >
> > On 23.11.18 at 10:29, Piotr Nowojski wrote:
> > > Hi Timo,
> > >
> > > Thanks for writing this down. +1 from my side :)
> > >
> > >> I'm wondering whether we can have a rule in the interim, while Java
> > >> and Scala coexist, that dependencies can only be one-way. I found that
> > >> in the current code base there are cases where a Scala class extends a
> > >> Java class and vice versa. This is quite painful. I'm thinking we could
> > >> say that extension can only be from Java to Scala, which would help the
> > >> situation. However, I'm not sure if this is practical.
> > > Xuefu: I’m also not sure what the best approach is here; we will
> > > probably have to work it out as we go. One thing to consider is that
> > > from now on, every single new line of code written in Scala anywhere in
> > > flink-table (except in flink-table-api-scala) is instant technical debt.
> > > From this perspective I would be in favour of tolerating quite big
> > > inconveniences just to avoid any new Scala code.
> > >
> > > Piotrek
> > >
> > >> On 23 Nov 2018, at 03:25, Zhang, Xuefu <xuef...@alibaba-inc.com> wrote:
> > >>
> > >> Hi Timo,
> > >>
> > >> Thanks for the effort and the Google writeup. During our external
> > >> catalog rework, we found much confusion between Java and Scala, and
> > >> this Scala-free roadmap should greatly mitigate that.
> > >>
> > >> I'm wondering whether we can have a rule in the interim, while Java
> > >> and Scala coexist, that dependencies can only be one-way. I found that
> > >> in the current code base there are cases where a Scala class extends a
> > >> Java class and vice versa. This is quite painful. I'm thinking we could
> > >> say that extension can only be from Java to Scala, which would help the
> > >> situation. However, I'm not sure if this is practical.
> > >>
> > >> Thanks,
> > >> Xuefu
> > >>
> > >>
> > >> ------------------------------------------------------------------
> > >> Sender: jincheng sun <sunjincheng...@gmail.com>
> > >> Sent at: 2018 Nov 23 (Fri) 09:49
> > >> Recipient: dev <dev@flink.apache.org>
> > >> Subject: Re: [DISCUSS] Long-term goal of making flink-table Scala-free
> > >>
> > >> Hi Timo,
> > >> Thanks for initiating this great discussion.
> > >>
> > >> Currently, using SQL/Table API requires including many dependencies.
> > >> In particular, it should not be necessary to pull in specific
> > >> implementation dependencies that users do not care about. So I am glad
> > >> to see your proposal, and I hope we consider splitting the API
> > >> interface into a separate module so that users can pull in a minimum
> > >> of dependencies.
> > >>
> > >> So, +1 to [separation of interface and implementation; e.g. `Table` &
> > >> `TableImpl`], which you mentioned in the Google doc.
> > >> Best,
> > >> Jincheng
> > >>
> > >> On Thu, Nov 22, 2018 at 10:50 PM Xiaowei Jiang <xiaow...@gmail.com> wrote:
> > >>
> > >>> Hi Timo, thanks for driving this! I think that this is a nice thing
> > >>> to do. While we are doing this, can we also keep in mind that we want
> > >>> to eventually have a Table API interface-only module which users can
> > >>> depend on, without including any implementation details?
> > >>>
> > >>> Xiaowei
> > >>>
> > >>> On Thu, Nov 22, 2018 at 6:37 PM Fabian Hueske <fhue...@gmail.com> wrote:
> > >>>
> > >>>> Hi Timo,
> > >>>>
> > >>>> Thanks for writing up this document.
> > >>>> I like the new structure and agree to prioritize the porting of the
> > >>>> flink-table-common classes.
> > >>>> Since flink-table-runtime is (or should be) independent of the API
> > >>>> and planner modules, we could start porting these classes once the
> > >>>> code is split into the new module structure.
> > >>>> The benefit of a Scala-free flink-table-runtime would be a
> > >>>> Scala-free execution Jar.
> > >>>>
> > >>>> Best, Fabian
> > >>>>
> > >>>>
> > >>>> On Thu, 22 Nov 2018 at 10:54, Timo Walther <twal...@apache.org> wrote:
> > >>>>> Hi everyone,
> > >>>>>
> > >>>>> I would like to continue this discussion thread and convert the
> > >>>>> outcome into a FLIP such that users and contributors know what to
> > >>>>> expect in the upcoming releases.
> > >>>>>
> > >>>>> I created a design document [1] that explains our motivation for
> > >>>>> doing this, what a Maven module structure could look like, and a
> > >>>>> suggested migration plan.
> > >>>>>
> > >>>>> It would be great to start with the efforts for the 1.8 release
> > >>>>> such that new features can be developed in Java and major
> > >>>>> refactorings such as improvements to the connectors and external
> > >>>>> catalog support are not blocked.
> > >>>>>
> > >>>>> Please let me know what you think.
> > >>>>>
> > >>>>> Regards,
> > >>>>> Timo
> > >>>>>
> > >>>>> [1]
> > >>>>> https://docs.google.com/document/d/1PPo6goW7tOwxmpFuvLSjFnx7BF8IVz0w3dcmPPyqvoY/edit?usp=sharing
> > >>>>>
> > >>>>> On 02.07.18 at 17:08, Fabian Hueske wrote:
> > >>>>>> Hi Piotr,
> > >>>>>>
> > >>>>>> thanks for bumping this thread and thanks to Xingcan for the
> > >>>>>> comments.
> > >>>>>> I think the first step would be to separate the flink-table module
> > >>>>>> into multiple sub modules. These could be:
> > >>>>>>
> > >>>>>> - flink-table-api: All API facing classes.
> > >>>>>>   Can be later divided further into Java/Scala Table API/SQL
> > >>>>>> - flink-table-planning: involves all planning (basically
> > >>>>>>   everything we do with Calcite)
> > >>>>>> - flink-table-runtime: the runtime code
> > >>>>>>
> > >>>>>> IMO, a realistic mid-term goal is to have the runtime module and
> > >>>>>> certain parts of the planning module ported to Java.
> > >>>>>> The api module will be much harder to port because of several
> > >>>>>> dependencies on Scala core classes (the parser framework, tree
> > >>>>>> iterations, etc.). I'm not saying we should not port this to Java,
> > >>>>>> but it is not clear to me (yet) how to do it.
> > >>>>>>
> > >>>>>> I think flink-table-runtime should not be too hard to port. The
> > >>>>>> code does not make use of many Scala features, i.e., it is written
> > >>>>>> in a very Java-like style. Also, there are not many dependencies
> > >>>>>> and operators can be individually ported step-by-step.
> > >>>>>> For flink-table-planning, we can have certain packages that we
> > >>>>>> port to Java, like planning rules or plan nodes. The related
> > >>>>>> classes mostly extend Calcite's Java interfaces/classes and would
> > >>>>>> be natural choices for being ported. The code generation classes
> > >>>>>> will require more effort to port. There are also some dependencies
> > >>>>>> in planning on the api module that we would need to resolve
> > >>>>>> somehow.
> > >>>>>>
> > >>>>>> For SQL, most work when adding new features is done in the
> > >>>>>> planning and runtime modules. So, this separation should already
> > >>>>>> reduce "technical debt" quite a lot.
> > >>>>>> The Table API depends much more on Scala than SQL.
> > >>>>>>
> > >>>>>> Cheers, Fabian
> > >>>>>>
> > >>>>>>
> > >>>>>>
> > >>>>>> 2018-07-02 16:26 GMT+02:00 Xingcan Cui <xingc...@gmail.com>:
> > >>>>>>
> > >>>>>>> Hi all,
> > >>>>>>>
> > >>>>>>> I have also been thinking about this problem these days, and
> > >>>>>>> here are my thoughts.
> > >>>>>>>
> > >>>>>>> 1) We must admit that it’s really a tough task to interoperate
> > >>>>>>> between Java and Scala. E.g., they have different collection
> > >>>>>>> types (Scala collections vs. java.util.*), and in Java it's hard
> > >>>>>>> to implement a method which takes Scala functions as parameters.
> > >>>>>>> Considering the major part of the code base is implemented in
> > >>>>>>> Java, +1 for this goal from a long-term view.
> > >>>>>>>
> > >>>>>>> 2) The ideal solution would be to just expose a Scala API and
> > >>>>>>> make all the other parts Scala-free. But I am not sure if it
> > >>>>>>> could be achieved even in the long term. Thus, as Timo suggested,
> > >>>>>>> keeping the Scala code in "flink-table-core" would be a
> > >>>>>>> compromise solution.
> > >>>>>>>
> > >>>>>>> 3) If the community makes the final decision, maybe any new
> > >>>>>>> features should be added in Java (regardless of the modules), in
> > >>>>>>> order to prevent the Scala code from growing.
> > >>>>>>>
> > >>>>>>> Best,
> > >>>>>>> Xingcan
> > >>>>>>>
> > >>>>>>>
> > >>>>>>>> On Jul 2, 2018, at 9:30 PM, Piotr Nowojski <pi...@data-artisans.com> wrote:
> > >>>>>>>> Bumping the topic.
> > >>>>>>>>
> > >>>>>>>> If we want to do this, the sooner we decide, the less code we
> > >>>>>>>> will have to rewrite. I have some objections/counter proposals
> > >>>>>>>> to Fabian's proposal of doing it module-wise and one module at
> > >>>>>>>> a time.
> > >>>>>>>> First, I do not see a problem with having Java/Scala code even
> > >>>>>>>> within one module, especially not if there are clean boundaries.
> > >>>>>>>> Like we could have the API in Scala and optimizer rules/logical
> > >>>>>>>> nodes written in Java in the same module. However, I haven’t
> > >>>>>>>> maintained mixed Scala/Java code bases before, so I might be
> > >>>>>>>> missing something here.
> > >>>>>>>> Secondly, this whole migration might, and most likely will, take
> > >>>>>>>> longer than expected, which creates a problem for the new code
> > >>>>>>>> that we will be writing. After making a decision to migrate to
> > >>>>>>>> Java, almost any new Scala line of code will immediately be
> > >>>>>>>> technical debt and we will have to rewrite it in Java later.
> > >>>>>>>> Thus I would propose first to state our end goal: the module
> > >>>>>>>> structure and which parts of the modules we eventually want to
> > >>>>>>>> be Scala-free. Secondly, to take all steps necessary to allow us
> > >>>>>>>> to write new code compliant with our end goal. Only after that
> > >>>>>>>> should/could we focus on incrementally rewriting the old code.
> > >>>>>>>> Otherwise we could be stuck/blocked for years writing new code
> > >>>>>>>> in Scala (and increasing technical debt), because nobody has
> > >>>>>>>> found the time to rewrite some unimportant and not actively
> > >>>>>>>> developed part of some module.
> > >>>>>>>> Piotrek
> > >>>>>>>>
> > >>>>>>>>> On 14 Jun 2018, at 15:34, Fabian Hueske <fhue...@gmail.com> wrote:
> > >>>>>>>>> Hi,
> > >>>>>>>>>
> > >>>>>>>>> In general, I think this is a good effort. However, it won't
> > >>>>>>>>> be easy and I think we have to plan this well.
> > >>>>>>>>> I don't like the idea of having the whole code base fragmented
> > >>>>>>>>> into Java and Scala code for too long.
> > >>>>>>>>>
> > >>>>>>>>> I think we should do this one step at a time and focus on
> > >>>>>>>>> migrating one module at a time.
> > >>>>>>>>> IMO, the easiest start would be to port the runtime to Java.
> > >>>>>>>>> Extracting the API classes into their own module, porting them
> > >>>>>>>>> to Java, and removing the Scala dependency won't be possible
> > >>>>>>>>> without breaking the API, since a few classes depend on the
> > >>>>>>>>> Scala Table API.
> > >>>>>>>>>
> > >>>>>>>>> Best, Fabian
> > >>>>>>>>>
> > >>>>>>>>>
> > >>>>>>>>> 2018-06-14 10:33 GMT+02:00 Till Rohrmann <trohrm...@apache.org>:
> > >>>>>>>>>
> > >>>>>>>>>> I think that is a noble and honorable goal and we should
> > >>>>>>>>>> strive for it. This, however, must be an iterative process
> > >>>>>>>>>> given the sheer size of the code base. I like the approach of
> > >>>>>>>>>> defining common Java modules which are used by more specific
> > >>>>>>>>>> Scala modules and slowly moving classes from Scala to Java.
> > >>>>>>>>>> Thus +1 for the proposal.
> > >>>>>>>>>>
> > >>>>>>>>>> Cheers,
> > >>>>>>>>>> Till
> > >>>>>>>>>>
> > >>>>>>>>>> On Wed, Jun 13, 2018 at 12:01 PM Piotr Nowojski <pi...@data-artisans.com> wrote:
> > >>>>>>>>>>
> > >>>>>>>>>>> Hi,
> > >>>>>>>>>>>
> > >>>>>>>>>>> I do not have experience with how Scala and Java interact
> > >>>>>>>>>>> with each other, so I cannot fully validate your proposal,
> > >>>>>>>>>>> but generally speaking +1 from me.
> > >>>>>>>>>>>
> > >>>>>>>>>>> Does it also mean that we should slowly migrate
> > >>>>>>>>>>> `flink-table-core` to Java? How would you envision it?
> > >>>>>>>>>>> It would be nice to be able to add new classes/features
> > >>>>>>>>>>> written in Java so that they can coexist with old Scala
> > >>>>>>>>>>> code until we gradually switch from Scala to Java.
> > >>>>>>>>>>>
> > >>>>>>>>>>> Piotrek
> > >>>>>>>>>>>
> > >>>>>>>>>>>> On 13 Jun 2018, at 11:32, Timo Walther <twal...@apache.org> wrote:
> > >>>>>>>>>>>> Hi everyone,
> > >>>>>>>>>>>>
> > >>>>>>>>>>>> as you all know, currently the Table & SQL API is
> > >>>>>>>>>>>> implemented in Scala. This decision was made a long time
> > >>>>>>>>>>>> ago when the initial code base was created as part of a
> > >>>>>>>>>>>> master's thesis. The community kept Scala because of the
> > >>>>>>>>>>>> nice language features that enable a fluent Table API like
> > >>>>>>>>>>>> table.select('field.trim()) and because Scala allows for
> > >>>>>>>>>>>> quick prototyping (e.g. multi-line comments for code
> > >>>>>>>>>>>> generation). The committers enforced not splitting the
> > >>>>>>>>>>>> code base into two programming languages.
> > >>>>>>>>>>>> However, nowadays the flink-table module is becoming a more
> > >>>>>>>>>>>> and more important part of the Flink ecosystem. Connectors,
> > >>>>>>>>>>>> formats, and the SQL client are actually implemented in
> > >>>>>>>>>>>> Java but need to interoperate with flink-table, which makes
> > >>>>>>>>>>>> these modules dependent on Scala. As mentioned in an
> > >>>>>>>>>>>> earlier mail thread, using Scala for API classes also
> > >>>>>>>>>>>> exposes member variables and methods in Java that should
> > >>>>>>>>>>>> not be exposed to users [1]. Java is still the most
> > >>>>>>>>>>>> important API language and right now we treat it as a
> > >>>>>>>>>>>> second-class citizen.
> > >>>>>>>>>>>> I just noticed that you even need to add Scala if you just
> > >>>>>>>>>>>> want to implement a ScalarFunction because of method
> > >>>>>>>>>>>> clashes between `public String toString()` and
> > >>>>>>>>>>>> `public scala.Predef.String toString()`.
> > >>>>>>>>>>>> Given the size of the current code base, reimplementing the
> > >>>>>>>>>>>> entire flink-table code in Java is a goal that we might
> > >>>>>>>>>>>> never reach. However, we should at least treat the symptoms
> > >>>>>>>>>>>> and have this as a long-term goal in mind. My suggestion
> > >>>>>>>>>>>> would be to convert user-facing and runtime classes and
> > >>>>>>>>>>>> split the code base into multiple modules:
> > >>>>>>>>>>>>> flink-table-java {depends on flink-table-core}
> > >>>>>>>>>>>> Implemented in Java. Java users can use this. This would
> > >>>>>>>>>>>> require converting classes like TableEnvironment and Table.
> > >>>>>>>>>>>>> flink-table-scala {depends on flink-table-core}
> > >>>>>>>>>>>> Implemented in Scala. Scala users can use this.
> > >>>>>>>>>>>>
> > >>>>>>>>>>>>> flink-table-common
> > >>>>>>>>>>>> Implemented in Java. Connectors, formats, and UDFs can use
> > >>>>>>>>>>>> this. It contains interface classes such as descriptors,
> > >>>>>>>>>>>> table sink, table source.
> > >>>>>>>>>>>>> flink-table-core {depends on flink-table-common and
> > >>>>>>>>>>>>> flink-table-runtime}
> > >>>>>>>>>>>> Implemented in Scala. Contains the current main code base.
> > >>>>>>>>>>>>
> > >>>>>>>>>>>>> flink-table-runtime
> > >>>>>>>>>>>> Implemented in Java. This would require converting classes
> > >>>>>>>>>>>> in o.a.f.table.runtime but would potentially improve the
> > >>>>>>>>>>>> runtime.
> > >>>>>>>>>>>> What do you think?
> > >>>>>>>>>>>>
> > >>>>>>>>>>>>
> > >>>>>>>>>>>> Regards,
> > >>>>>>>>>>>>
> > >>>>>>>>>>>> Timo
> > >>>>>>>>>>>>
> > >>>>>>>>>>>> [1]
> > >>>>>>>>>>>> http://apache-flink-mailing-list-archive.1008284.n3.nabble.com/DISCUSS-Convert-main-Table-API-classes-into-traits-tp21335.html
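
For readers following the "separation of interface and implementation" point (`Table` & `TableImpl`) and Hequn's remark about Java 8 static interface methods, here is a minimal sketch of how the two could fit together. It is only an illustration under stated assumptions: `Table.of`, `fieldNames`, `select(String...)`, and this `TableImpl` are hypothetical names made up for the example and are not Flink's actual API.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

// Hypothetical API-side interface: the only type a user would program against.
interface Table {
    List<String> fieldNames();

    Table select(String... fields);

    // Java 8 static interface method used as a factory, so that callers
    // never have to name the implementation class. (Hypothetical method.)
    static Table of(String... fields) {
        return new TableImpl(Arrays.asList(fields));
    }
}

// Hypothetical implementation class, hidden behind the interface.
final class TableImpl implements Table {
    private final List<String> fields;

    TableImpl(List<String> fields) {
        // Defensive copy so that the table is immutable.
        this.fields = Collections.unmodifiableList(new ArrayList<>(fields));
    }

    @Override
    public List<String> fieldNames() {
        return fields;
    }

    @Override
    public Table select(String... selected) {
        // Reject fields that this table does not have.
        for (String f : selected) {
            if (!fields.contains(f)) {
                throw new IllegalArgumentException("Unknown field: " + f);
            }
        }
        return new TableImpl(Arrays.asList(selected));
    }
}

class TableDemo {
    public static void main(String[] args) {
        Table t = Table.of("a", "b", "c");
        System.out.println(t.select("a", "c").fieldNames()); // prints [a, c]
    }
}
```

Note that in this sketch the factory refers to `TableImpl` directly, which keeps a compile-time dependency from the interface to the implementation. If the two were really placed in separate modules, the factory would presumably need to locate the implementation indirectly (for example via `java.util.ServiceLoader`); that part is left out here for brevity.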