Sorry for a very delayed response. @Kurt Yes, this is the goal to have a function created like new Function(...) also be wrapped into CatalogFunction. This would have to be though a temporary function as we cannot represent that as a set of properties. Similar to the createTemporaryView(DataStream stream).
As for the ConnectTableDescriptor I agree this is very similar to CatalogTable. I am not sure though if we should get rid of it. In the end I see it as a builder for a CatalogTable, which is a slightly more internal API, but we might revisit that some time in the future if we find that it makes more sense. @All I updated the FLIP page with some more details from the outcome of the discussions around FLIP-57. Please take a look. I would like to start a vote on this FLIP as soon as the vote on FLIP-57 goes through. Best, Dawid On 19/09/2019 09:24, Kurt Young wrote: > IIUC it's good to see that both serializable (tables description from DDL) > and unserializable (tables with DataStream underneath) tables are treated > unify with CatalogTable. > > Can I also assume functions that either come from a function class (from > DDL) > or function objects (newed by user) will also treated unify with > CatalogFunction? > > This will greatly simplify and unify current API level concepts and design. > > And it seems only one thing left, how do we deal with > ConnectTableDescriptor? > It's actually very similar with serializable CatalogTable, both carry some > text > properties which even are the same. Is there any chance we can further unify > this to CatalogTable? > > object > Best, > Kurt > > > On Thu, Sep 19, 2019 at 3:13 PM Jark Wu <imj...@gmail.com> wrote: > >> Thanks Dawid for the design doc. >> >> In general, I’m +1 to the FLIP. >> >> >> +1 to the single-string and parse way to express object path. >> >> +1 to deprecate registerTableSink & registerTableSource. >> But I would suggest to provide an easy way to register a custom >> source/sink before we drop them (this is another story). >> Currently, it’s not easy to implement a custom connector descriptor. >> >> Best, >> Jark >> >> >>> 在 2019年9月19日,11:37,Dawid Wysakowicz <wysakowicz.da...@gmail.com> 写道: >>> >>> Hi JingsongLee, >>> From my understanding they can. Underneath they will be CatalogTables. >> The >>> difference is the lifetime of the tables. Plus some of the user facing >>> interfaces cannot be persisted e.g. datastream. Therefore we must have a >>> separate methods for that. In the end the temporary tables are held in >>> memory as CatalogTables. >>> Best, >>> Dawid >>> >>> On Thu, 19 Sep 2019, 10:08 JingsongLee, <lzljs3620...@aliyun.com >> .invalid> >>> wrote: >>> >>>> Hi dawid: >>>> Can temporary tables achieve the same capabilities as catalog table? >>>> like statistics: CatalogTableStatistics, CatalogColumnStatistics, >>>> PartitionStatistics >>>> like partition support: we have added some catalog equivalent interfaces >>>> on TableSource/TableSink: getPartitions, getPartitionFieldNames >>>> Maybe it's not a good idea to add these interfaces to >>>> TableSource/TableSink. What do you think? >>>> >>>> Best, >>>> Jingsong Lee >>>> >>>> >>>> ------------------------------------------------------------------ >>>> From:Kurt Young <ykt...@gmail.com> >>>> Send Time:2019年9月18日(星期三) 17:54 >>>> To:dev <dev@flink.apache.org> >>>> Subject:Re: [DISCUSS] FLIP-64: Support for Temporary Objects in Table >>>> module >>>> >>>> Hi all, >>>> >>>> Sorry to join this party late. Big +1 to this flip, especially for the >>>> dropping >>>> "registerTableSink & registerTableSource" part. These are indeed legacy >>>> and we should try to unify them through CatalogTable after we introduce >>>> the concept of Catalog. >>>> >>>> From my understanding, what we can registered should all be metadata, >>>> TableSource/TableSink should only be the one who is responsible to do >>>> the real work, i.e. reading and writing data according to the schema and >>>> other information like computed column, partition, .e.g. >>>> >>>> Best, >>>> Kurt >>>> >>>> >>>> On Wed, Sep 18, 2019 at 5:14 PM JingsongLee <lzljs3620...@aliyun.com >>>> .invalid> >>>> wrote: >>>> >>>>> After some development and thinking, I have a general understanding. >>>>> +1 to registering a source/sink does not fit into the SQL world. >>>>> I am OK to have a deprecated registerTemporarySource/Sink to compatible >>>>> with old ways. >>>>> >>>>> Best, >>>>> Jingsong Lee >>>>> >>>>> >>>>> ------------------------------------------------------------------ >>>>> From:Timo Walther <twal...@apache.org> >>>>> Send Time:2019年9月17日(星期二) 08:00 >>>>> To:dev <dev@flink.apache.org> >>>>> Subject:Re: [DISCUSS] FLIP-64: Support for Temporary Objects in Table >>>>> module >>>>> >>>>> Hi Dawid, >>>>> >>>>> thanks for the design document. It fixes big concept gaps due to >>>>> historical reasons with proper support for serializability and catalog >>>>> support in mind. >>>>> >>>>> I would not mind a registerTemporarySource/Sink, but the problem that I >>>>> see is that many people think that this is the recommended way of >>>>> registering a table source/sink which is not true. We should guide >> users >>>>> to either use connect() or DDL API which can be validated and stored in >>>>> catalog. >>>>> >>>>> Also from a concept perspective, registering a source/sink does not fit >>>>> into the SQL world. SQL does not know about source/sinks but only about >>>>> tables. If the responsibility of a TableSource/TableSink is just a pure >>>>> physical data consumer/producer that is not connected to the actual >>>>> logical table schema, we would need a possibility of defining time >>>>> attributes and interpreting/converting a changelog. This should be done >>>>> by the framework with information from the DDL/connect() and not be >>>>> defined in every table source. >>>>> >>>>> Regards, >>>>> Timo >>>>> >>>>> >>>>> On 09.09.19 14:16, JingsongLee wrote: >>>>>> Hi dawid: >>>>>> >>>>>> It is difficult to describe specific examples. >>>>>> Sometimes users will generate some java converters through some >>>>>> Java code, or generate some Java classes through third-party >>>>>> libraries. Of course, these can be best done through properties. >>>>>> But this requires additional work from users.My suggestion is to >>>>>> keep this Java instance class way that is user-friendly. >>>>>> >>>>>> Best, >>>>>> Jingsong Lee >>>>>> >>>>>> >>>>>> ------------------------------------------------------------------ >>>>>> From:Dawid Wysakowicz <dwysakow...@apache.org> >>>>>> Send Time:2019年9月6日(星期五) 16:21 >>>>>> To:dev <dev@flink.apache.org> >>>>>> Subject:Re: [DISCUSS] FLIP-64: Support for Temporary Objects in Table >>>>> module >>>>>> Hi all, >>>>>> @Jingsong Could you elaborate a bit more what do you mean by >>>>>> "some Connectors are difficult to convert all states to properties" >>>>>> All the Flink provided connectors will definitely be expressible with >>>>> properties (In the end you should be able to use them from DDL). I >> think >>>> if >>>>> a TableSource is complex enough that it handles filter push down, >>>> partition >>>>> support etc. should rather be made available both from DDL & java/scala >>>>> code. I'm happy to reconsider adding registerTemporaryTable(String >> path, >>>>> TableSource source) if you have some concrete examples in mind. >>>>>> >>>>>> @Xuefu: We also considered the ObjectIdentifier (or actually >>>> introducing >>>>> a new identifier representation to differentiate between resolved and >>>>> unresolved identifiers) with the same concerns. We decided to suggest >> the >>>>> string & parsing logic because of usability. >>>>>> tEnv.from("cat.db.table") >>>>>> is shorter and easier to write than >>>>>> tEnv.from(Identifier.for("cat", "db", "name") >>>>>> And also implicitly solves the problem what happens if a user (e.g. >>>> used >>>>> to other systems) uses that API in a following manner: >>>>>> tEnv.from(Identifier.for("db.name") >>>>>> I'm happy to revisit it if the general consensus is that it's better >> to >>>>> use the OO aproach. >>>>>> Best, >>>>>> Dawid >>>>>> >>>>>> On 06/09/2019 10:00, Xuefu Z wrote: >>>>>> >>>>>> Thanks to Dawid for starting the discussion and writeup. It looks >>>> pretty >>>>>> good to me except that I'm a little concerned about the object >>>> reference >>>>>> and string parsing in the code, which seems to an anti-pattern to OOP. >>>>> Have >>>>>> we considered using ObjectIdenitifier with optional catalog and db >>>> parts, >>>>>> esp. if we are worried about arguments of variable length or method >>>>>> overloading? It's quite likely that the result of string parsing is an >>>>>> ObjectIdentifier instance any way. >>>>>> >>>>>> Having string parsing logic in the code is a little dangerous as it >>>>>> duplicates part of the DDL/DML parsing, and they can easily get out of >>>>> sync. >>>>>> Thanks, >>>>>> Xuefu >>>>>> >>>>>> On Fri, Sep 6, 2019 at 1:57 PM JingsongLee <lzljs3620...@aliyun.com >>>>> .invalid> >>>>>> wrote: >>>>>> >>>>>> >>>>>> Thanks dawid, +1 for this approach. >>>>>> >>>>>> One concern is the removal of registerTableSink & registerTableSource >>>>>> in TableEnvironment. It has two alternatives: >>>>>> 1.the properties approach (DDL, descriptor). >>>>>> 2.from/toDataStream. >>>>>> >>>>>> #1 can only be properties, not java states, and some Connectors >>>>>> are difficult to convert all states to properties. >>>>>> #2 can contain java state. But can't use TableSource-related features, >>>>>> like project & filter push down, partition support, etc.. >>>>>> >>>>>> Any idea about this? >>>>>> >>>>>> Best, >>>>>> Jingsong Lee >>>>>> >>>>>> >>>>>> ------------------------------------------------------------------ >>>>>> From:Dawid Wysakowicz <dwysakow...@apache.org> >>>>>> Send Time:2019年9月4日(星期三) 22:20 >>>>>> To:dev <dev@flink.apache.org> >>>>>> Subject:[DISCUSS] FLIP-64: Support for Temporary Objects in Table >>>> module >>>>>> Hi all, >>>>>> As part of FLIP-30 a Catalog API was introduced that enables storing >>>>> table >>>>>> meta objects permanently. At the same time the majority of current >> APIs >>>>>> create temporary objects that cannot be serialized. We should clarify >>>> the >>>>>> creation of meta objects (tables, views, functions) in a unified way. >>>>>> Another current problem in the API is that all the temporary objects >>>> are >>>>>> stored in a special built-in catalog, which is not very intuitive for >>>>> many >>>>>> users, as they must be aware of that catalog to reference temporary >>>>> objects. >>>>>> Lastly, different APIs have different ways of providing object paths: >>>>>> >>>>>> String path…, >>>>>> String path, String pathContinued… >>>>>> String name >>>>>> We should choose one approach and unify it across all APIs. >>>>>> I suggest a FLIP to address the above issues. >>>>>> Looking forward to your opinions. >>>>>> FLIP link: >>>>>> >> https://cwiki.apache.org/confluence/display/FLINK/FLIP-64%3A+Support+for+Temporary+Objects+in+Table+module >>>> >>
signature.asc
Description: OpenPGP digital signature