Sorry for a very delayed response.

@Kurt Yes, this is the goal to have a function created like new
Function(...) also be wrapped into CatalogFunction. This would have to
be though a temporary function as we cannot represent that as a set of
properties. Similar to the createTemporaryView(DataStream stream).

As for the ConnectTableDescriptor I agree this is very similar to
CatalogTable. I am not sure though if we should get rid of it. In the
end I see it as a builder for a CatalogTable, which is a slightly more
internal API, but we might revisit that some time in the future if we
find that it makes more sense.

@All I updated the FLIP page with some more details from the outcome of
the discussions around FLIP-57. Please take a look. I would like to
start a vote on this FLIP as soon as the vote on FLIP-57 goes through.

Best,

Dawid


On 19/09/2019 09:24, Kurt Young wrote:
> IIUC it's good to see that both serializable (tables description from DDL)
> and unserializable (tables with DataStream underneath) tables are treated
> unify with CatalogTable.
>
> Can I also assume functions that either come from a function class (from
> DDL)
> or function objects (newed by user) will also treated unify with
> CatalogFunction?
>
> This will greatly simplify and unify current API level concepts and design.
>
> And it seems only one thing left, how do we deal with
> ConnectTableDescriptor?
> It's actually very similar with serializable CatalogTable, both carry some
> text
> properties which even are the same. Is there any chance we can further unify
> this to CatalogTable?
>
> object
> Best,
> Kurt
>
>
> On Thu, Sep 19, 2019 at 3:13 PM Jark Wu <imj...@gmail.com> wrote:
>
>> Thanks Dawid for the design doc.
>>
>> In general, I’m +1 to the FLIP.
>>
>>
>> +1 to the single-string and parse way to express object path.
>>
>> +1 to deprecate registerTableSink & registerTableSource.
>> But I would suggest to provide an easy way to register a custom
>> source/sink before we drop them (this is another story).
>> Currently, it’s not easy to implement a custom connector descriptor.
>>
>> Best,
>> Jark
>>
>>
>>> 在 2019年9月19日,11:37,Dawid Wysakowicz <wysakowicz.da...@gmail.com> 写道:
>>>
>>> Hi JingsongLee,
>>> From my understanding they can. Underneath they will be CatalogTables.
>> The
>>> difference is the lifetime of the tables. Plus some of the user facing
>>> interfaces cannot be persisted e.g. datastream. Therefore we must have a
>>> separate methods for that. In the end the temporary tables are held in
>>> memory as CatalogTables.
>>> Best,
>>> Dawid
>>>
>>> On Thu, 19 Sep 2019, 10:08 JingsongLee, <lzljs3620...@aliyun.com
>> .invalid>
>>> wrote:
>>>
>>>> Hi dawid:
>>>> Can temporary tables achieve the same capabilities as catalog table?
>>>> like statistics: CatalogTableStatistics, CatalogColumnStatistics,
>>>> PartitionStatistics
>>>> like partition support: we have added some catalog equivalent interfaces
>>>> on TableSource/TableSink: getPartitions, getPartitionFieldNames
>>>> Maybe it's not a good idea to add these interfaces to
>>>> TableSource/TableSink. What do you think?
>>>>
>>>> Best,
>>>> Jingsong Lee
>>>>
>>>>
>>>> ------------------------------------------------------------------
>>>> From:Kurt Young <ykt...@gmail.com>
>>>> Send Time:2019年9月18日(星期三) 17:54
>>>> To:dev <dev@flink.apache.org>
>>>> Subject:Re: [DISCUSS] FLIP-64: Support for Temporary Objects in Table
>>>> module
>>>>
>>>> Hi all,
>>>>
>>>> Sorry to join this party late. Big +1 to this flip, especially for the
>>>> dropping
>>>> "registerTableSink & registerTableSource" part. These are indeed legacy
>>>> and we should try to unify them through CatalogTable after we introduce
>>>> the concept of Catalog.
>>>>
>>>> From my understanding, what we can registered should all be metadata,
>>>> TableSource/TableSink should only be the one who is responsible to do
>>>> the real work, i.e. reading and writing data according to the schema and
>>>> other information like computed column, partition, .e.g.
>>>>
>>>> Best,
>>>> Kurt
>>>>
>>>>
>>>> On Wed, Sep 18, 2019 at 5:14 PM JingsongLee <lzljs3620...@aliyun.com
>>>> .invalid>
>>>> wrote:
>>>>
>>>>> After some development and thinking, I have a general understanding.
>>>>> +1 to registering a source/sink does not fit into the SQL world.
>>>>> I am OK to have a deprecated registerTemporarySource/Sink to compatible
>>>>> with old ways.
>>>>>
>>>>> Best,
>>>>> Jingsong Lee
>>>>>
>>>>>
>>>>> ------------------------------------------------------------------
>>>>> From:Timo Walther <twal...@apache.org>
>>>>> Send Time:2019年9月17日(星期二) 08:00
>>>>> To:dev <dev@flink.apache.org>
>>>>> Subject:Re: [DISCUSS] FLIP-64: Support for Temporary Objects in Table
>>>>> module
>>>>>
>>>>> Hi Dawid,
>>>>>
>>>>> thanks for the design document. It fixes big concept gaps due to
>>>>> historical reasons with proper support for serializability and catalog
>>>>> support in mind.
>>>>>
>>>>> I would not mind a registerTemporarySource/Sink, but the problem that I
>>>>> see is that many people think that this is the recommended way of
>>>>> registering a table source/sink which is not true. We should guide
>> users
>>>>> to either use connect() or DDL API which can be validated and stored in
>>>>> catalog.
>>>>>
>>>>> Also from a concept perspective, registering a source/sink does not fit
>>>>> into the SQL world. SQL does not know about source/sinks but only about
>>>>> tables. If the responsibility of a TableSource/TableSink is just a pure
>>>>> physical data consumer/producer that is not connected to the actual
>>>>> logical table schema, we would need a possibility of defining time
>>>>> attributes and interpreting/converting a changelog. This should be done
>>>>> by the framework with information from the DDL/connect() and not be
>>>>> defined in every table source.
>>>>>
>>>>> Regards,
>>>>> Timo
>>>>>
>>>>>
>>>>> On 09.09.19 14:16, JingsongLee wrote:
>>>>>> Hi dawid:
>>>>>>
>>>>>> It is difficult to describe specific examples.
>>>>>> Sometimes users will generate some java converters through some
>>>>>>  Java code, or generate some Java classes through third-party
>>>>>>  libraries. Of course, these can be best done through properties.
>>>>>> But this requires additional work from users.My suggestion is to
>>>>>>  keep this Java instance class way that is user-friendly.
>>>>>>
>>>>>> Best,
>>>>>> Jingsong Lee
>>>>>>
>>>>>>
>>>>>> ------------------------------------------------------------------
>>>>>> From:Dawid Wysakowicz <dwysakow...@apache.org>
>>>>>> Send Time:2019年9月6日(星期五) 16:21
>>>>>> To:dev <dev@flink.apache.org>
>>>>>> Subject:Re: [DISCUSS] FLIP-64: Support for Temporary Objects in Table
>>>>> module
>>>>>> Hi all,
>>>>>> @Jingsong Could you elaborate a bit more what do you mean by
>>>>>> "some Connectors are difficult to convert all states to properties"
>>>>>> All the Flink provided connectors will definitely be expressible with
>>>>> properties (In the end you should be able to use them from DDL). I
>> think
>>>> if
>>>>> a TableSource is complex enough that it handles filter push down,
>>>> partition
>>>>> support etc. should rather be made available both from DDL & java/scala
>>>>> code. I'm happy to reconsider adding registerTemporaryTable(String
>> path,
>>>>> TableSource source) if you have some concrete examples in mind.
>>>>>>
>>>>>> @Xuefu: We also considered the ObjectIdentifier (or actually
>>>> introducing
>>>>> a new identifier representation to differentiate between resolved and
>>>>> unresolved identifiers) with the same concerns. We decided to suggest
>> the
>>>>> string & parsing logic because of usability.
>>>>>>     tEnv.from("cat.db.table")
>>>>>> is shorter and easier to write than
>>>>>>     tEnv.from(Identifier.for("cat", "db", "name")
>>>>>> And also implicitly solves the problem what happens if a user (e.g.
>>>> used
>>>>> to other systems) uses that API in a following manner:
>>>>>>     tEnv.from(Identifier.for("db.name")
>>>>>> I'm happy to revisit it if the general consensus is that it's better
>> to
>>>>> use the OO aproach.
>>>>>> Best,
>>>>>> Dawid
>>>>>>
>>>>>> On 06/09/2019 10:00, Xuefu Z wrote:
>>>>>>
>>>>>> Thanks to Dawid for starting the discussion and writeup. It looks
>>>> pretty
>>>>>> good to me except that I'm a little concerned about the object
>>>> reference
>>>>>> and string parsing in the code, which seems to an anti-pattern to OOP.
>>>>> Have
>>>>>> we considered using ObjectIdenitifier with optional catalog and db
>>>> parts,
>>>>>> esp. if we are worried about arguments of variable length or method
>>>>>> overloading? It's quite likely that the result of string parsing is an
>>>>>> ObjectIdentifier instance any way.
>>>>>>
>>>>>> Having string parsing logic in the code is a little dangerous as it
>>>>>> duplicates part of the DDL/DML parsing, and they can easily get out of
>>>>> sync.
>>>>>> Thanks,
>>>>>> Xuefu
>>>>>>
>>>>>> On Fri, Sep 6, 2019 at 1:57 PM JingsongLee <lzljs3620...@aliyun.com
>>>>> .invalid>
>>>>>> wrote:
>>>>>>
>>>>>>
>>>>>> Thanks dawid, +1 for this approach.
>>>>>>
>>>>>> One concern is the removal of registerTableSink & registerTableSource
>>>>>>  in TableEnvironment. It has two alternatives:
>>>>>> 1.the properties approach (DDL, descriptor).
>>>>>> 2.from/toDataStream.
>>>>>>
>>>>>> #1 can only be properties, not java states, and some Connectors
>>>>>>  are difficult to convert all states to properties.
>>>>>> #2 can contain java state. But can't use TableSource-related features,
>>>>>> like project & filter push down, partition support, etc..
>>>>>>
>>>>>> Any idea about this?
>>>>>>
>>>>>> Best,
>>>>>> Jingsong Lee
>>>>>>
>>>>>>
>>>>>> ------------------------------------------------------------------
>>>>>> From:Dawid Wysakowicz <dwysakow...@apache.org>
>>>>>> Send Time:2019年9月4日(星期三) 22:20
>>>>>> To:dev <dev@flink.apache.org>
>>>>>> Subject:[DISCUSS] FLIP-64: Support for Temporary Objects in Table
>>>> module
>>>>>> Hi all,
>>>>>> As part of FLIP-30 a Catalog API was introduced that enables storing
>>>>> table
>>>>>> meta objects permanently. At the same time the majority of current
>> APIs
>>>>>> create temporary objects that cannot be serialized. We should clarify
>>>> the
>>>>>> creation of meta objects (tables, views, functions) in a unified way.
>>>>>> Another current problem in the API is that all the temporary objects
>>>> are
>>>>>> stored in a special built-in catalog, which is not very intuitive for
>>>>> many
>>>>>> users, as they must be aware of that catalog to reference temporary
>>>>> objects.
>>>>>> Lastly, different APIs have different ways of providing object paths:
>>>>>>
>>>>>> String path…,
>>>>>> String path, String pathContinued…
>>>>>> String name
>>>>>> We should choose one approach and unify it across all APIs.
>>>>>> I suggest a FLIP to address the above issues.
>>>>>> Looking forward to your opinions.
>>>>>> FLIP link:
>>>>>>
>> https://cwiki.apache.org/confluence/display/FLINK/FLIP-64%3A+Support+for+Temporary+Objects+in+Table+module
>>>>
>>

Attachment: signature.asc
Description: OpenPGP digital signature

Reply via email to