Thanks for the comments.

@Timo I think your suggestion makes sense. I changed the signature to
void createTemporaryFunction(String path, Class<? extends UserDefinedFunction> 
functionClass);

This will mean that after we make QueryOperations string serializable
and we expose the type inference in functions we will have only a single
object that cannot be persisted in a catalog and thus have only a
"Temporary" method:

void createTemporaryView(String path, DataStream stream);

I updated the FLIP page accordingly. I also listed explicitly methods
that we need to deprecate and added a note that the implementation of
the methods concerning UserDefinedFunctions has to be postponed until we
have type inference in place.

As the voting period for FLIP-57 ended, I will start the VOTE thread for
FLIP-64 tomorrow morning, unless there are some objections until then.

Best,

Dawid


On 10/10/2019 16:16, Kurt Young wrote:
> Thanks for the clarification Timo, that's sounds good to me. Let's continue
> to discuss other things.
>
> Best,
> Kurt
>
>
> On Thu, Oct 10, 2019 at 4:14 PM Timo Walther <twal...@apache.org> wrote:
>
>> The purpose of ConnectTableDescriptor was to have a programmatic way of
>> expressing `CREATE TABLE` with JavaDocs and IDE support. But I agree
>> that it needs some rework in the future, once we have finalized the DDL
>> to ensure that both concepts are in sync.
>>
>> Regards,
>> Timo
>>
>>
>> On 10.10.19 16:08, Kurt Young wrote:
>>> Regarding to ConnectTableDescriptor, if in the end it becomes a builder
>>> of CatalogTable, I would be ok with that. But it doesn't look like a
>> builder
>>> now, it's more like another form of TableSource/TableSink.
>>>
>>> Best,
>>> Kurt
>>>
>>>
>>> On Thu, Oct 10, 2019 at 3:44 PM Timo Walther <twal...@apache.org> wrote:
>>>
>>>> Hi everyone,
>>>>
>>>> thanks for the great discussion with a good outcome.
>>>>
>>>> I have one last comment about:
>>>>
>>>> void createTemporaryFunction(String path, UserDefinedFunction function);
>>>>
>>>> We should encourage users to register a class instead of an instance. We
>>>> should enforce a default constructor in the future. If we need metadata
>>>> or type inference, the planner can simply instantiate the class and call
>>>> the corresponding getters. If people would like to parameterize a
>>>> function, they can use global job parameters instead - which are
>>>> available in the open() method.
>>>>
>>>> I suggest:
>>>>
>>>> void createTemporaryFunction(String path, Class<? extends
>>>> UserDefinedFunction> functionClass);
>>>>
>>>> This is also closer to the SQL DDL that is about to be implemented in:
>>>> https://issues.apache.org/jira/browse/FLINK-7151
>>>>
>>>> Thanks,
>>>> Timo
>>>>
>>>>
>>>> On 09.10.19 17:07, Bowen Li wrote:
>>>>> Hi Dawid,
>>>>>
>>>>> +1 for proposed changes
>>>>>
>>>>> On Wed, Oct 9, 2019 at 12:15 PM Dawid Wysakowicz <
>> dwysakow...@apache.org
>>>>> wrote:
>>>>>
>>>>>> Sorry for a very delayed response.
>>>>>>
>>>>>> @Kurt Yes, this is the goal to have a function created like new
>>>>>> Function(...) also be wrapped into CatalogFunction. This would have to
>>>>>> be though a temporary function as we cannot represent that as a set of
>>>>>> properties. Similar to the createTemporaryView(DataStream stream).
>>>>>>
>>>>>> As for the ConnectTableDescriptor I agree this is very similar to
>>>>>> CatalogTable. I am not sure though if we should get rid of it. In the
>>>>>> end I see it as a builder for a CatalogTable, which is a slightly more
>>>>>> internal API, but we might revisit that some time in the future if we
>>>>>> find that it makes more sense.
>>>>>>
>>>>>> @All I updated the FLIP page with some more details from the outcome
>> of
>>>>>> the discussions around FLIP-57. Please take a look. I would like to
>>>>>> start a vote on this FLIP as soon as the vote on FLIP-57 goes through.
>>>>>>
>>>>>> Best,
>>>>>>
>>>>>> Dawid
>>>>>>
>>>>>>
>>>>>> On 19/09/2019 09:24, Kurt Young wrote:
>>>>>>> IIUC it's good to see that both serializable (tables description from
>>>>>> DDL)
>>>>>>> and unserializable (tables with DataStream underneath) tables are
>>>> treated
>>>>>>> unify with CatalogTable.
>>>>>>>
>>>>>>> Can I also assume functions that either come from a function class
>>>> (from
>>>>>>> DDL)
>>>>>>> or function objects (newed by user) will also treated unify with
>>>>>>> CatalogFunction?
>>>>>>>
>>>>>>> This will greatly simplify and unify current API level concepts and
>>>>>> design.
>>>>>>> And it seems only one thing left, how do we deal with
>>>>>>> ConnectTableDescriptor?
>>>>>>> It's actually very similar with serializable CatalogTable, both carry
>>>>>> some
>>>>>>> text
>>>>>>> properties which even are the same. Is there any chance we can
>> further
>>>>>> unify
>>>>>>> this to CatalogTable?
>>>>>>>
>>>>>>> object
>>>>>>> Best,
>>>>>>> Kurt
>>>>>>>
>>>>>>>
>>>>>>> On Thu, Sep 19, 2019 at 3:13 PM Jark Wu <imj...@gmail.com> wrote:
>>>>>>>
>>>>>>>> Thanks Dawid for the design doc.
>>>>>>>>
>>>>>>>> In general, I’m +1 to the FLIP.
>>>>>>>>
>>>>>>>>
>>>>>>>> +1 to the single-string and parse way to express object path.
>>>>>>>>
>>>>>>>> +1 to deprecate registerTableSink & registerTableSource.
>>>>>>>> But I would suggest to provide an easy way to register a custom
>>>>>>>> source/sink before we drop them (this is another story).
>>>>>>>> Currently, it’s not easy to implement a custom connector descriptor.
>>>>>>>>
>>>>>>>> Best,
>>>>>>>> Jark
>>>>>>>>
>>>>>>>>
>>>>>>>>> 在 2019年9月19日,11:37,Dawid Wysakowicz <wysakowicz.da...@gmail.com>
>> 写道:
>>>>>>>>> Hi JingsongLee,
>>>>>>>>>   From my understanding they can. Underneath they will be
>>>> CatalogTables.
>>>>>>>> The
>>>>>>>>> difference is the lifetime of the tables. Plus some of the user
>>>> facing
>>>>>>>>> interfaces cannot be persisted e.g. datastream. Therefore we must
>>>> have
>>>>>> a
>>>>>>>>> separate methods for that. In the end the temporary tables are held
>>>> in
>>>>>>>>> memory as CatalogTables.
>>>>>>>>> Best,
>>>>>>>>> Dawid
>>>>>>>>>
>>>>>>>>> On Thu, 19 Sep 2019, 10:08 JingsongLee, <lzljs3620...@aliyun.com
>>>>>>>> .invalid>
>>>>>>>>> wrote:
>>>>>>>>>
>>>>>>>>>> Hi dawid:
>>>>>>>>>> Can temporary tables achieve the same capabilities as catalog
>> table?
>>>>>>>>>> like statistics: CatalogTableStatistics, CatalogColumnStatistics,
>>>>>>>>>> PartitionStatistics
>>>>>>>>>> like partition support: we have added some catalog equivalent
>>>>>> interfaces
>>>>>>>>>> on TableSource/TableSink: getPartitions, getPartitionFieldNames
>>>>>>>>>> Maybe it's not a good idea to add these interfaces to
>>>>>>>>>> TableSource/TableSink. What do you think?
>>>>>>>>>>
>>>>>>>>>> Best,
>>>>>>>>>> Jingsong Lee
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> ------------------------------------------------------------------
>>>>>>>>>> From:Kurt Young <ykt...@gmail.com>
>>>>>>>>>> Send Time:2019年9月18日(星期三) 17:54
>>>>>>>>>> To:dev <dev@flink.apache.org>
>>>>>>>>>> Subject:Re: [DISCUSS] FLIP-64: Support for Temporary Objects in
>>>> Table
>>>>>>>>>> module
>>>>>>>>>>
>>>>>>>>>> Hi all,
>>>>>>>>>>
>>>>>>>>>> Sorry to join this party late. Big +1 to this flip, especially for
>>>> the
>>>>>>>>>> dropping
>>>>>>>>>> "registerTableSink & registerTableSource" part. These are indeed
>>>>>> legacy
>>>>>>>>>> and we should try to unify them through CatalogTable after we
>>>>>> introduce
>>>>>>>>>> the concept of Catalog.
>>>>>>>>>>
>>>>>>>>>>   From my understanding, what we can registered should all be
>>>> metadata,
>>>>>>>>>> TableSource/TableSink should only be the one who is responsible to
>>>> do
>>>>>>>>>> the real work, i.e. reading and writing data according to the
>> schema
>>>>>> and
>>>>>>>>>> other information like computed column, partition, .e.g.
>>>>>>>>>>
>>>>>>>>>> Best,
>>>>>>>>>> Kurt
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> On Wed, Sep 18, 2019 at 5:14 PM JingsongLee <
>>>> lzljs3620...@aliyun.com
>>>>>>>>>> .invalid>
>>>>>>>>>> wrote:
>>>>>>>>>>
>>>>>>>>>>> After some development and thinking, I have a general
>>>> understanding.
>>>>>>>>>>> +1 to registering a source/sink does not fit into the SQL world.
>>>>>>>>>>> I am OK to have a deprecated registerTemporarySource/Sink to
>>>>>> compatible
>>>>>>>>>>> with old ways.
>>>>>>>>>>>
>>>>>>>>>>> Best,
>>>>>>>>>>> Jingsong Lee
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>>
>> ------------------------------------------------------------------
>>>>>>>>>>> From:Timo Walther <twal...@apache.org>
>>>>>>>>>>> Send Time:2019年9月17日(星期二) 08:00
>>>>>>>>>>> To:dev <dev@flink.apache.org>
>>>>>>>>>>> Subject:Re: [DISCUSS] FLIP-64: Support for Temporary Objects in
>>>> Table
>>>>>>>>>>> module
>>>>>>>>>>>
>>>>>>>>>>> Hi Dawid,
>>>>>>>>>>>
>>>>>>>>>>> thanks for the design document. It fixes big concept gaps due to
>>>>>>>>>>> historical reasons with proper support for serializability and
>>>>>> catalog
>>>>>>>>>>> support in mind.
>>>>>>>>>>>
>>>>>>>>>>> I would not mind a registerTemporarySource/Sink, but the problem
>>>>>> that I
>>>>>>>>>>> see is that many people think that this is the recommended way of
>>>>>>>>>>> registering a table source/sink which is not true. We should
>> guide
>>>>>>>> users
>>>>>>>>>>> to either use connect() or DDL API which can be validated and
>>>> stored
>>>>>> in
>>>>>>>>>>> catalog.
>>>>>>>>>>>
>>>>>>>>>>> Also from a concept perspective, registering a source/sink does
>> not
>>>>>> fit
>>>>>>>>>>> into the SQL world. SQL does not know about source/sinks but only
>>>>>> about
>>>>>>>>>>> tables. If the responsibility of a TableSource/TableSink is just
>> a
>>>>>> pure
>>>>>>>>>>> physical data consumer/producer that is not connected to the
>> actual
>>>>>>>>>>> logical table schema, we would need a possibility of defining
>> time
>>>>>>>>>>> attributes and interpreting/converting a changelog. This should
>> be
>>>>>> done
>>>>>>>>>>> by the framework with information from the DDL/connect() and not
>> be
>>>>>>>>>>> defined in every table source.
>>>>>>>>>>>
>>>>>>>>>>> Regards,
>>>>>>>>>>> Timo
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> On 09.09.19 14:16, JingsongLee wrote:
>>>>>>>>>>>> Hi dawid:
>>>>>>>>>>>>
>>>>>>>>>>>> It is difficult to describe specific examples.
>>>>>>>>>>>> Sometimes users will generate some java converters through some
>>>>>>>>>>>>    Java code, or generate some Java classes through third-party
>>>>>>>>>>>>    libraries. Of course, these can be best done through
>> properties.
>>>>>>>>>>>> But this requires additional work from users.My suggestion is to
>>>>>>>>>>>>    keep this Java instance class way that is user-friendly.
>>>>>>>>>>>>
>>>>>>>>>>>> Best,
>>>>>>>>>>>> Jingsong Lee
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>>
>> ------------------------------------------------------------------
>>>>>>>>>>>> From:Dawid Wysakowicz <dwysakow...@apache.org>
>>>>>>>>>>>> Send Time:2019年9月6日(星期五) 16:21
>>>>>>>>>>>> To:dev <dev@flink.apache.org>
>>>>>>>>>>>> Subject:Re: [DISCUSS] FLIP-64: Support for Temporary Objects in
>>>>>> Table
>>>>>>>>>>> module
>>>>>>>>>>>> Hi all,
>>>>>>>>>>>> @Jingsong Could you elaborate a bit more what do you mean by
>>>>>>>>>>>> "some Connectors are difficult to convert all states to
>>>> properties"
>>>>>>>>>>>> All the Flink provided connectors will definitely be expressible
>>>>>> with
>>>>>>>>>>> properties (In the end you should be able to use them from DDL).
>> I
>>>>>>>> think
>>>>>>>>>> if
>>>>>>>>>>> a TableSource is complex enough that it handles filter push down,
>>>>>>>>>> partition
>>>>>>>>>>> support etc. should rather be made available both from DDL &
>>>>>> java/scala
>>>>>>>>>>> code. I'm happy to reconsider adding
>> registerTemporaryTable(String
>>>>>>>> path,
>>>>>>>>>>> TableSource source) if you have some concrete examples in mind.
>>>>>>>>>>>> @Xuefu: We also considered the ObjectIdentifier (or actually
>>>>>>>>>> introducing
>>>>>>>>>>> a new identifier representation to differentiate between resolved
>>>> and
>>>>>>>>>>> unresolved identifiers) with the same concerns. We decided to
>>>> suggest
>>>>>>>> the
>>>>>>>>>>> string & parsing logic because of usability.
>>>>>>>>>>>>       tEnv.from("cat.db.table")
>>>>>>>>>>>> is shorter and easier to write than
>>>>>>>>>>>>       tEnv.from(Identifier.for("cat", "db", "name")
>>>>>>>>>>>> And also implicitly solves the problem what happens if a user
>>>> (e.g.
>>>>>>>>>> used
>>>>>>>>>>> to other systems) uses that API in a following manner:
>>>>>>>>>>>>       tEnv.from(Identifier.for("db.name")
>>>>>>>>>>>> I'm happy to revisit it if the general consensus is that it's
>>>> better
>>>>>>>> to
>>>>>>>>>>> use the OO aproach.
>>>>>>>>>>>> Best,
>>>>>>>>>>>> Dawid
>>>>>>>>>>>>
>>>>>>>>>>>> On 06/09/2019 10:00, Xuefu Z wrote:
>>>>>>>>>>>>
>>>>>>>>>>>> Thanks to Dawid for starting the discussion and writeup. It
>> looks
>>>>>>>>>> pretty
>>>>>>>>>>>> good to me except that I'm a little concerned about the object
>>>>>>>>>> reference
>>>>>>>>>>>> and string parsing in the code, which seems to an anti-pattern
>> to
>>>>>> OOP.
>>>>>>>>>>> Have
>>>>>>>>>>>> we considered using ObjectIdenitifier with optional catalog and
>> db
>>>>>>>>>> parts,
>>>>>>>>>>>> esp. if we are worried about arguments of variable length or
>>>> method
>>>>>>>>>>>> overloading? It's quite likely that the result of string parsing
>>>> is
>>>>>> an
>>>>>>>>>>>> ObjectIdentifier instance any way.
>>>>>>>>>>>>
>>>>>>>>>>>> Having string parsing logic in the code is a little dangerous as
>>>> it
>>>>>>>>>>>> duplicates part of the DDL/DML parsing, and they can easily get
>>>> out
>>>>>> of
>>>>>>>>>>> sync.
>>>>>>>>>>>> Thanks,
>>>>>>>>>>>> Xuefu
>>>>>>>>>>>>
>>>>>>>>>>>> On Fri, Sep 6, 2019 at 1:57 PM JingsongLee <
>>>> lzljs3620...@aliyun.com
>>>>>>>>>>> .invalid>
>>>>>>>>>>>> wrote:
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>> Thanks dawid, +1 for this approach.
>>>>>>>>>>>>
>>>>>>>>>>>> One concern is the removal of registerTableSink &
>>>>>> registerTableSource
>>>>>>>>>>>>    in TableEnvironment. It has two alternatives:
>>>>>>>>>>>> 1.the properties approach (DDL, descriptor).
>>>>>>>>>>>> 2.from/toDataStream.
>>>>>>>>>>>>
>>>>>>>>>>>> #1 can only be properties, not java states, and some Connectors
>>>>>>>>>>>>    are difficult to convert all states to properties.
>>>>>>>>>>>> #2 can contain java state. But can't use TableSource-related
>>>>>> features,
>>>>>>>>>>>> like project & filter push down, partition support, etc..
>>>>>>>>>>>>
>>>>>>>>>>>> Any idea about this?
>>>>>>>>>>>>
>>>>>>>>>>>> Best,
>>>>>>>>>>>> Jingsong Lee
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>>
>> ------------------------------------------------------------------
>>>>>>>>>>>> From:Dawid Wysakowicz <dwysakow...@apache.org>
>>>>>>>>>>>> Send Time:2019年9月4日(星期三) 22:20
>>>>>>>>>>>> To:dev <dev@flink.apache.org>
>>>>>>>>>>>> Subject:[DISCUSS] FLIP-64: Support for Temporary Objects in
>> Table
>>>>>>>>>> module
>>>>>>>>>>>> Hi all,
>>>>>>>>>>>> As part of FLIP-30 a Catalog API was introduced that enables
>>>> storing
>>>>>>>>>>> table
>>>>>>>>>>>> meta objects permanently. At the same time the majority of
>> current
>>>>>>>> APIs
>>>>>>>>>>>> create temporary objects that cannot be serialized. We should
>>>>>> clarify
>>>>>>>>>> the
>>>>>>>>>>>> creation of meta objects (tables, views, functions) in a unified
>>>>>> way.
>>>>>>>>>>>> Another current problem in the API is that all the temporary
>>>> objects
>>>>>>>>>> are
>>>>>>>>>>>> stored in a special built-in catalog, which is not very
>> intuitive
>>>>>> for
>>>>>>>>>>> many
>>>>>>>>>>>> users, as they must be aware of that catalog to reference
>>>> temporary
>>>>>>>>>>> objects.
>>>>>>>>>>>> Lastly, different APIs have different ways of providing object
>>>>>> paths:
>>>>>>>>>>>> String path…,
>>>>>>>>>>>> String path, String pathContinued…
>>>>>>>>>>>> String name
>>>>>>>>>>>> We should choose one approach and unify it across all APIs.
>>>>>>>>>>>> I suggest a FLIP to address the above issues.
>>>>>>>>>>>> Looking forward to your opinions.
>>>>>>>>>>>> FLIP link:
>>>>>>>>>>>>
>> https://cwiki.apache.org/confluence/display/FLINK/FLIP-64%3A+Support+for+Temporary+Objects+in+Table+module
>>

Attachment: signature.asc
Description: OpenPGP digital signature

Reply via email to