Hi,

I see there was quite some discussion and changes on the FLIP after this VOTE 
was started. I would suggest to start a new voting thread on the current state 
of the FLIP (keeping in mind that a FLIP vote needs at least three 
committer/PMC votes).

For the future, we should probably keep discussion to the [DISCUSS] thread and 
use the vote thread only for voting.

Best,
Aljoscha

> On 3. Oct 2019, at 21:17, Bowen Li <bowenl...@gmail.com> wrote:
> 
> I'm glad to announce that the community has accepted the design of FLIP-57,
> and we are moving forward to implementing it.
> 
> Thanks everyone!
> 
> On Wed, Oct 2, 2019 at 11:01 AM Bowen Li <bowenl...@gmail.com> wrote:
> 
>> Introducing a new term "path" to APIs like
>> "getShortPath(Identifier)/getLongPath(Identifier)" would be confusing to
>> users, thus I feel "getSimpleName/getIdentifier" is fine.
>> 
>> To summarize the discussion result.
>> 
>>   - categorize functions into 2 dimensions - system v.s. catalog,
>>   non-temp v.s. temp - and that give us 4 combinations
>>   - definition of FunctionIdentifier
>> 
>>                     @PublicEvolving
>> 
>> Class FunctionIdentifier {
>> 
>>    String name;
>> 
>>    ObjectIdentifier oi;
>> 
>>        // for temporary/non-temporary system function
>>    public FunctionIdentifier of(String name) {  }
>>    // for temporary/non-temporary catalog function
>>    public FunctionIdentifier of(ObjectIdentifier identifier) {  }
>> 
>> 
>>    Optional<ObjectIdentifier> getIdentifier() {}
>> 
>>    Optional<String> getSimpleName() {}
>> 
>> }
>> 
>> 
>> I've updated them to FLIP wiki. Please take a final look. I'll close the
>> voting if there's no other concern raised within 24 hours.
>> 
>> Cheers
>> 
>> On Wed, Oct 2, 2019 at 4:54 AM Dawid Wysakowicz <dwysakow...@apache.org>
>> wrote:
>> 
>>> Hi,
>>> 
>>> I very much agree with Xuefu's summary of the two points, especially on
>>> the "functionIdentifier doesn't need to reflect the categories".
>>> 
>>> For the factory methods I think methods of should be enough:
>>> 
>>>  // for temporary/non-temporary system function
>>>    public FunctionIdentifier of(String name) {  }
>>>  // for temporary/non-temporary catalog function
>>>    public FunctionIdentifier of(ObjectIdentifier identifier){  }
>>> 
>>> In case of the getters I did not like the method name `getName` in the
>>> original proposal, as in my opinion it could imply that it can return
>>> also just the name part of an ObjectIdentifier, which should not be the
>>> case.
>>> 
>>> I'm fine with getSimpleName/getIdentifier, but want to throw in few
>>> other suggestions:
>>> 
>>> * getShortPath(Identifier)/getLongPath(Identifier),
>>> 
>>> * getSystemPath(Identifier)/getCatalogPath(Identifier)
>>> 
>>> +1 to any of the 3 options.
>>> 
>>> One additional thing the FunctionIdentifier should be a PublicEvolving
>>> class, as it is part of a PublicEvolving APIs e.g. CallExpression, which
>>> user might need to access e.g. in a filter pushdown.
>>> 
>>> I also support the Xuefu's suggestion not to support the "ALL" keyword
>>> in the "SHOW [TEMPORARY] FUNCTIONS" statement, but as the exact design
>>> of it  is not part of the FLIP-57, we do not need to agree on that in
>>> this thread.
>>> 
>>> Overall I think after updating the FLIP with the outcome of the
>>> discussion I vote +1 for it.
>>> 
>>> Best,
>>> 
>>> Dawid
>>> 
>>> 
>>> On 02/10/2019 00:21, Xuefu Z wrote:
>>>> Here are some of my thoughts on the minor debates above:
>>>> 
>>>> 1. +1 for 4 categories of functions. They are categorized along two
>>>> dimensions of binary values: X: *temporary* vs non-temporary
>>> (persistent);
>>>> Y: *system* vs non-system (so said catalog).
>>>> 2. In my opinion, class functionIdentifier doesn't really need to
>>> reflect
>>>> the categories of the functions. Instead, we should decouple them to
>>> make
>>>> the API more stable. Thus, my suggestion is:
>>>> 
>>>> @Internal
>>>> class FunctionIdentifier {
>>>>  // for temporary/non-temporary system function
>>>>    public FunctionIdentifier ofSimpleName(String name) {  }
>>>>  // for temporary/non-temporary catalog function
>>>>    public FunctionIdentifier ofIdentifier(ObjectIdentifier
>>>> identifier){  }
>>>>    public Optional<String> getSimpleName() {  }
>>>>    public Optional<ObjectIdentifier> getIdentifier() {  }
>>>> }
>>>> 3. DDLs -- I don't think we need "ALL" keyword. The grammar can just be:
>>>> 
>>>> SHOW [TEMPORARY] [SYSTEM] FUNCTIONS.
>>>> 
>>>> When either keyword is missing, "ALL" is implied along that dimension.
>>> We
>>>> should always limit the search to the system function catalog and the
>>>> current catalog/DB. I don't see a need of listing functions across
>>>> different catalogs and databases. (It can be added later if that
>>> arises.)
>>>> 
>>>> Thanks,
>>>> Xuefu
>>>> 
>>>> On Tue, Oct 1, 2019 at 11:12 AM Bowen Li <bowenl...@gmail.com> wrote:
>>>> 
>>>>> Hi Dawid,
>>>>> 
>>>>> Thanks for bringing the suggestions up. I was prototyping yesterday and
>>>>> found out those places exactly as what you suggested.
>>>>> 
>>>>> For CallExpression and UnresolvedCallExpression, I've added them to
>>>>> FLIP-57. We will replace ObjectIdentifier with FunctionIdentifier and
>>> mark
>>>>> that as a breaking change
>>>>> 
>>>>> For FunctionIdentifier, the suggested changes LGTM. Just want to bring
>>> up
>>>>> an issue on naming. It seems to me how we now name functions
>>> categories is
>>>>> a bit unclear and confusing, which is reflected on the suggested APIs
>>> - in
>>>>> FunctionIdentifier you lay out, "builtin function" would include
>>> builtin
>>>>> functions and temporary system functions as we are kind of using
>>> "system"
>>>>> and "built-in" interchangeably, and "catalog function" would include
>>>>> catalog functions and temporary functions. I currently can see two
>>>>> approaches to make it clearer to users.
>>>>> 
>>>>> 1) Simplify FunctionIdentifier to be the following. As it's internal,
>>> we
>>>>> add comments and explanation to devs on which case the APIs support.
>>>>> However, I feel this approach would be somehow a bit conflicting to
>>> what
>>>>> you want to achieve for the clarity of APIs
>>>>> 
>>>>> @Internal
>>>>> class FunctionIdentifier {
>>>>>  // for built-in function and temporary system function
>>>>>    public FunctionIdentifier of(String name) {  }
>>>>>  // for temporary function and catalog function
>>>>>    public FunctionIdentifier of(ObjectIdentifier identifier){  }
>>>>>    public Optional<String> getFunctionName() {  }
>>>>>    public Optional<ObjectIdentifier> getObjectIdentifier() {  }
>>>>> }
>>>>> 
>>>>> 2) We can rename our function categories as following so there'll be
>>> mainly
>>>>> just two categories of functions, "system functions" and "catalog
>>>>> functions", either of which can have temporary ones
>>>>> 
>>>>>  - built-in functions -> officially rename to "system functions" and
>>> note
>>>>> to users that "system" and "built-in" can be used interchangeably. We
>>>>> prefer "system" because that's the keyword we decided to use in DDL
>>> that
>>>>> creates its temporary peers ("CREATE TEMPORARY SYSTEM FUNCTION")
>>>>>  - temporary system functions
>>>>>  - catalog functions
>>>>>  - temporary functions  -> rename to "temporary catalog function"
>>>>> 
>>>>> @Internal
>>>>> class FunctionIdentifier {
>>>>>  // for temporary/non-temporary system function
>>>>>    public FunctionIdentifier ofSystemFunction(String name) {  }
>>>>>  // for temporary/non-temporary catalog function
>>>>>    public FunctionIdentifier ofCatalogFunction(ObjectIdentifier
>>>>> identifier){  }
>>>>>    public Optional<String> getSystemFunctionName() {  }
>>>>>    public Optional<ObjectIdentifier> getCatalogFunctionIdentifier()
>>> {  }
>>>>> }
>>>>> 
>>>>> WDYT?
>>>>> 
>>>>> 
>>>>> On Tue, Oct 1, 2019 at 5:48 AM Fabian Hueske <fhue...@gmail.com>
>>> wrote:
>>>>> 
>>>>>> Thanks for the summary Bowen!
>>>>>> 
>>>>>> Looks good to me.
>>>>>> 
>>>>>> Cheers,
>>>>>> Fabian
>>>>>> 
>>>>>> Am Mo., 30. Sept. 2019 um 23:24 Uhr schrieb Bowen Li <
>>>>> bowenl...@gmail.com
>>>>>>> :
>>>>>>> Hi all,
>>>>>>> 
>>>>>>> I've updated the FLIP wiki with the following changes:
>>>>>>> 
>>>>>>> - Lifespan of temp functions are not tied to those of catalogs and
>>>>>>> databases. Users can create temp functions even though catalogs/dbs
>>> in
>>>>>>> their fully qualified names don't even exist.
>>>>>>> - some new SQL commands
>>>>>>>    - "SHOW FUNCTIONS" - list names of temp and non-temp
>>>>> system/built-in
>>>>>>> functions, and names of temp and catalog functions in the current
>>>>> catalog
>>>>>>> and db
>>>>>>>    - "SHOW ALL FUNCTIONS" - list names of temp and non-temp
>>>>> system/built
>>>>>>> functions, and fully qualified names of temp and catalog functions in
>>>>> all
>>>>>>> catalogs and dbs
>>>>>>>    - "SHOW ALL TEMPORARY FUNCTIONS" - list fully qualified names of
>>>>> temp
>>>>>>> functions in all catalog and db
>>>>>>>    - "SHOW ALL TEMPORARY SYSTEM FUNCTIONS" - list names of all temp
>>>>>> system
>>>>>>> functions
>>>>>>> 
>>>>>>> Let me know if you have any questions.
>>>>>>> 
>>>>>>> Seems we have resolved all concerns. If there's no more ones, I'd
>>> like
>>>>> to
>>>>>>> close the vote by this time tomorrow.
>>>>>>> 
>>>>>>> Cheers,
>>>>>>> Bowen
>>>>>>> 
>>>>>>> On Mon, Sep 30, 2019 at 11:59 AM Bowen Li <bowenl...@gmail.com>
>>> wrote:
>>>>>>> 
>>>>>>>> Hi,
>>>>>>>> 
>>>>>>>> I think above are some valid points, and we can adopt the
>>>>> suggestions.
>>>>>>>> To elaborate a bit on the new SQL syntax, it would imply that,
>>> unlike
>>>>>>>> "SHOW FUNCTION" which only return function names, "SHOW ALL
>>>>> [TEMPORARY]
>>>>>>>> FUNCTIONS" would return functions' fully qualified names with
>>> catalog
>>>>>> and
>>>>>>>> db names.
>>>>>>>> 
>>>>>>>> 
>>>>>>>> 
>>>>>>>> On Mon, Sep 30, 2019 at 6:38 AM Timo Walther <twal...@apache.org>
>>>>>> wrote:
>>>>>>>>> Hi all,
>>>>>>>>> 
>>>>>>>>> I support Fabian's arguments. In my opinion, temporary objects
>>>>> should
>>>>>>>>> just be an additional layer on top of the regular catalog/database
>>>>>>>>> lookup logic. Thus, a temporary table or function has always
>>> highest
>>>>>>>>> precedence and should be stable within the local session. Otherwise
>>>>> it
>>>>>>>>> could magically disappear while someone else is performing
>>>>>> modifications
>>>>>>>>> in the catalog.
>>>>>>>>> 
>>>>>>>>> Furthermore, this feature is very useful for prototyping as users
>>>>> can
>>>>>>>>> simply express that a catalog/database is present even through they
>>>>>>>>> might not have access to it currently.
>>>>>>>>> 
>>>>>>>>> Regards,
>>>>>>>>> Timo
>>>>>>>>> 
>>>>>>>>> 
>>>>>>>>> On 30.09.19 14:57, Fabian Hueske wrote:
>>>>>>>>>> Hi all,
>>>>>>>>>> 
>>>>>>>>>> Sorry for the late reply.
>>>>>>>>>> 
>>>>>>>>>> I think it might lead to confusing situations if temporary
>>>>> functions
>>>>>>> (or
>>>>>>>>>> any temporary db objects for that matter) are bound to the life
>>>>>> cycle
>>>>>>>>> of an
>>>>>>>>>> (external) db/catalog.
>>>>>>>>>> Imaging a situation where you create a temp function in a db in an
>>>>>>>>> external
>>>>>>>>>> catalog and use it but at some point it does not work anymore
>>>>>> because
>>>>>>>>> some
>>>>>>>>>> other dropped the database from the external catalog.
>>>>>>>>>> Shouldn't temporary objects be only controlled by the owner of a
>>>>>>>>> session?
>>>>>>>>>> I agree that creating temp objects in non-existing db/catalogs
>>>>>> sounds
>>>>>>> a
>>>>>>>>> bit
>>>>>>>>>> strange, but IMO the opposite (the db/catalog must exist for a
>>>>> temp
>>>>>>>>>> function to be created/exist) can have significant implications
>>>>> like
>>>>>>> the
>>>>>>>>>> one I described.
>>>>>>>>>> I think it would be quite easy for users to understand that
>>>>>> temporary
>>>>>>>>>> objects are solely owned by them (and their session).
>>>>>>>>>> The problem of listing temporary objects could be solved by
>>>>> adding a
>>>>>>> ALL
>>>>>>>>>> [TEMPORARY] clause:
>>>>>>>>>> 
>>>>>>>>>> SHOW ALL FUNCTIONS; could show all functions regardless of the
>>>>>>>>>> catalog/database including temporary functions.
>>>>>>>>>> SHOW ALL TEMPORARY FUNCTIONS; could show all temporary functions
>>>>>>>>> regardless
>>>>>>>>>> of the catalog/database.
>>>>>>>>>> 
>>>>>>>>>> Best,
>>>>>>>>>> Fabian
>>>>>>>>>> 
>>>>>>>>>> Am Sa., 28. Sept. 2019 um 02:21 Uhr schrieb Bowen Li <
>>>>>>>>> bowenl...@gmail.com>:
>>>>>>>>>>> @Dawid, do you have any other concerns? If not, I hope we can
>>>>> close
>>>>>>> the
>>>>>>>>>>> voting.
>>>>>>>>>>> 
>>>>>>>>>>> 
>>>>>>>>>>> On Thu, Sep 26, 2019 at 8:14 PM Rui Li <lirui.fu...@gmail.com>
>>>>>>> wrote:
>>>>>>>>>>>> I'm not sure how much benefit #1 can bring us. If users just
>>>>> want
>>>>>> to
>>>>>>>>> try
>>>>>>>>>>>> out temporary functions, they can create temporary system
>>>>>> functions
>>>>>>>>> which
>>>>>>>>>>>> don't require a catalog/DB. IIUC, the main reason why we allow
>>>>>>>>> temporary
>>>>>>>>>>>> catalog function is to let users override permanent catalog
>>>>>>> functions.
>>>>>>>>>>>> Therefore a temporary function in a non-existing catalog won't
>>>>>> serve
>>>>>>>>> that
>>>>>>>>>>>> purpose. Besides, each session is provided with a default
>>>>> catalog
>>>>>>> and
>>>>>>>>> DB.
>>>>>>>>>>>> So even if users simply want to create some catalog functions
>>>>> they
>>>>>>> can
>>>>>>>>>>>> forget about after the session, wouldn't the default catalog/DB
>>>>> be
>>>>>>>>> enough
>>>>>>>>>>>> for such experiments?
>>>>>>>>>>>> 
>>>>>>>>>>>> On Thu, Sep 26, 2019 at 4:38 AM Bowen Li <bowenl...@gmail.com>
>>>>>>> wrote:
>>>>>>>>>>>>> Re 1) As described in the FLIP, a temp function lookup will
>>>>> first
>>>>>>>>> make
>>>>>>>>>>>> sure
>>>>>>>>>>>>> the db exists. If the db doesn't exist, a lazy drop is
>>>>> triggered
>>>>>> to
>>>>>>>>>>>> remove
>>>>>>>>>>>>> that temp function.
>>>>>>>>>>>>> 
>>>>>>>>>>>>> I agree Hive doesn't handle it consistently, and we are not
>>>>>> copying
>>>>>>>>>>> Hive.
>>>>>>>>>>>>> IMHO, allowing registering temp functions in nonexistent
>>>>>> catalog/db
>>>>>>>>> is
>>>>>>>>>>>>> hacky and problematic. For instance, "SHOW FUNCTIONS" would
>>>>> list
>>>>>>>>> system
>>>>>>>>>>>>> functions and functions in the current catalog/db, since users
>>>>>>> cannot
>>>>>>>>>>>>> designate a nonexistent catalog/db as current ones, how can
>>>>> they
>>>>>>> list
>>>>>>>>>>>>> functions in nonexistent catalog/db? They may end up never
>>>>>> knowing
>>>>>>>>> what
>>>>>>>>>>>>> temp functions they've created unless trying out with queries
>>>>> or
>>>>>> we
>>>>>>>>>>>>> introducing some more nonstandard SQL statements. The same
>>>>>> applies
>>>>>>> to
>>>>>>>>>>>> other
>>>>>>>>>>>>> temp objects like temp tables.
>>>>>>>>>>>>> 
>>>>>>>>>>>>> Re 2) A standalone FunctionIdentifier sounds good to me
>>>>>>>>>>>>> 
>>>>>>>>>>>>> On Wed, Sep 25, 2019 at 4:46 AM Dawid Wysakowicz <
>>>>>>>>>>>>> wysakowicz.da...@gmail.com>
>>>>>>>>>>>>> wrote:
>>>>>>>>>>>>> 
>>>>>>>>>>>>>> Ad. 1
>>>>>>>>>>>>>> I wouldn't say it is hacky.
>>>>>>>>>>>>>> Moreover, how do you want ensure that the dB always exists
>>>>> when
>>>>>> a
>>>>>>>>>>>>> temporary
>>>>>>>>>>>>>> object is used?( in this particular case function). Do you
>>>>> want
>>>>>> to
>>>>>>>>>>>> query
>>>>>>>>>>>>>> for the database existence whenever e.g a temporary function
>>>>> is
>>>>>>>>>>> used? I
>>>>>>>>>>>>>> think important aspect here is that the database can be
>>>>> dropped
>>>>>>> from
>>>>>>>>>>>>>> external system, not just flink or a different flink session.
>>>>>>>>>>>>>> 
>>>>>>>>>>>>>> E.g in case of hive, you cannot create a temporary table in a
>>>>>>>>>>> database
>>>>>>>>>>>>> that
>>>>>>>>>>>>>> does not exist, that's true. But if you create a temporary
>>>>> table
>>>>>>> in
>>>>>>>>> a
>>>>>>>>>>>>>> database and drop that database from a different session, you
>>>>>> can
>>>>>>>>>>> still
>>>>>>>>>>>>>> query the previously created temporary table from the original
>>>>>>>>>>> session.
>>>>>>>>>>>>> It
>>>>>>>>>>>>>> does not sound like a consistent behaviour to me. Why don't we
>>>>>>> make
>>>>>>>>>>>> this
>>>>>>>>>>>>>> behaviour of not binding a temporary objects to the lifetime
>>>>> of
>>>>>> a
>>>>>>>>>>>>> database
>>>>>>>>>>>>>> explicit as part of the temporary objects contract? In the end
>>>>>>> they
>>>>>>>>>>>> exist
>>>>>>>>>>>>>> in different layers. Permanent objects & databases in a
>>>>> catalog
>>>>>>> (in
>>>>>>>>>>>> case
>>>>>>>>>>>>> of
>>>>>>>>>>>>>> hive megastore) whereas temporary objects in flink sessions.
>>>>>>> That's
>>>>>>>>>>>> also
>>>>>>>>>>>>>> true for the original hive client. The temporary objects live
>>>>> in
>>>>>>> the
>>>>>>>>>>>> hive
>>>>>>>>>>>>>> client whereas databases are created in the metastore.
>>>>>>>>>>>>>> 
>>>>>>>>>>>>>> Ad.2
>>>>>>>>>>>>>> I'm open for suggestions here. The one thing I wanted to
>>>>> achieve
>>>>>>>>> here
>>>>>>>>>>>> is
>>>>>>>>>>>>> so
>>>>>>>>>>>>>> that we do not change the contract of ObjectIdentifier. One
>>>>>>>>> important
>>>>>>>>>>>>> thing
>>>>>>>>>>>>>> to remember here is that we need the function identifier to be
>>>>>>> part
>>>>>>>>>>> of
>>>>>>>>>>>>> the
>>>>>>>>>>>>>> FunctionDefinition object and not only as the result of the
>>>>>>> function
>>>>>>>>>>>>>> lookup. At some point we want to be able to store
>>>>>> QueryOperations
>>>>>>> in
>>>>>>>>>>>> the
>>>>>>>>>>>>>> catalogs. They can contain function calls within which we need
>>>>>> to
>>>>>>>>>>> have
>>>>>>>>>>>>> the
>>>>>>>>>>>>>> identifier.
>>>>>>>>>>>>>> 
>>>>>>>>>>>>>> I agree my initial suggestion is over complicated. How about
>>>>> we
>>>>>>> have
>>>>>>>>>>>> just
>>>>>>>>>>>>>> the FunctionIdentifier as top level class without making the
>>>>>>>>>>>>>> ObjectIdentifier extend from it? I think it's pretty much the
>>>>>> same
>>>>>>>>>>> what
>>>>>>>>>>>>> you
>>>>>>>>>>>>>> suggested. The only difference is that it would be a top level
>>>>>>> class
>>>>>>>>>>>>> with a
>>>>>>>>>>>>>> more descriptive name.
>>>>>>>>>>>>>> 
>>>>>>>>>>>>>> 
>>>>>>>>>>>>>> On Wed, 25 Sep 2019, 13:57 Bowen Li, <bowenl...@gmail.com>
>>>>>> wrote:
>>>>>>>>>>>>>>> Sorry, I missed some parts of the solution. The complete
>>>>>>>>>>> alternative
>>>>>>>>>>>> is
>>>>>>>>>>>>>> the
>>>>>>>>>>>>>>> following, basically having separate APIs in FunctionLookup
>>>>> for
>>>>>>>>>>>>> ambiguous
>>>>>>>>>>>>>>> and precise function lookup since planner is able to tell
>>>>> which
>>>>>>> API
>>>>>>>>>>>> to
>>>>>>>>>>>>>> call
>>>>>>>>>>>>>>> with parsed queries, and have a unified result:
>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>> ```
>>>>>>>>>>>>>>> class FunctionLookup {
>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>> Optional<Result> lookupAmbiguousFunction(String name);
>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>> Optional<Result> lookupPreciseFunction(ObjectIdentifier oi);
>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>> class Result {
>>>>>>>>>>>>>>>     Optional<ObjectIdentifier>  getObjectIdentifier() { ...
>>>>> }
>>>>>>>>>>>>>>>     String getName() { ... }
>>>>>>>>>>>>>>>     // ...
>>>>>>>>>>>>>>> }
>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>> }
>>>>>>>>>>>>>>> ```
>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>> Thanks,
>>>>>>>>>>>>>>> Bowen
>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>> On Tue, Sep 24, 2019 at 9:42 PM Bowen Li <
>>>>> bowenl...@gmail.com>
>>>>>>>>>>>> wrote:
>>>>>>>>>>>>>>>> Hi Dawid,
>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>> Re 1): I agree making it easy for users to run experiments
>>>>> is
>>>>>>>>>>>>>> important.
>>>>>>>>>>>>>>>> However, I'm not sure allowing users to register temp
>>>>>> functions
>>>>>>>>>>> in
>>>>>>>>>>>>>>>> nonexistent catalog/db is the optimal way. It seems a bit
>>>>>> hacky,
>>>>>>>>>>>> and
>>>>>>>>>>>>>>> breaks
>>>>>>>>>>>>>>>> the current contract between Flink and users that catalog/db
>>>>>>> must
>>>>>>>>>>>> be
>>>>>>>>>>>>>>> valid
>>>>>>>>>>>>>>>> in order to operate on.
>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>> How about we instead focus on making it convenient to create
>>>>>>>>>>>>> catalogs?
>>>>>>>>>>>>>>>> Users actually can already do it with ease via program or
>>>>> SQL
>>>>>>> CLI
>>>>>>>>>>>>> yaml
>>>>>>>>>>>>>>> file
>>>>>>>>>>>>>>>> for an in-memory catalog which has neither extra dependency
>>>>>> nor
>>>>>>>>>>>>>> external
>>>>>>>>>>>>>>>> connections. What we can further improve is DDL for
>>>>> catalogs,
>>>>>>>>>>> and I
>>>>>>>>>>>>>>> raised
>>>>>>>>>>>>>>>> it in discussion of [FLIP 69 - Flink SQL DDL Enhancement]
>>>>>> driven
>>>>>>>>>>> by
>>>>>>>>>>>>>> Terry
>>>>>>>>>>>>>>>> now.
>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>> In that case, if users would like to experiment via SQL,
>>>>> they
>>>>>>> can
>>>>>>>>>>>>>> easily
>>>>>>>>>>>>>>>> create an in memory catalog/database using DDL, then play
>>>>> with
>>>>>>>>>>> temp
>>>>>>>>>>>>>>>> functions.
>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>> Re 2): For the assumption, IIUIC, function ObjectIdentifier
>>>>>> has
>>>>>>>>>>> not
>>>>>>>>>>>>>> been
>>>>>>>>>>>>>>>> resolved when stack call reaches
>>>>>>>>>>> FunctionCatalog#lookupFunction(),
>>>>>>>>>>>>> but
>>>>>>>>>>>>>>> only
>>>>>>>>>>>>>>>> been parsed?
>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>> I agree keeping ObjectIdentifier as-is would be good. I'm ok
>>>>>>> with
>>>>>>>>>>>> the
>>>>>>>>>>>>>>>> suggested classes, though making ObjectIdentifier a subclass
>>>>>> of
>>>>>>>>>>>>>>>> FunctionIdentifier seem a bit counter intuitive.
>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>> Another potentially simpler way is:
>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>> ```
>>>>>>>>>>>>>>>> // in class FunctionLookup
>>>>>>>>>>>>>>>> class Result {
>>>>>>>>>>>>>>>>     Optional<ObjectIdentifier>  getObjectIdentifier() {
>>>>> ... }
>>>>>>>>>>>>>>>>     String getName() { ... }
>>>>>>>>>>>>>>>>     ...
>>>>>>>>>>>>>>>> }
>>>>>>>>>>>>>>>> ```
>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>> WDYT?
>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>> On Tue, Sep 24, 2019 at 3:41 PM Dawid Wysakowicz <
>>>>>>>>>>>>>>>> wysakowicz.da...@gmail.com> wrote:
>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>> Hi,
>>>>>>>>>>>>>>>>> I really like the flip and think it clarifies important
>>>>>> aspects
>>>>>>>>>>> of
>>>>>>>>>>>>> the
>>>>>>>>>>>>>>>>> system.
>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>> I have two, I hope small suggestions, which will not take
>>>>>> much
>>>>>>>>>>>> time
>>>>>>>>>>>>> to
>>>>>>>>>>>>>>>>> agree on.
>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>> 1. Could we follow the MySQL approach in regards to the
>>>>>>>>>>> existence
>>>>>>>>>>>> of
>>>>>>>>>>>>>>>>> cat/db
>>>>>>>>>>>>>>>>> for temporary functions? That means not to check it, so
>>>>> e.g.
>>>>>>>>>>> it's
>>>>>>>>>>>>>>> possible
>>>>>>>>>>>>>>>>> to create a temporary function in a database that does not
>>>>>>>>>>> exist.
>>>>>>>>>>>> I
>>>>>>>>>>>>>>> think
>>>>>>>>>>>>>>>>> it's really useful e.g in cases when user wants to perform
>>>>>>>>>>>>> experiments
>>>>>>>>>>>>>>> but
>>>>>>>>>>>>>>>>> does not have access to the db yet or temporarily does not
>>>>>> have
>>>>>>>>>>>>>>> connection
>>>>>>>>>>>>>>>>> to a catalog.
>>>>>>>>>>>>>>>>> 2. Could we not change the ObjectIdentifier? Could we not
>>>>>>> loosen
>>>>>>>>>>>> the
>>>>>>>>>>>>>>>>> requirements for all catalog objects such as tables, views,
>>>>>>>>>>> types
>>>>>>>>>>>>> just
>>>>>>>>>>>>>>> for
>>>>>>>>>>>>>>>>> the functions? It's really important later on from e.g the
>>>>>>>>>>>>>>> serializability
>>>>>>>>>>>>>>>>> perspective. The important aspect of the ObjectIdentifier
>>>>> is
>>>>>>>>>>> that
>>>>>>>>>>>> we
>>>>>>>>>>>>>>> know
>>>>>>>>>>>>>>>>> that it has been resolved. The suggested changes break that
>>>>>>>>>>>>>> assumption.
>>>>>>>>>>>>>>>>> What do you think about adding an interface
>>>>>> FunctionIdentifier
>>>>>>> {
>>>>>>>>>>>>>>>>> String getName();
>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>> /**
>>>>>>>>>>>>>>>>>   Return 3-part identifier. Empty in case of a built-in
>>>>>>>>>>> function.
>>>>>>>>>>>>>>>>> */
>>>>>>>>>>>>>>>>> Optional<ObjectIdentifier> getObjectIdentifier()
>>>>>>>>>>>>>>>>> }
>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>> class ObjectIdentifier implements FunctionIdentifier {
>>>>>>>>>>>>>>>>> Optional<ObjectIdentifier> getObjectIdentifier() {
>>>>>>>>>>>>>>>>>  return Optional.of(this);
>>>>>>>>>>>>>>>>> }
>>>>>>>>>>>>>>>>> }
>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>> class SystemFunctionIdentifier implements
>>>>> FunctionIdentifier
>>>>>>>>>>> {...}
>>>>>>>>>>>>>>>>> WDYT?
>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>> On Wed, 25 Sep 2019, 04:50 Xuefu Z, <usxu...@gmail.com>
>>>>>> wrote:
>>>>>>>>>>>>>>>>>> +1. LGTM
>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>> On Tue, Sep 24, 2019 at 6:09 AM Terry Wang <
>>>>>>>>>>> zjuwa...@gmail.com>
>>>>>>>>>>>>>>> wrote:
>>>>>>>>>>>>>>>>>>> +1
>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>> Best,
>>>>>>>>>>>>>>>>>>> Terry Wang
>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>> 在 2019年9月24日,上午10:42,Kurt Young <ykt...@gmail.com> 写道:
>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>> +1
>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>> Best,
>>>>>>>>>>>>>>>>>>>> Kurt
>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>> On Tue, Sep 24, 2019 at 2:30 AM Bowen Li <
>>>>>>>>>>>> bowenl...@gmail.com
>>>>>>>>>>>>>>>>> wrote:
>>>>>>>>>>>>>>>>>>>>> Hi all,
>>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>>> I'd like to start a voting thread for FLIP-57 [1],
>>>>> which
>>>>>>>>>>>>> we've
>>>>>>>>>>>>>>>>> reached
>>>>>>>>>>>>>>>>>>>>> consensus in [2].
>>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>>> This voting will be open for minimum 3 days till 6:30pm
>>>>>>>>>>>> UTC,
>>>>>>>>>>>>>> Sep
>>>>>>>>>>>>>>>>> 26.
>>>>>>>>>>>>>>>>>>>>> Thanks,
>>>>>>>>>>>>>>>>>>>>> Bowen
>>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>>> [1]
>>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>>> 
>>>>> 
>>> https://cwiki.apache.org/confluence/display/FLINK/FLIP-57%3A+Rework+FunctionCatalog
>>>>>>>>>>>>>>>>>>>>> [2]
>>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>>> 
>>>>> 
>>> http://apache-flink-mailing-list-archive.1008284.n3.nabble.com/DISCUSS-FLIP-57-Rework-FunctionCatalog-td32291.html#a32613
>>>>>>>>>>>>>>>>>> --
>>>>>>>>>>>>>>>>>> Xuefu Zhang
>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>> "In Honey We Trust!"
>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>> --
>>>>>>>>>>>> Best regards!
>>>>>>>>>>>> Rui Li
>>>>>>>>>>>> 
>>>>>>>>> 
>>>> 
>>> 
>>> 

Reply via email to