Re: Re: [DISCUSS] FLIP-387: Support named parameters for functions and call procedures

Feng Jin Tue, 02 Jan 2024 23:44:15 -0800

Hi all,

Thank you for the valuable input.
Since there are no new objections or suggestions, I will open a voting
thread in two days.



Best,
Feng


On Thu, Dec 21, 2023 at 6:58 PM Benchao Li <[email protected]> wrote:

> I'm glad to hear that this is in your plan. Sorry that I overlooked
> the PoC link in the FLIP previously, I'll go over the code of PoC, and
> post here if there are any more concerns.
>
> Xuyang <[email protected]> 于2023年12月21日周四 10:39写道：
>
> >
> > Hi, Benchao.
> >
> >
> > When Feng Jin and I tried the poc together, we found that when using
> udaf, Calcite directly using the function's input parameters from
> SqlCall#getOperandList. But in fact, these input parameters may use named
> arguments, the order of parameters may be wrong, and they may not include
> optional parameters that need to set default values. Actually, we should
> use new SqlCallBinding(this, scope, call).operands() to let this method
> correct the order and add default values. (You can see the modification in
> SqlToRelConverter in poc branch[1])
> >
> >
> > We have not reported this bug to the calcite community yet. Our original
> plan was to report this bug to the calcite community during the process of
> doing this flip, and fix it separately in flink's own calcite file. Because
> the time for Calcite to release the version is uncertain. And the time to
> upgrade flink to the latest calcite version is also unknown.
> >
> >
> > The link to the poc code is at the bottom of the flip[2]. I'm post it
> here again[1].
> >
> >
> > [1]
> https://cwiki.apache.org/confluence/display/FLINK/FLIP-387%3A+Support+named+parameters+for+functions+and+call+procedures
> > [2]
> https://github.com/apache/flink/compare/master...hackergin:flink:poc_named_argument
> >
> >
> >
> > --
> >
> >     Best！
> >     Xuyang
> >
> >
> >
> >
> >
> > 在 2023-12-20 13:31:26，"Benchao Li" <[email protected]> 写道：
> > >I didn't see your POC code, so I assumed that you'll need to add
> > >SqlStdOperatorTable#DEFAULT and
> > >SqlStdOperatorTable#ARGUMENT_ASSIGNMENT to FlinkSqlOperatorTable, am I
> > >right?
> > >
> > >If yes, this would enable many builtin functions to allow default and
> > >optional arguments, for example, `select md5(DEFAULT)`, I guess this
> > >is not what we want to support right? If so, I would suggest to throw
> > >proper errors for these unexpected usages.
> > >
> > >Benchao Li <[email protected]> 于2023年12月20日周三 13:16写道：
> > >>
> > >> Thanks Feng for driving this, it's a very useful feature.
> > >>
> > >> In the FLIP, you mentioned that
> > >> > During POC verification, bugs were discovered in Calcite that
> caused issues during the validation phase. We need to modify the
> SqlValidatorImpl and SqlToRelConverter to address these problems.
> > >>
> > >> Could you log bugs in Calcite, and reference the corresponding Jira
> > >> number in your code. We want to upgrade Calcite version to latest as
> > >> much as possible, and maintaining many bugfixes in Flink will add
> > >> additional burdens for upgrading Calcite. By adding corresponding
> > >> issue numbers, we can easily make sure that we can remove these Flink
> > >> hosted bugfixes when we upgrade to a version that already contains the
> > >> fix.
> > >>
> > >> Feng Jin <[email protected]> 于2023年12月14日周四 19:30写道：
> > >> >
> > >> > Hi Timo
> > >> > Thanks for your reply.
> > >> >
> > >> > >  1) ArgumentNames annotation
> > >> >
> > >> > I'm sorry for my incorrect expression. argumentNames is a method of
> > >> > FunctionHints. We should introduce a new arguments method to
> replace this
> > >> > method and return Argument[].
> > >> > I updated the FLIP doc about this part.
> > >> >
> > >> > >  2) Evolution of FunctionHint
> > >> >
> > >> > +1 define DataTypeHint as part of ArgumentHint. I'll update the
> FLIP doc.
> > >> >
> > >> > > 3)  Semantical correctness
> > >> >
> > >> > I realized that I forgot to submit the latest modification of the
> FLIP
> > >> > document. Xuyang and I had a prior discussion before starting this
> discuss.
> > >> > Let's restrict it to supporting only one eval() function, which will
> > >> > simplify the overall design.
> > >> >
> > >> > Therefore, I also concur with not permitting overloaded named
> parameters.
> > >> >
> > >> >
> > >> > Best,
> > >> > Feng
> > >> >
> > >> > On Thu, Dec 14, 2023 at 6:15 PM Timo Walther <[email protected]>
> wrote:
> > >> >
> > >> > > Hi Feng,
> > >> > >
> > >> > > thank you for proposing this FLIP. This nicely completes FLIP-65
> which
> > >> > > is great for usability.
> > >> > >
> > >> > > I read the FLIP and have some feedback:
> > >> > >
> > >> > >
> > >> > > 1) ArgumentNames annotation
> > >> > >
> > >> > >  > Deprecate the ArgumentNames annotation as it is not
> user-friendly for
> > >> > > specifying argument names with optional configuration.
> > >> > >
> > >> > > Which annotation does the FLIP reference here? I cannot find it
> in the
> > >> > > Flink code base.
> > >> > >
> > >> > > 2) Evolution of FunctionHint
> > >> > >
> > >> > > Introducing @ArgumentHint makes a lot of sense to me. However,
> using it
> > >> > > within @FunctionHint looks complex, because there is both
> `input=` and
> > >> > > `arguments=`. Ideally, the @DataTypeHint can be defined inline as
> part
> > >> > > of the @ArgumentHint. It could even be the `value` such that
> > >> > > `@ArgumentHint(@DataTypeHint("INT"))` is valid on its own.
> > >> > >
> > >> > > We could deprecate `input=`. Or let both `input` and `arguments=`
> > >> > > coexist but never be defined at the same time.
> > >> > >
> > >> > > 3) Semantical correctness
> > >> > >
> > >> > > As you can see in the `TypeInference` class, named parameters are
> > >> > > prepared in the stack already. However, we need to watch out
> between
> > >> > > helpful explanation (see
> `InputTypeStrategy#getExpectedSignatures`) and
> > >> > > named parameters (see `TypeInference.Builder#namedArguments`)
> that can
> > >> > > be used in SQL.
> > >> > >
> > >> > > If I remember correctly, named parameters can be reordered and
> don't
> > >> > > allow overloading of signatures. Thus, only a single eval()
> should have
> > >> > > named parameters. Looking at the FLIP it seems you would like to
> support
> > >> > > multiple parameter lists. What changes are you planning to
> TypeInference
> > >> > > (which is also declared as @PublicEvoving)? This should also be
> > >> > > documented as the annotations should compile into this class.
> > >> > >
> > >> > > In general, I would prefer to keep it simple and don't allow
> overloading
> > >> > > named parameters. With the optionality, users can add an arbitrary
> > >> > > number of parameters to the signature of the same eval method.
> > >> > >
> > >> > > Regards,
> > >> > > Timo
> > >> > >
> > >> > > On 14.12.23 10:02, Feng Jin wrote:
> > >> > > > Hi all,
> > >> > > >
> > >> > > >
> > >> > > > Xuyang and I would like to start a discussion of FLIP-387:
> Support
> > >> > > > named parameters for functions and call procedures [1]
> > >> > > >
> > >> > > > Currently, when users call a function or call a procedure, they
> must
> > >> > > > specify all fields in order. When there are a large number of
> > >> > > > parameters, it is easy to make mistakes and cannot omit
> specifying
> > >> > > > non-mandatory fields.
> > >> > > >
> > >> > > > By using named parameters, you can selectively specify the
> required
> > >> > > > parameters, reducing the probability of errors and making it
> more
> > >> > > > convenient to use.
> > >> > > >
> > >> > > > Here is an example of using Named Procedure.
> > >> > > > ```
> > >> > > > -- for scalar function
> > >> > > > SELECT my_scalar_function(param1 => ‘value1’, param2 =>
> ‘value2’’) FROM
> > >> > > []
> > >> > > >
> > >> > > > -- for table function
> > >> > > > SELECT  *  FROM TABLE(my_table_function(param1 => 'value1',
> param2 =>
> > >> > > 'value2'))
> > >> > > >
> > >> > > > -- for agg function
> > >> > > > SELECT my_agg_function(param1 => 'value1', param2 => 'value2')
> FROM []
> > >> > > >
> > >> > > > -- for call procedure
> > >> > > > CALL  procedure_name(param1 => ‘value1’, param2 => ‘value2’)
> > >> > > > ```
> > >> > > >
> > >> > > > For UDX and Call procedure developers, we introduce a new
> annotation
> > >> > > > to specify the parameter name, indicate if it is optional, and
> > >> > > > potentially support specifying default values in the future
> > >> > > >
> > >> > > > ```
> > >> > > > public @interface ArgumentHint {
> > >> > > >      /**
> > >> > > >       * The name of the parameter, default is an empty string.
> > >> > > >       */
> > >> > > >      String name() default "";
> > >> > > >
> > >> > > >      /**
> > >> > > >       * Whether the parameter is optional, default is true.
> > >> > > >       */
> > >> > > >      boolean isOptional() default true;
> > >> > > > }}
> > >> > > > ```
> > >> > > >
> > >> > > > ```
> > >> > > > // Call Procedure Development
> > >> > > >
> > >> > > > public static class NamedArgumentsProcedure implements
> Procedure {
> > >> > > >
> > >> > > >     // Example usage: CALL myNamedProcedure(in1 => 'value1',
> in2 =>
> > >> > > 'value2')
> > >> > > >
> > >> > > >     // Example usage: CALL myNamedProcedure(in1 => 'value1',
> in2 =>
> > >> > > > 'value2', in3 => 'value3')
> > >> > > >
> > >> > > >     @ProcedureHint(
> > >> > > >             input = {@DataTypeHint(value = "STRING"),
> > >> > > > @DataTypeHint(value = "STRING"), @DataTypeHint(value =
> "STRING")},
> > >> > > >             output = @DataTypeHint("STRING"),
> > >> > > >                  arguments = {
> > >> > > >                  @ArgumentHint(name = "in1", isOptional =
> false),
> > >> > > >                  @ArgumentHint(name = "in2", isOptional = true)
> > >> > > >                  @ArgumentHint(name = "in3", isOptional =
> true)})
> > >> > > >     public String[] call(ProcedureContext procedureContext,
> String
> > >> > > > arg1, String arg2, String arg3) {
> > >> > > >         return new String[]{arg1 + ", " + arg2 + "," + arg3};
> > >> > > >     }
> > >> > > > }
> > >> > > > ```
> > >> > > >
> > >> > > >
> > >> > > > Currently, we offer support for two scenarios when calling a
> function
> > >> > > > or procedure:
> > >> > > >
> > >> > > > 1. The corresponding parameters can be specified using the
> parameter
> > >> > > > name, without a specific order.
> > >> > > > 2. Unnecessary parameters can be omitted.
> > >> > > >
> > >> > > >
> > >> > > > There are still some limitations when using Named parameters:
> > >> > > > 1. Named parameters do not support variable arguments.
> > >> > > > 2. UDX or procedure classes that support named parameters can
> only
> > >> > > > have one eval method.
> > >> > > > 3. Due to the current limitations of Calcite-947[2], we cannot
> specify
> > >> > > > a default value for omitted parameters, which is Null by
> default.
> > >> > > >
> > >> > > >
> > >> > > >
> > >> > > > Also, thanks very much for the suggestions and help provided by
> Zelin
> > >> > > > and Lincoln.
> > >> > > >
> > >> > > >
> > >> > > >
> > >> > > >
> > >> > > > 1.
> > >> > >
> https://cwiki.apache.org/confluence/display/FLINK/FLIP-387%3A+Support+named+parameters+for+functions+and+call+procedures
> > >> > > .
> > >> > > >
> > >> > > > 2. https://issues.apache.org/jira/browse/CALCITE-947
> > >> > > >
> > >> > > >
> > >> > > >
> > >> > > > Best,
> > >> > > >
> > >> > > > Feng
> > >> > > >
> > >> > >
> > >> > >
> > >>
> > >>
> > >>
> > >> --
> > >>
> > >> Best,
> > >> Benchao Li
> > >
> > >
> > >
> > >--
> > >
> > >Best,
> > >Benchao Li
>
>
>
> --
>
> Best,
> Benchao Li
>

Re: Re: [DISCUSS] FLIP-387: Support named parameters for functions and call procedures

Reply via email to