Hi, Timo.

Thank you for this great work! When I previously introduced the session window
TVF, I was contemplating how to enable users to define a PTF in SQL. I'm glad
to see this work being discussed and that it has improved the integration with
the DataStream API.

After reading the entire FLIP, I have a few questions that I hope you can
address.

1. I noticed that in the example, the same field (e.g., CountState) can declare
a StateHint in the eval, onTimer, and finish methods. What happens if the TTLs
for these different StateHints are not the same?

2. I believe the named arguments introduced in FLIP-387[1] can also be applied 
to this ProcessTableFunction, right?

3. In our UDAFs, we expect users to provide accumulate and retract methods to
handle input data for +I/+U and -U/-D. However, in the eval method of a
ScalarFunction/UDTF, users do not have visibility into the input's RowKind. In
the new PTF, will we expose the original RowKind in the eval method's row
input, allowing users to determine the row's RowKind themselves?
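
To make question 3 concrete, here is a minimal, self-contained sketch of the kind of dispatch a user might write if the row's kind were visible in eval. Everything in it (the Kind enum, ChangeRow, and SumFunction) is a local stand-in invented for illustration; it is not the API proposed in the FLIP and does not depend on Flink.

```java
import java.util.Arrays;
import java.util.List;

public class RowKindSketch {

    // Local stand-in for the four changelog kinds: +I, -U, +U, -D.
    public enum Kind { INSERT, UPDATE_BEFORE, UPDATE_AFTER, DELETE }

    // Hypothetical input row: a change kind plus a single numeric value.
    public static final class ChangeRow {
        public final Kind kind;
        public final long value;

        public ChangeRow(Kind kind, long value) {
            this.kind = kind;
            this.value = value;
        }
    }

    // Hypothetical function that keeps a running sum and decides per row
    // whether to accumulate or retract, based on the row's kind.
    public static final class SumFunction {
        private long sum;

        public void eval(ChangeRow row) {
            switch (row.kind) {
                case INSERT:
                case UPDATE_AFTER:
                    sum += row.value; // same role as accumulate() in a UDAF
                    break;
                case UPDATE_BEFORE:
                case DELETE:
                    sum -= row.value; // same role as retract() in a UDAF
                    break;
            }
        }

        public long sum() {
            return sum;
        }
    }

    public static void main(String[] args) {
        SumFunction f = new SumFunction();
        List<ChangeRow> changelog = Arrays.asList(
                new ChangeRow(Kind.INSERT, 5),
                new ChangeRow(Kind.UPDATE_BEFORE, 5),
                new ChangeRow(Kind.UPDATE_AFTER, 7),
                new ChangeRow(Kind.INSERT, 3),
                new ChangeRow(Kind.DELETE, 3));
        for (ChangeRow row : changelog) {
            f.eval(row);
        }
        System.out.println(f.sum());
    }
}
```

With the kind available on the row, a single eval can cover what accumulate and retract do separately in a UDAF; without it, the function cannot tell an update or delete from a fresh insert.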

4. I noticed that in the examples, the eval method sometimes includes the
Context, @StateHint fields, and the input data (Row input), while other times
it only consists of the input data. Are we allowing users to define both styles
simultaneously?

[1] https://cwiki.apache.org/confluence/display/FLINK/FLIP-387%3A+Support+named+parameters+for+functions+and+call+procedures

--

    Best!
    Xuyang

At 2024-10-31 21:57:37, "Timo Walther" <twal...@apache.org> wrote:
>Hi everyone,
>
>thanks for all the feedback I received so far. I had very healthy 
>discussions with various people both online and offline at Current and 
>Flink Forward Berlin. The general user responses were also very 
>positive. The FLIP should be ready to start a VOTE thread.
>
>This is the last call for feedback. I would start a VOTE tomorrow if 
>there are no objections. Happy to take further feedback during 
>implementation as well.
>
>Thanks,
>Timo
>
>On 30.10.24 14:34, Timo Walther wrote:
>> Hi Jim,
>> 
>> 3. Multiple output tables
>> 
>>  > Does the target_table need to be specified in the SELECT clause?
>> 
>> No. Similar to reading from a regular table, the filter column does not 
>> need to be part of the SELECT list.
>> 
>>  > It seems like the two target_table could have separate schemas defined.
>> 
>> That is true. The SELECT is responsible for transforming the columns into 
>> the target table's schema. The output row of the PTF might be a union of 
>> various columns in this case.
>> 
>> 10. Support for State TTL
>> 
>>  > I'd be strongly in favor of doing any interface / base work we need in
>>  > the initial implementation so that state size can be managed.
>> 
>> I agree, State TTL is crucial. I updated the FLIP and added interfaces 
>> to StateTypeStrategy and @StateHint.
>> 
>> Cheers,
>> Timo
>> 
>> 
>> 
>> On 23.10.24 17:59, Jim Hughes wrote:
>>> Hi Timo,
>>>
>>> Thank you for the answers.  I have a few clarifications inlined.
>>>
>>> On Mon, Oct 14, 2024 at 8:07 AM Timo Walther <twal...@apache.org> wrote:
>>>
>>>> 3. Change of interfaces for multiple output tables
>>>> Currently, I think using a STATEMENT SET should be enough for side
>>>> output semantics. I have added an example in section 5.2.3.2 for that.
>>>> We are still free to add more methods to Context, let the function
>>>> implement additional interfaces or use more code generation together
>>>> with @ArgumentHints.
>>>>
>>>
>>> Does the target_table need to be specified in the SELECT clause?  Or
>>> could it read
>>>
>>> EXECUTE STATEMENT SET BEGIN
>>>     INSERT INTO main SELECT a, b FROM FunctionWithSideOutput(input => data,
>>>         uid => 'only_once') WHERE target_table = 'main';
>>>     INSERT INTO side SELECT a, b FROM FunctionWithSideOutput(input => data,
>>>         uid => 'only_once') WHERE target_table = 'side';
>>> END;
>>>
>>> Separately, for clarity, it seems like the two target_table could have
>>> separate schemas defined.
>>>
>>>
>>>> 10. Support for State TTL
>>>> Supporting state TTL will be easy. We just need to add a parameter to
>>>> @StateHint and pass it through.
>>>>
>>>
>>> If PTFs can have state, I'd be strongly in favor of doing any interface /
>>> base work we need in the initial implementation so that state size can be
>>> managed.  If it is just sufficient to have hints in the interface,
>>> awesome!
>>>
>>> Cheers,
>>>
>>> Jim
>>>
>> 
