[jira] [Updated] (FLINK-36703) FLIP-440: User-defined SQL operators / ProcessTableFunction (PTF)

Timo Walther (Jira) Thu, 06 Feb 2025 06:46:09 -0800


     [ https://issues.apache.org/jira/browse/FLINK-36703?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]


Timo Walther updated FLINK-36703:
---------------------------------
    Description: 
Introduce a new kind of user-defined function (UDF) that enables implementing 
user-defined SQL operators: [ProcessTableFunction 
(PTF)|https://cwiki.apache.org/confluence/pages/viewpage.action?pageId=298781093]

The new UDF kind opens the Table & SQL API to the capabilities of the 
DataStream API while staying in the SQL ecosystem and keeping all of its 
benefits.

PTFs look and feel familiar to users coming from either the DataStream API 
world or the SQL world.

From SQL:
 * Similar types and type inference as ScalarFunction, AggregateFunction, or 
TableFunction
 * Registration in a catalog, usage as an inline/temporary UDF, and as built-in 
system functions in the future
 * Usage in both SQL and Table API
 * Very important: Standard-compliant syntax using Polymorphic Table Functions 

From DataStream API:
 * Familiar naming like ProcessFunction
 * Access to Map, List and Value state
 * Ability to both keyBy() and connect() streams. Long-term, also broadcast side 
functionality.
 * Ability to work with watermarks and time
 * Support of query evolution (terminology as defined in FLIP-190)
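
To make the combination of partitioning and value state listed above more 
concrete, here is a minimal plain-Java sketch. It does not use the Flink API: 
the class CountingGreetingSketch, its eval() method, and the HashMap standing in 
for per-key value state are hypothetical names chosen for illustration only. It 
models a function that is called once per row of an append-only input and keeps 
one counter per partition key.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Plain-Java model only: none of these types are Flink classes.
final class CountingGreetingSketch {

    // Hypothetical stand-in for per-key value state: one counter per key.
    private final Map<String, Integer> countPerKey = new HashMap<>();

    // Called once per input row; may emit zero or more output rows.
    List<String> eval(String name) {
        // merge() stores and returns the updated count for this key.
        int count = countPerKey.merge(name, 1, Integer::sum);
        List<String> out = new ArrayList<>();
        out.add("Hello " + name + ", seen " + count + " time(s)");
        return out;
    }

    // Feeds an append-only list of rows through the function, using the
    // row value itself as the partition key.
    static List<String> run(List<String> rows) {
        CountingGreetingSketch fn = new CountingGreetingSketch();
        List<String> result = new ArrayList<>();
        for (String row : rows) {
            result.addAll(fn.eval(row));
        }
        return result;
    }

    public static void main(String[] args) {
        run(List.of("Bob", "Alice", "Bob")).forEach(System.out::println);
    }
}
```

In a real PTF the framework, not a HashMap, would scope the state to the 
current key; the sketch only shows why per-key state changes the output for 
repeated keys.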

 

Current implementation phases:
Phase 0: Single table input, no state, no timers, append-only
Phase 1: Value state
Phase 2: Descriptors
Phase 3: Time and timers
Phase 4: Changelog support
Phase 5: CoPartition for 2 inputs
Phase 6: Map and list state
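
As a rough illustration of what "Time and timers" involves, the following 
plain-Java sketch models event-time timers that fire when a watermark passes 
their timestamp. It is not the Flink API: TimerSketch, registerTimer(), and 
advanceWatermark() are hypothetical names, and the sketch assumes at most one 
timer per timestamp.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Plain-Java model only: none of these types are Flink classes.
final class TimerSketch {

    // Pending timers ordered by timestamp (assumption: one timer per timestamp).
    private final TreeMap<Long, String> timers = new TreeMap<>();

    // Timers that have fired so far, in firing order.
    private final List<String> fired = new ArrayList<>();

    // Registers a named timer for the given event time in milliseconds.
    void registerTimer(long timestamp, String name) {
        timers.put(timestamp, name);
    }

    // Fires every pending timer whose timestamp is <= the new watermark,
    // earliest first, and removes it from the pending set.
    void advanceWatermark(long watermark) {
        while (!timers.isEmpty() && timers.firstKey() <= watermark) {
            Map.Entry<Long, String> entry = timers.pollFirstEntry();
            fired.add(entry.getValue() + "@" + entry.getKey());
        }
    }

    List<String> fired() {
        return fired;
    }
}
```

Timers registered out of order still fire in timestamp order, and a timer 
never fires before the watermark has reached it.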

  was:
Introduce a new kind of user-defined function (UDF) that enables implementing 
user-defined SQL operators: [ProcessTableFunction 
(PTF)|https://cwiki.apache.org/confluence/pages/viewpage.action?pageId=298781093]

The new UDF kind opens the Table & SQL API towards capabilities of the 
DataStream API while staying in the SQL ecosystem. And using all benefits of it.

PTFs look and feel familiar for both someone coming from the DataStream API 
world as well as the SQL world.

From SQL:
 * Similar types and type inference as ScalarFunction, AggregateFunction, or 
TableFunction
 * Registration in catalog, usage of inline/temporary UDF, built-in system 
functions in the future
 * Usage in both SQL and Table API
 * Very important: Standard-compliant syntax using Polymorphic Table Functions 

From DataStream API:
 * Familiar naming like ProcessFunction
 * Access to Map, List and Value state
 * Ability to both keyBy() and connect() streams. Long-term also broadcast side 
functionality.
 * Ability to deal with watermarks and dealing with time
 * Support of query evolution (terminology as defined in FLIP-190)

 

Current implementation phases:
Phase 0: Single table input, no state, no timers, append-only
Phase 1: Value state
Phase 2: Time and timers
Phase 3: Changelog support
Phase 4: CoPartition for 2 inputs
Phase 5: Descriptors
Phase 6: Map and list state


> FLIP-440: User-defined SQL operators / ProcessTableFunction (PTF)
> -----------------------------------------------------------------
>
>                 Key: FLINK-36703
>                 URL: https://issues.apache.org/jira/browse/FLINK-36703
>             Project: Flink
>          Issue Type: New Feature
>          Components: Table SQL / API, Table SQL / Planner, Table SQL / Runtime
>            Reporter: Timo Walther
>            Assignee: Timo Walther
>            Priority: Major



