[jira] [Commented] (FLINK-18235) Improve the checkpoint strategy for Python UDF execution

Dian Fu (Jira) Mon, 13 Feb 2023 01:12:06 -0800


    [ 
https://issues.apache.org/jira/browse/FLINK-18235?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17687822#comment-17687822
 ]


Dian Fu commented on FLINK-18235:
---------------------------------

[~pnowojski] Thanks a lot for the advice! This is also one of the earliest 
solutions that came to my mind. However, establishing the mapping 
relationship between the inputs and outputs is a little complicated for the 
Python operator. The difference between the Python operator and 
AsyncWaitOperator is that the Python operator may have a _many-to-many_ 
relationship between its inputs and outputs. It may produce zero (filter), 
one (map), or multiple (flat map) results for one input element. It may also 
produce one result for multiple inputs (aggregation). 
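
To make the cardinalities above concrete, here is a minimal sketch in plain 
Python. It does not use the PyFlink API; the function names and the sample 
input are hypothetical and only illustrate why the number of outputs alone 
does not say which input has finished processing.

```python
from typing import Iterable, Iterator


def filter_even(values: Iterable[int]) -> Iterator[int]:
    """Zero or one output per input (filter)."""
    for v in values:
        if v % 2 == 0:
            yield v


def double(values: Iterable[int]) -> Iterator[int]:
    """Exactly one output per input (map)."""
    for v in values:
        yield v * 2


def repeat(values: Iterable[int]) -> Iterator[int]:
    """Zero or more outputs per input (flat map): emit v copies of v."""
    for v in values:
        for _ in range(v):
            yield v


def total(values: Iterable[int]) -> Iterator[int]:
    """One output for many inputs (aggregation)."""
    yield sum(values)


# Hypothetical sample input: three elements sent to the Python worker.
inputs = [1, 2, 3]

# Count how many results each kind of function emits for the same inputs.
output_counts = {
    "filter": len(list(filter_even(inputs))),
    "map": len(list(double(inputs))),
    "flat_map": len(list(repeat(inputs))),
    "aggregation": len(list(total(inputs))),
}
```

The same three inputs yield a different number of results in each case, so 
an operator that only observes the output stream cannot tell, without extra 
bookkeeping, which inputs are still in flight.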

> Improve the checkpoint strategy for Python UDF execution
> --------------------------------------------------------
>
>                 Key: FLINK-18235
>                 URL: https://issues.apache.org/jira/browse/FLINK-18235
>             Project: Flink
>          Issue Type: Improvement
>          Components: API / Python
>            Reporter: Dian Fu
>            Priority: Not a Priority
>              Labels: auto-deprioritized-major, stale-assigned
>
> Currently, when a checkpoint is triggered for the Python operator, all the 
> buffered data will be flushed to the Python worker to be processed. This will 
> increase the overall checkpoint time when many elements are buffered and the 
> Python UDF is slow. We should improve the checkpoint strategy to address 
> this. One way to implement this is to control the amount of data 
> buffered in the pipeline between the Java and Python processes, similar to what 
> [FLIP-183|https://cwiki.apache.org/confluence/display/FLINK/FLIP-183%3A+Dynamic+buffer+size+adjustment]
>  does to control the amount of data buffered in the network. We can also let 
> users configure the checkpoint strategy if needed.
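
The idea of capping the buffered data can be sketched as follows. This is an 
illustrative plain-Python model, not Flink code: the class name 
`BoundedBuffer` and the parameter `max_buffered` are assumptions made for 
this example. It only shows that a cap bounds the amount of work a 
checkpoint-time flush has to wait for.

```python
from collections import deque


class BoundedBuffer:
    """Hypothetical buffer between the Java operator and the Python worker."""

    def __init__(self, max_buffered: int) -> None:
        self.max_buffered = max_buffered
        self.pending = deque()

    def offer(self, element) -> bool:
        """Accept an element only while below the cap; otherwise refuse it."""
        if len(self.pending) >= self.max_buffered:
            return False
        self.pending.append(element)
        return True

    def flush(self) -> list:
        """Drain everything pending, as a checkpoint would require."""
        drained = list(self.pending)
        self.pending.clear()
        return drained


# With a cap of 2, the third element is refused until the buffer drains.
buffer = BoundedBuffer(max_buffered=2)
accepted = [buffer.offer(x) for x in ("a", "b", "c")]
flushed = buffer.flush()
```

A flush never has to process more than `max_buffered` elements, so the cap 
trades some throughput for a bounded checkpoint delay.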



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
