[ 
https://issues.apache.org/jira/browse/FLINK-11409?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16757282#comment-16757282
 ] 

Kezhu Wang commented on FLINK-11409:
------------------------------------

[~aljoscha] [~dawidwys] I would like to present example code for this 
discussion.

 
{code:java}
// Action, OperatorInfo and Event are assumed to be application-defined types.
public abstract class AbstractFlinkRichFunction<T extends Action>
        extends AbstractRichFunction implements CheckpointedFunction {
    private final OperatorInfo operatorInfo;

    protected transient T action;

    protected AbstractFlinkRichFunction(OperatorInfo operatorInfo) {
        this.operatorInfo = operatorInfo;
    }

    @Override
    public void open(Configuration parameters) throws Exception {
        super.open(parameters);
        // Open target operator action
    }

    @Override
    public void close() throws Exception {
        // Close target operator action
        super.close();
    }

    @Override
    public void snapshotState(FunctionSnapshotContext snapshotContext)
            throws Exception {
        // Relay snapshot to target operator action
    }

    @Override
    public void initializeState(FunctionInitializationContext initializationContext)
            throws Exception {
        // Create operator action based on <T> and operator info
        // Relay initializeState to target operator action
    }
}

public class FlinkFlatMapFunction<T extends Action>
        extends AbstractFlinkRichFunction<T>
        implements FlatMapFunction<Event, Event> {
    public FlinkFlatMapFunction(OperatorInfo operatorInfo) {
        super(operatorInfo);
    }

    @Override
    public void flatMap(Event value, Collector<Event> out) throws Exception {
        // Relay flatMap to target operator action
    }
}
{code}
 

In the code above, `AbstractFlinkRichFunction` focuses on lifecycle management, 
while each `FlinkXyzFunction` focuses on data processing. This pattern works 
fine for `MapFunction`, `FilterFunction`, `SourceFunction` and others. But for 
`ProcessFunction` and similar functions, we have to duplicate 
`AbstractFlinkRichFunction`, because these function callbacks are implemented 
as abstract classes. *Due to Java's single class inheritance, I think exporting 
_callback-like APIs_ as classes rather than interfaces is intrusive and 
unfriendly to callers.*

Besides this, from an API perspective, I think making `ProcessFunction` and 
similar functions subclasses of `AbstractRichFunction` mixes up data processing 
and lifecycle management.
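
To make the point concrete without depending on Flink, here is a minimal 
sketch. `ProcessCallback`, `LifecycleBase`, `UpperCaseProcess` and 
`InterfaceCallbackDemo` are hypothetical stand-ins, not Flink classes. It only 
illustrates that when a callback is exported as an interface (with optional 
hooks as Java 8 default methods), a single lifecycle base class can be shared 
without duplication.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Consumer;

// Hypothetical stand-in for a callback exported as an interface.
// The optional timer hook is a Java 8 default method, so no abstract
// class is needed to supply a default implementation.
interface ProcessCallback<I, O> {
    void processElement(I value, Consumer<O> out);

    default void onTimer(long timestamp, Consumer<O> out) {
        // No-op unless an implementation overrides it.
    }
}

// Hypothetical stand-in for a shared lifecycle base class
// (the role AbstractFlinkRichFunction plays in the example above).
abstract class LifecycleBase {
    private boolean opened;

    public void open() { opened = true; }

    public void close() { opened = false; }

    public boolean isOpened() { return opened; }
}

// Because the callback is an interface, one class can inherit the
// lifecycle and implement the data processing at the same time.
class UpperCaseProcess extends LifecycleBase
        implements ProcessCallback<String, String> {
    @Override
    public void processElement(String value, Consumer<String> out) {
        out.accept(value.toUpperCase());
    }
}

class InterfaceCallbackDemo {
    static List<String> run() {
        List<String> results = new ArrayList<>();
        UpperCaseProcess fn = new UpperCaseProcess();
        fn.open();
        fn.processElement("flink", results::add);
        fn.onTimer(0L, results::add); // inherited default, emits nothing
        fn.close();
        return results;
    }
}
```

If `ProcessCallback` were an abstract class instead, `UpperCaseProcess` could 
not also extend `LifecycleBase`, and the lifecycle code would have to be copied 
into a second base class.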

 

> Make `ProcessFunction`, `ProcessWindowFunction` and etc. pure interfaces
> ------------------------------------------------------------------------
>
>                 Key: FLINK-11409
>                 URL: https://issues.apache.org/jira/browse/FLINK-11409
>             Project: Flink
>          Issue Type: Improvement
>          Components: DataStream API
>            Reporter: Kezhu Wang
>            Priority: Major
>              Labels: Breaking-Change
>
> I found that these functions place no opinionated demands on implementing 
> classes. It would be nice to implement them as interfaces rather than 
> abstract classes, as an abstract class is intrusive and hampers callers' use 
> cases. For example, a client can't easily write an 
> `AbstractFlinkRichFunction` to unify lifecycle management for all data 
> processing functions.
> I dug into the history of some of these functions and found that some were 
> converted from interfaces to abstract classes in order to provide default 
> method implementations. For example, `ProcessFunction` and 
> `CoProcessFunction` were converted to abstract classes in FLINK-4460, which 
> predates -FLINK-7242-. After -FLINK-7242-, [Java 8 default 
> methods|https://docs.oracle.com/javase/tutorial/java/IandI/defaultmethods.html]
>  would be a better solution.
> I also notice that some functions introduced after -FLINK-7242-, such as 
> `ProcessJoinFunction`, are implemented as abstract classes. I think it would 
> be better to establish a well-known principle to guide both API authors and 
> callers of data processing functions.
> Personally, I prefer interfaces for all exported function callbacks, for the 
> reason given in the first paragraph.
> Besides this, with `AbstractRichFunction` and interfaces for data processing 
> functions, I think lots of rich data processing functions can be eliminated, 
> as they are plain classes extending `AbstractRichFunction` and implementing 
> data processing interfaces. Clients can write this in one line of code with 
> clear intention of both data processing and lifecycle management.
> The following is a possibly incomplete list of data processing functions 
> currently implemented as abstract classes:
>  * `ProcessFunction`, `KeyedProcessFunction`, `CoProcessFunction` and 
> `ProcessJoinFunction`
>  * `ProcessWindowFunction` and `ProcessAllWindowFunction`
>  * `BaseBroadcastProcessFunction`, `BroadcastProcessFunction` and 
> `KeyedBroadcastProcessFunction`
> All of the above functions are annotated with `@PublicEvolving`, so making 
> them interfaces won't break Flink's compatibility guarantee, but 
> compatibility is still a major consideration when evaluating this proposal.
> Any thoughts on this proposal? Please comment.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
