Hello,

I would like to work on https://issues.apache.org/jira/browse/KAFKA-3209, and I 
would like feedback from the devs on the design that I’m proposing. 
Thanks a lot to Ewen Cheslack-Postava for all the discussions.

The gist of the plan is to use ‘Transformers’ that can be chained together 
through configuration. Data passes through them on its way between the source 
and the destination in Kafka Connect.

To set up transformers, I propose listing the Transformer classes in the 
properties, in the order in which they should be applied: 
transformer=abc.Transformer1,xyz.Transformer2

Each transformer can receive its own properties from the same properties 
file, as is already the case for Connectors.
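
As a rough sketch of how the configuration could be read: the helper below 
splits the `transformer` value into an ordered list of class names and picks 
out the properties meant for one transformer. The `TransformerConfig` class 
and the `<class name>.<key>` scoping convention are assumptions made for 
illustration only; the proposal does not fix how per-transformer properties 
are named.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper, not part of Kafka Connect or of the proposal itself.
class TransformerConfig {
    // Splits the comma-separated "transformer" value into class names, in order.
    static List<String> classNames(Map<String, String> props) {
        List<String> names = new ArrayList<>();
        String value = props.getOrDefault("transformer", "");
        for (String name : value.split(",")) {
            String trimmed = name.trim();
            if (!trimmed.isEmpty()) {
                names.add(trimmed);
            }
        }
        return names;
    }

    // ASSUMPTION: a transformer's properties are scoped as "<class name>.<key>".
    // Returns those entries with the prefix stripped.
    static Map<String, String> propsFor(String className, Map<String, String> props) {
        String prefix = className + ".";
        Map<String, String> scoped = new HashMap<>();
        for (Map.Entry<String, String> entry : props.entrySet()) {
            if (entry.getKey().startsWith(prefix)) {
                scoped.put(entry.getKey().substring(prefix.length()), entry.getValue());
            }
        }
        return scoped;
    }
}
```

The scoped map is what would be handed to each transformer when it is 
initialized.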

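About the actual signature of the transformation function that does all the 
work, how does this abstract class look? 
import java.util.Map;

public abstract class Transformer<T1, T2> {
    // Converts one input of type T1 into an output of type T2.
    public abstract T2 transform(T1 t1);

    // Optional hook for receiving configuration; does nothing by default.
    public void initialize(Map<String, String> props) {}
}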
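
To illustrate how transformers with this signature might be chained, here is 
a minimal sketch. `TransformerChain`, `TrimTransformer` and 
`LengthTransformer` are hypothetical names invented for the example, and the 
`Transformer` class is repeated only so that the snippet compiles on its own. 
Because the stages have different type parameters, the chain has to fall back 
on unchecked casts, which is one place where casting could cause trouble.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

// The proposed base class, repeated so the sketch is self-contained.
abstract class Transformer<T1, T2> {
    public abstract T2 transform(T1 t1);

    public void initialize(Map<String, String> props) {}
}

// Hypothetical stage: trims surrounding whitespace from a String.
class TrimTransformer extends Transformer<String, String> {
    @Override
    public String transform(String s) {
        return s.trim();
    }
}

// Hypothetical stage: maps a String to its length.
class LengthTransformer extends Transformer<String, Integer> {
    @Override
    public Integer transform(String s) {
        return s.length();
    }
}

// Hypothetical chain: applies the stages in the order they were added.
class TransformerChain {
    private final List<Transformer<Object, Object>> stages = new ArrayList<>();

    // The cast is unchecked: a mismatch between adjacent stages only
    // shows up as a ClassCastException when data flows through.
    @SuppressWarnings("unchecked")
    public TransformerChain add(Transformer<?, ?> stage) {
        stages.add((Transformer<Object, Object>) stage);
        return this;
    }

    // Passes the same properties to every stage.
    public void initialize(Map<String, String> props) {
        for (Transformer<Object, Object> stage : stages) {
            stage.initialize(props);
        }
    }

    // Feeds the output of each stage into the next one.
    public Object apply(Object input) {
        Object current = input;
        for (Transformer<Object, Object> stage : stages) {
            current = stage.transform(current);
        }
        return current;
    }
}
```

For example, `new TransformerChain().add(new TrimTransformer()).add(new 
LengthTransformer()).apply("  kafka  ")` trims the input first and then 
returns the length of the trimmed string.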

Approach 1:
Pass the complete batch of data. 
Just as the *Tasks receive a complete List<*Record>, the transformer can 
receive the same list. Passing the whole list makes it possible to rearrange 
or merge records, which helps when a transformation needs to look at earlier 
or later messages. Allowing custom data types between transformers lets 
custom intermediate objects be passed along the chain, although casting 
could become an issue.
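
A small sketch of what a list-level transformer could look like under 
Approach 1. It uses plain Strings in place of Connect records to stay 
self-contained, and `KeepLastOfRunTransformer` is a hypothetical example, 
not part of the proposal. It keeps a message only if the next message 
differs, which requires looking ahead in the list.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

// The proposed base class, repeated so the sketch is self-contained.
abstract class Transformer<T1, T2> {
    public abstract T2 transform(T1 t1);

    public void initialize(Map<String, String> props) {}
}

// Hypothetical list-level transformer: for each run of equal consecutive
// messages, only the last message of the run is kept.
class KeepLastOfRunTransformer extends Transformer<List<String>, List<String>> {
    @Override
    public List<String> transform(List<String> batch) {
        List<String> out = new ArrayList<>();
        for (int i = 0; i < batch.size(); i++) {
            boolean isLast = (i == batch.size() - 1);
            // Look ahead: keep this message only if the next one differs.
            if (isLast || !batch.get(i).equals(batch.get(i + 1))) {
                out.add(batch.get(i));
            }
        }
        return out;
    }
}
```

Since the output list may be shorter than the input, this kind of merging is 
only possible when the transformer sees the whole batch.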

Approach 2: 
Take a simpler approach and transform one message at a time. The transformer 
could store data from previous messages, but it could not look ahead in the 
list of messages. Going by the comments from Michael Graff, both approaches 
would work, but if looking ahead is required, we would have to go with 
Approach 1. 
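
For comparison, a sketch of a message-by-message transformer under Approach 
2 that remembers the previous message. `DeltaTransformer` is a hypothetical 
example that uses Integers in place of Connect records; treating the first 
message as a delta of 0 is an arbitrary choice made for the illustration.

```java
import java.util.Map;

// The proposed base class, repeated so the sketch is self-contained.
abstract class Transformer<T1, T2> {
    public abstract T2 transform(T1 t1);

    public void initialize(Map<String, String> props) {}
}

// Hypothetical per-message transformer: emits the difference between the
// current value and the previous one. It can look back, never ahead.
class DeltaTransformer extends Transformer<Integer, Integer> {
    private Integer previous; // null until the first message arrives

    @Override
    public Integer transform(Integer current) {
        int delta = (previous == null) ? 0 : current - previous;
        previous = current;
        return delta;
    }
}
```

The state lives in the transformer instance, so each instance must see the 
messages in order for the result to be meaningful.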

I will also have a working change ready for Approach 1 very soon, but until 
then, please share your suggestions. 

Thanks,
Nisarg.



