Hi, 

We have a pipeline which internally uses Java POJOs and also needs to keep some 
events entirely in state for some time. 

From time to time, our POJOs evolve, e.g. attributes are added or removed. 

Now I wanted to write an E2E test that proves the schema migration works 
(having different schemas in the source Kafka topic, the Flink pipeline state, 
and the sink) for bounded scenarios (an attribute added or removed). 

I figured out that in my test I can instantiate a 
MiniClusterWithClientResource, obtain a client, start a job via the client, 
and also cancel the job with a savepoint. My idea was to start the job, put 
some records in, cancel with a savepoint, and restart the job from the 
savepoint, but with a slightly different POJO (one attribute added and an 
existing one removed). 
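
For reference, this is roughly my setup so far (a sketch only, assuming 
flink-test-utils on the classpath and roughly the Flink 1.15 client API; the 
stopWithSavepoint signature varies between versions, and buildJobGraph() is a 
placeholder for assembling my actual pipeline): 

```java
import org.apache.flink.api.common.JobID;
import org.apache.flink.client.program.ClusterClient;
import org.apache.flink.core.execution.SavepointFormatType;
import org.apache.flink.runtime.testutils.MiniClusterResourceConfiguration;
import org.apache.flink.test.util.MiniClusterWithClientResource;

public class SavepointE2ETest {
    public String runAndTakeSavepoint() throws Exception {
        // Spin up a local mini cluster and get a client for it.
        MiniClusterWithClientResource flinkCluster =
            new MiniClusterWithClientResource(
                new MiniClusterResourceConfiguration.Builder()
                    .setNumberTaskManagers(1)
                    .setNumberSlotsPerTaskManager(1)
                    .build());
        flinkCluster.before();
        ClusterClient<?> client = flinkCluster.getClusterClient();

        // buildJobGraph() stands in for building my pipeline (omitted here).
        JobID jobId = client.submitJob(buildJobGraph()).get();

        // ... feed some records into the source and wait until processed ...

        // Stop the job and return the savepoint path for the restart phase.
        return client
            .stopWithSavepoint(jobId, false, "file:///tmp/savepoints",
                SavepointFormatType.CANONICAL)
            .get();
    }
}
```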

Currently, I'm sadly missing two pieces: 
1. I don't see a way to restart a job from a savepoint via the client obtained 
from the MiniClusterWithClientResource in my test. 
2. According to a Flink blog post [1], schema evolution of POJOs is more 
limited; in particular, the evolved POJO must have the same "namespace" (i.e. 
the Java package?) and class name. 
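
Regarding point 1, the closest I've found so far is to attach the savepoint 
path to the JobGraph itself before submitting it, but I'm not sure this is the 
intended way (sketch under the assumption that the client and the savepoint 
path from the earlier stop-with-savepoint call are available): 

```java
import org.apache.flink.client.program.ClusterClient;
import org.apache.flink.runtime.jobgraph.JobGraph;
import org.apache.flink.runtime.jobgraph.SavepointRestoreSettings;
import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;

public class RestoreFromSavepoint {
    public void resubmit(ClusterClient<?> client, String savepointPath)
            throws Exception {
        StreamExecutionEnvironment env =
            StreamExecutionEnvironment.getExecutionEnvironment();
        // ... build the same pipeline, now compiled against the evolved POJO ...

        // Attach the restore settings to the JobGraph before submission.
        JobGraph jobGraph = env.getStreamGraph().getJobGraph();
        jobGraph.setSavepointRestoreSettings(
            SavepointRestoreSettings.forPath(
                savepointPath, /* allowNonRestoredState */ true));
        client.submitJob(jobGraph).get();
    }
}
```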

Especially point 2 seems to make it impossible for me to automate testing of 
the evolution; I would have to do it manually instead. 
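
One idea I'm toying with for point 2: compile two versions of the POJO with 
the same fully-qualified name into separate directories and load each through 
its own isolated classloader, so both versions can coexist in one test JVM. A 
minimal, Flink-free sketch using only the JDK (the class name Event and its 
fields are made up): 

```java
import javax.tools.JavaCompiler;
import javax.tools.ToolProvider;
import java.net.URL;
import java.net.URLClassLoader;
import java.nio.file.Files;
import java.nio.file.Path;

public class EvolvedPojoLoader {
    // Compiles the given source into targetDir and loads the class via an
    // isolated URLClassLoader, so two versions of the same fully-qualified
    // class name can coexist in one JVM.
    static Class<?> compileAndLoad(String className, String source,
                                   Path targetDir) throws Exception {
        Files.createDirectories(targetDir);
        Path srcFile = targetDir.resolve(className + ".java");
        Files.writeString(srcFile, source);
        JavaCompiler compiler = ToolProvider.getSystemJavaCompiler();
        int result = compiler.run(null, null, null,
            "-d", targetDir.toString(), srcFile.toString());
        if (result != 0) throw new IllegalStateException("compilation failed");
        // Parent is the platform classloader only, so the application
        // classpath cannot shadow our freshly compiled class.
        URLClassLoader loader = new URLClassLoader(
            new URL[]{targetDir.toUri().toURL()},
            ClassLoader.getPlatformClassLoader());
        return Class.forName(className, true, loader);
    }

    public static void main(String[] args) throws Exception {
        Path base = Files.createTempDirectory("pojo-evolution");
        Class<?> v1 = compileAndLoad("Event",
            "public class Event { public int a; }", base.resolve("v1"));
        Class<?> v2 = compileAndLoad("Event",
            "public class Event { public int a; public String b; }",
            base.resolve("v2"));
        // Same name, different classes, different field counts.
        System.out.println(v1.getName().equals(v2.getName()));
        System.out.println(v1.getDeclaredFields().length + " "
            + v2.getDeclaredFields().length);
    }
}
```

In the real test, only the evolved version would end up on the restarted job's 
classpath, while the savepoint was written with the original one. 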

Do you have any idea how I could overcome these limitations so that I can 
build a proper end-to-end test for the schema migration? 

Best regards 
Theo 

[1] 
https://flink.apache.org/news/2020/01/29/state-unlocked-interacting-with-state-in-apache-flink.html
 
