Hi Faye,
Flink does not officially provide testing tools at the moment. However,
you can use internal Flink tools if they solve your problem.
The `flink-end-to-end-tests` module [1] shows some examples of how we
test Flink together with other systems. Many tests still use plain Bash
scripts (in the `test-scripts` folder). The newer generation uses a test
base such as StreamingKafkaITCase [2] or SQLClientKafkaITCase [3].
Alternatively, you can also replace the connectors with testing
connectors and just run integration tests for your pipeline, as we do
in StreamTableEnvironmentITCase [4].
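For illustration, here is a minimal sketch of that approach, following
the pattern from the testing section of the Flink documentation: the job
runs on a MiniClusterWithClientResource, the real source is replaced by
env.fromElements(), and a collecting sink captures the output. The class
names, the increment function, and the test data below are illustrative
stand-ins for your own pipeline:

import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

import org.apache.flink.api.common.functions.MapFunction;
import org.apache.flink.runtime.testutils.MiniClusterResourceConfiguration;
import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;
import org.apache.flink.streaming.api.functions.sink.SinkFunction;
import org.apache.flink.test.util.MiniClusterWithClientResource;
import org.junit.ClassRule;
import org.junit.Test;

import static org.junit.Assert.assertTrue;

public class PipelineITCase {

    // One small Flink cluster shared by all tests in this class
    @ClassRule
    public static final MiniClusterWithClientResource FLINK =
            new MiniClusterWithClientResource(
                    new MiniClusterResourceConfiguration.Builder()
                            .setNumberSlotsPerTaskManager(2)
                            .setNumberTaskManagers(1)
                            .build());

    @Test
    public void testPipelineProducesExpectedOutput() throws Exception {
        StreamExecutionEnvironment env =
                StreamExecutionEnvironment.getExecutionEnvironment();
        env.setParallelism(2);

        CollectSink.VALUES.clear();

        // Testing source instead of the real connector; put your own
        // transformations between source and sink here.
        env.fromElements(1L, 21L, 22L)
                .map(new IncrementMapFunction())
                .addSink(new CollectSink());

        env.execute();

        assertTrue(CollectSink.VALUES.containsAll(Arrays.asList(2L, 22L, 23L)));
    }

    // Stand-in for the pipeline logic under test
    private static class IncrementMapFunction implements MapFunction<Long, Long> {
        @Override
        public Long map(Long value) {
            return value + 1;
        }
    }

    // Testing sink that collects results into a static list; the list is
    // static because sink instances are serialized and distributed
    private static class CollectSink implements SinkFunction<Long> {
        static final List<Long> VALUES =
                Collections.synchronizedList(new ArrayList<>());

        @Override
        public void invoke(Long value, Context context) {
            VALUES.add(value);
        }
    }
}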
[1] https://github.com/apache/flink/tree/master/flink-end-to-end-tests
[2] https://github.com/apache/flink/blob/master/flink-end-to-end-tests/flink-end-to-end-tests-common-kafka/src/test/java/org/apache/flink/tests/util/kafka/StreamingKafkaITCase.java
[3] https://github.com/apache/flink/blob/master/flink-end-to-end-tests/flink-end-to-end-tests-common-kafka/src/test/java/org/apache/flink/tests/util/kafka/SQLClientKafkaITCase.java
[4] https://github.com/apache/flink/blob/master/flink-table/flink-table-planner-blink/src/test/scala/org/apache/flink/table/planner/runtime/stream/sql/StreamTableEnvironmentITCase.scala
Regards,
Timo
On 10.08.20 21:46, Faye Pressly wrote:
Hello!
I have built a Flink pipeline (a rough skeleton follows the list) that
involves:
1) Reading events from a Kinesis stream
2) Creating two DataStreams (using .filter), with events of type 'A'
going into stream1 and events of type 'B' going into stream2
3) Transforming stream1 into a Table and using the Table API to do a
simple tumble window and group-by to count the events
4) Interval-joining stream1 with stream2 in order to filter out some
events in stream1 that are not in stream2
5) Transforming the result of the interval join into a Table and using
the Table API to do a simple tumble window group-by to count the events
6) Joining the results of 3) and 5) and transforming back to a stream
that sinks to an output Kinesis stream
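Roughly, in code, the skeleton looks like this (Event, EventSchema, the
key and timestamp fields, and the window sizes are placeholders, not my
exact code):

import java.util.Properties;

import org.apache.flink.streaming.api.datastream.DataStream;
import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;
import org.apache.flink.streaming.api.functions.co.ProcessJoinFunction;
import org.apache.flink.streaming.api.windowing.time.Time;
import org.apache.flink.streaming.connectors.kinesis.FlinkKinesisConsumer;
import org.apache.flink.table.api.Table;
import org.apache.flink.table.api.Tumble;
import org.apache.flink.table.api.bridge.java.StreamTableEnvironment;
import org.apache.flink.util.Collector;

import static org.apache.flink.table.api.Expressions.$;
import static org.apache.flink.table.api.Expressions.lit;

// body of the job's main():
StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();
StreamTableEnvironment tEnv = StreamTableEnvironment.create(env);

// 1) Kinesis source (Event and EventSchema are placeholders); timestamps
//    and watermarks are assigned so 'ts' can be used as rowtime later
Properties consumerConfig = new Properties();
DataStream<Event> events = env.addSource(
        new FlinkKinesisConsumer<>("input-stream", new EventSchema(), consumerConfig));

// 2) Split by event type
DataStream<Event> stream1 = events.filter(e -> "A".equals(e.type));
DataStream<Event> stream2 = events.filter(e -> "B".equals(e.type));

// 3) Tumble window + group-by count on stream1 via the Table API
Table counts1 = tEnv.fromDataStream(stream1, $("key"), $("ts").rowtime())
        .window(Tumble.over(lit(5).minutes()).on($("ts")).as("w"))
        .groupBy($("key"), $("w"))
        .select($("key"), $("key").count().as("cnt"));

// 4) Interval join: keep only stream1 events with a match in stream2
DataStream<Event> joined = stream1
        .keyBy(e -> e.key)
        .intervalJoin(stream2.keyBy(e -> e.key))
        .between(Time.minutes(-5), Time.minutes(5))
        .process(new ProcessJoinFunction<Event, Event, Event>() {
            @Override
            public void processElement(Event left, Event right,
                    Context ctx, Collector<Event> out) {
                out.collect(left);
            }
        });

// 5) + 6) window/count 'joined' like in 3), join with counts1, and
//    convert back to a DataStream that sinks to the output Kinesis stream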
I have read the documentation that shows some examples of unit testing,
but I'm scratching my head over how I'm going to be able to IT-test my
pipeline to make sure all the computations are correct given an exact
input dataset.
Is there a proper way of writing ITs to test my pipeline?
Or will I have to bring up a Flink cluster (with Docker, for example),
fire events with a Python script, and then check the results by reading
the output stream?
Thank you!