Hi David,

I started working on FLIP-238 exactly with the concerns you've mentioned in
mind. It is currently in development, feel free to join the discussion [1].
If you need something ASAP and are not interested in rate-limiting
functionality, you could drop in this [2] class into your tests suite (this
version is standalone and does not require changes in any other classes).
The usage is as indicated in the FLIP (minus sourceRatePerSecond parameter)
[3].

[1] https://lists.apache.org/thread/7gjxto1rmkpff4kl54j8nlg5db2rqhkt
[2]
https://github.com/afedulov/flink/blob/FLINK-27919-generator-source/flink-core/src/main/java/org/apache/flink/api/connector/source/lib/DataGeneratorSourceV0.java
[3]
https://cwiki.apache.org/confluence/display/FLINK/FLIP-238%3A+Introduce+FLIP-27-based+Data+Generator+Source#:~:text=%7D-,Usage%3A%C2%A0,-The%20envisioned%20usage

Best,
Alexander Fedulov


On Mon, Jul 4, 2022 at 12:51 PM Chesnay Schepler <ches...@apache.org> wrote:

> It is indeed not easy to mock sources/sink with the new interfaces.
>
> There is an effort to make this easier for sources in the future (FLIP-238
> <https://cwiki.apache.org/confluence/display/FLINK/FLIP-238%3A+Introduce+FLIP-27-based+Data+Generator+Source>
> ).
>
> For the time being I'd stick with the old APIs for mock sources/sinks.
>
> On 04/07/2022 10:23, David Jost wrote:
>
> Hi,
>
> we are currently looking at replacing our sinks and sources with the 
> respective counterparts using the 'new' data source/sink API (mainly Kafka). 
> What holds us back is that we are not sure how to test the pipeline with 
> mocked sources/sinks. Up till now, we somewhat followed the 'Testing docs'[0] 
> and created a simple SinkFunction, as well as a ParallelSourceFunction, where 
> we could get data in and out at our leisure. They could be easily plugged 
> into the pipeline for the tests. But with the new API, it seems way too 
> cumbersome to go such an approach, as there is a lot of overhead in creating 
> a sink or source on your own (now).
>
> I would love to know, what the intended or recommended way is here. I know, 
> that I can still use the old API, but that a) feels wrong, and b) requires us 
> to expose the DataStream, which is not necessary in the current setup.
>
> I appreciate any ideas or even examples on this.
>
> Thank you in advance.
>
> Best
>   David
>
>
> [0]: 
> https://nightlies.apache.org/flink/flink-docs-release-1.15/docs/dev/datastream/testing/#testing-flink-jobs
>
>
>

Reply via email to