Hi,

yes, that is correct. The failure mapper is there to cause a failover event, for which we can then check (i) that exactly-once or at-least-once semantics are not violated, depending on what is expected, and (ii) that the restore works at all ;-). You might be able to reuse org.apache.flink.streaming.tests.FailureMapper for this. For the future, it would surely also be nice to have a test that covers rescaling, but for now just having any real test is already a great improvement.
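Conceptually, such a mapper just forwards records and throws once on the first execution attempt, so the job fails over exactly once and then runs to completion. A minimal sketch of the idea (illustrative only, not the actual org.apache.flink.streaming.tests.FailureMapper source):

import org.apache.flink.api.common.functions.RichMapFunction;

public class FailOnceMapper<T> extends RichMapFunction<T, T> {

    private final long failAfterRecords; // records to forward before failing
    private long recordCount;

    public FailOnceMapper(long failAfterRecords) {
        this.failAfterRecords = failAfterRecords;
    }

    @Override
    public T map(T value) throws Exception {
        // Only fail on the first attempt; after the restart the restored
        // task must not fail again, otherwise the job can never finish.
        if (getRuntimeContext().getAttemptNumber() == 0
                && ++recordCount >= failAfterRecords) {
            throw new RuntimeException("Intentional failure to trigger failover");
        }
        return value;
    }
}

With exactly-once checkpointing, any records the sink emitted after the last checkpoint must be rolled back or deduplicated on restore; with at-least-once, the validation should tolerate duplicates but never missing records.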
Best,
Stefan

> On 12. Nov 2018, at 05:23, Thomas Weise <t...@apache.org> wrote:
>
> Hi Stefan,
>
> Thanks for the info. So if I understand correctly, the pipeline you had in
> mind is:
>
> Consumer -> Map -> Producer
>
> What do you expect as the outcome of the mapper failure? That no records are
> lost but some are possibly duplicated in the sink?
>
> Regarding the abstraction, I will see what I can do in that regard. From
> where I start, it may make more sense to do some of that as a follow-up when
> the Kafka test is ported.
>
> Thanks,
> Thomas
>
>
> On Thu, Nov 8, 2018 at 10:20 AM Stefan Richter <s.rich...@data-artisans.com>
> wrote:
>
>> Hi,
>>
>> I was also just planning to work on it before Stephan contacted Thomas to
>> ask about this test.
>>
>> Thomas, you are right about the structure; the test should also go into
>> `run-nightly-tests.sh`. What I was planning to do is a simple job that
>> consists of a Kinesis consumer, a mapper that fails once after n records,
>> and a Kinesis producer. I was hoping that creation, filling, and validation
>> of the Kinesis streams can be done with the Java API, not by invoking
>> commands in a bash script. In general, I would try to minimise the amount of
>> scripting and do as much in Java as possible. It would also be nice if the
>> test were generalised, e.g. so that abstract Producer/Consumer instances are
>> created from a Supplier, and the validation is done over some abstraction
>> that lets us iterate over the produced output. Ideally, this would be a test
>> that we can reuse for all Consumer/Producer cases, and we could also port
>> the tests for Kafka to it. What do you think?
>>
>> Best,
>> Stefan
>>
>>> On 8. Nov 2018, at 07:22, Tzu-Li (Gordon) Tai <tzuli...@apache.org> wrote:
>>>
>>> Hi Thomas,
>>>
>>> I think Stefan Richter is also working on the Kinesis end-to-end test,
>>> and seems to be planning to implement it against a real Kinesis service
>>> instead of Kinesalite.
>>> Perhaps efforts should be synced here.
>>>
>>> Cheers,
>>> Gordon
>>>
>>>
>>> On Thu, Nov 8, 2018 at 1:38 PM Thomas Weise <t...@apache.org> wrote:
>>>
>>>> Hi,
>>>>
>>>> I'm planning to add an end-to-end test for the Kinesis consumer. We have
>>>> done something similar at Lyft, using Kinesalite, which can be run as a
>>>> Docker container.
>>>>
>>>> I see that some tests already make use of Docker, so can we assume it to
>>>> be present in the target environment(s)?
>>>>
>>>> I also found the following ticket:
>>>> https://issues.apache.org/jira/browse/FLINK-9007
>>>>
>>>> It suggests also covering the producer, which may be a good way to create
>>>> the input data as well. The stream itself can be created with the Kinesis
>>>> Java SDK.
>>>>
>>>> Following the existing layout, there would be a new module
>>>> flink-end-to-end-tests/flink-kinesis-test
>>>>
>>>> Are there any suggestions or comments regarding this?
>>>>
>>>> Thanks,
>>>> Thomas
>>>>
>>
>>
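P.S.: to make the generalisation idea above concrete, here is a hypothetical sketch of what such a reusable test could look like. All class and parameter names below are invented for illustration; it reuses the FailOnceMapper sketched earlier, and the concrete connectors would be plugged in via the suppliers:

import java.util.function.Supplier;

import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;
import org.apache.flink.streaming.api.functions.sink.SinkFunction;
import org.apache.flink.streaming.api.functions.source.SourceFunction;

public class ConnectorEndToEndTest {

    private final Supplier<SourceFunction<String>> sourceSupplier; // e.g. () -> new FlinkKinesisConsumer<>(...)
    private final Supplier<SinkFunction<String>> sinkSupplier;     // e.g. () -> new FlinkKinesisProducer<>(...)
    private final Supplier<Iterable<String>> producedOutput;       // abstraction to iterate over the sink output

    public ConnectorEndToEndTest(
            Supplier<SourceFunction<String>> sourceSupplier,
            Supplier<SinkFunction<String>> sinkSupplier,
            Supplier<Iterable<String>> producedOutput) {
        this.sourceSupplier = sourceSupplier;
        this.sinkSupplier = sinkSupplier;
        this.producedOutput = producedOutput;
    }

    public void run(long failAfterRecords, long expectedRecords) throws Exception {
        StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();
        env.enableCheckpointing(1000L); // the restore needs a checkpoint to restore from

        // Consumer -> Map (fails once) -> Producer
        env.addSource(sourceSupplier.get())
                .map(new FailOnceMapper<>(failAfterRecords))
                .addSink(sinkSupplier.get());

        env.execute("Connector end-to-end test");

        // Validate at-least-once: every record must arrive, duplicates allowed.
        // An exactly-once variant would additionally assert that there are no duplicates.
        long seen = 0;
        for (String ignored : producedOutput.get()) {
            seen++;
        }
        if (seen < expectedRecords) {
            throw new AssertionError(
                    "Records lost: expected at least " + expectedRecords + " but saw " + seen);
        }
    }
}

The same class could then be instantiated once with Kinesis suppliers and once with Kafka suppliers, which is what would let us port the existing Kafka tests to it.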