I'd second Chesnay's suggestion to use a custom source. It would be a piece of cake with FLIP-27 [1], but unfortunately we are not there yet. It will probably land in Flink 1.11 (mid-year) if you can wait.
The current way would be a source that wraps the two Kafka consumers and blocks the normal consumer from outputting elements. Here is a quick and dirty solution that I threw together: https://gist.github.com/AHeise/d7a8662f091e5a135c5ccfd6630634dd

[1] https://cwiki.apache.org/confluence/display/FLINK/FLIP-27%3A+Refactor+Source+Interface

On Mon, Jan 6, 2020 at 1:16 PM David Morin <morin.david....@gmail.com> wrote:

> My naive solution can't work because a dump can be quite long.
> So, yes, I have to find a way to stop the consumption from the topic used
> for streaming mode while a dump is in progress :(
> Terry, I am trying to implement something based on your reply and on this
> thread:
> https://stackoverflow.com/questions/59201503/flink-kafka-gracefully-close-flink-consuming-messages-from-kafka-source-after-a
> Any suggestions are welcome.
> Thanks.
>
> David
>
> On 2020/01/06 09:35:37, David Morin <morin.david....@gmail.com> wrote:
> > Hi,
> >
> > Thanks for your replies.
> > Yes, Terry, you are right: I can try to create a custom source.
> > But according to my use case, I figured out that I can use a technical
> > field in my data. It is a timestamp, and I think I just have to ignore
> > late events with watermarks, or later in the pipeline according to
> > metadata stored in the Flink state. I am testing it now...
> > Thanks
> >
> > David
> >
> > On 2020/01/03 15:44:08, Chesnay Schepler <ches...@apache.org> wrote:
> > > Are you asking how to detect from within the job whether the dump is
> > > complete, or how to combine these 2 jobs?
> > >
> > > If you had a way to notice whether the dump is complete, then I would
> > > suggest creating a custom source that wraps 2 Kafka sources and
> > > switching between them at will based on your conditions.
> > >
> > > On 03/01/2020 03:53, Terry Wang wrote:
> > > > Hi,
> > > >
> > > > I'd like to share my opinion here. It seems that you need the Kafka
> > > > consumers to communicate with each other. When you begin the dump
> > > > process, you need to notify the CDC-topic consumer to wait idle.
> > > >
> > > > Best,
> > > > Terry Wang
> > > >
> > > >> On 2 Jan 2020, at 16:49, David Morin <morin.david....@gmail.com> wrote:
> > > >>
> > > >> Hi,
> > > >>
> > > >> Is there a way to temporarily stop consuming one Kafka source in
> > > >> streaming mode?
> > > >> Use case: I have to consume 2 topics, but one of them has higher
> > > >> priority.
> > > >> One topic is dedicated to ingesting data from the database (change
> > > >> data capture) and the other to synchronization (a dump, i.e. a
> > > >> SELECT ... from the db). At the moment the dump is performed by a
> > > >> separate Flink job, and we start it after manually stopping the
> > > >> previous one (CDC).
> > > >> I want to merge these 2 modes and automatically stop consumption of
> > > >> the CDC topic while a dump is running.
> > > >> How to handle that with Flink in a streaming way? Backpressure? ...
> > > >> Thanks in advance for your insights
> > > >>
> > > >> David
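
To make the gating idea concrete, here is a minimal, self-contained sketch of a source that drains the dump topic first and keeps the CDC side idle until an end-of-dump marker arrives. Plain in-memory queues stand in for the two Kafka consumers so only the prioritization logic is shown; all names (PrioritizedTwoTopicSource, DUMP_END_MARKER) are illustrative assumptions, not taken from the gist above.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.List;
import java.util.Queue;

/**
 * Sketch of the "wrap two consumers, block one" idea from the thread.
 * Two plain queues stand in for the dump topic and the CDC topic.
 */
public class PrioritizedTwoTopicSource {

    /** Marker the dump job writes as its last record, signalling completion. */
    static final String DUMP_END_MARKER = "__DUMP_END__";

    private final Queue<String> dumpTopic; // high priority (the dump)
    private final Queue<String> cdcTopic;  // low priority (change data capture)
    private boolean dumpFinished = false;

    public PrioritizedTwoTopicSource(Queue<String> dumpTopic, Queue<String> cdcTopic) {
        this.dumpTopic = dumpTopic;
        this.cdcTopic = cdcTopic;
    }

    /**
     * Emit the next record, draining the dump topic first and keeping the
     * CDC side idle until the end-of-dump marker has been seen.
     * Returns null when nothing is currently available.
     */
    public String poll() {
        if (!dumpFinished) {
            String record = dumpTopic.poll();
            if (record == null) {
                return null; // dump still running, nothing buffered: stay idle
            }
            if (DUMP_END_MARKER.equals(record)) {
                dumpFinished = true; // unblock the CDC side
                return poll();
            }
            return record;
        }
        return cdcTopic.poll();
    }

    public static void main(String[] args) {
        Queue<String> dump = new ArrayDeque<>(List.of("d1", "d2", DUMP_END_MARKER));
        Queue<String> cdc = new ArrayDeque<>(List.of("c1", "c2"));
        PrioritizedTwoTopicSource source = new PrioritizedTwoTopicSource(dump, cdc);

        List<String> out = new ArrayList<>();
        String r;
        while ((r = source.poll()) != null) {
            out.add(r);
        }
        System.out.println(out); // prints [d1, d2, c1, c2]: dump first, then CDC
    }
}
```

In a real job the same gate would sit inside the wrapping source's run loop, and the marker could instead be any condition that tells you the dump is complete.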