It will work if you assign a new uid to the Kafka source. Gyula
On Fri, Jul 14, 2017, 18:42 Tzu-Li (Gordon) Tai <tzuli...@apache.org> wrote: > One thing: do note that `FlinkKafkaConsumer#setStartFromLatest()` does not > have any effect when starting from savepoints. > i.e., the consumer will still start from whatever offset is written in the > savepoint. > > > On 15 July 2017 at 12:38:10 AM, Tzu-Li (Gordon) Tai (tzuli...@apache.org) > wrote: > > Can you try starting from the savepoint, but telling Kafka to start from > the latest offset? > > > (@gordon: Is that possible in Flink 1.3.1 or only in 1.4-SNAPSHOT ?) > > This is already possible in Flink 1.3.x. > `FlinkKafkaConsumer#setStartFromLatest()` would be it. > > On 15 July 2017 at 12:33:53 AM, Stephan Ewen (se...@apache.org) wrote: > > Can you try starting from the savepoint, but telling Kafka to start from > the latest offset? > > (@gordon: Is that possible in Flink 1.3.1 or only in 1.4-SNAPSHOT ?) > > On Fri, Jul 14, 2017 at 11:18 AM, Kien Truong <duckientru...@gmail.com> > wrote: > >> Hi, >> >> Sorry for the version typo, I'm running 1.3.1. I did not test with 1.2.x. >> >> The jobs runs fine with almost 0 back-pressure if it's started from >> scratch or if I reuse the kafka consumers group id without specifying the >> safe point. >> >> Best regards, >> Kien >> On Jul 14, 2017, at 15:59, Stephan Ewen <se...@apache.org> wrote: >>> >>> Hi! >>> >>> Flink 1.3.2 does not yet exist. Do you mean 1.3.1 or latest master? >>> >>> Can you tell us whether this occurs only in 1.3.x and worked well in >>> 1.2.x? >>> If you just keep the job running without savepoint/restore, you do not >>> get into backpressure situations? >>> >>> Thanks, >>> Stephan >>> >>> >>> On Fri, Jul 14, 2017 at 1:15 AM, Kien Truong <duckientru...@gmail.com> >>> wrote: >>> >>>> Hi Fabian, >>>> This happens to me even when the restore is immediate, so there's not >>>> much data in Kafka to catch up (5 minutes max) >>>> >>>> Regards >>>> Kien >>>> On Jul 13, 2017, at 23:40, Fabian Hueske < fhue...@gmail.com> wrote: >>>>> >>>>> I would guess that this is quite usual because the job has to >>>>> "catch-up" work. >>>>> For example, if you took a save point two days ago and restore the job >>>>> today, the input data of the last two days has been written to Kafka >>>>> (assuming Kafka as source) and needs to be processed. >>>>> The job will now read as fast as possible from Kafka to catch-up to >>>>> the presence. This means the data is much fast ingested (as fast as Kafka >>>>> can read and ship it) than during regular processing (as fast as your >>>>> sources produce). >>>>> The processing speed is bound by your Flink job which means there will >>>>> be backpressure. >>>>> >>>>> Once the job caught-up, the backpressure should disappear. >>>>> >>>>> Best, Fabian >>>>> >>>>> 2017-07-13 15:48 GMT+02:00 Kien Truong <duckientru...@gmail.com>: >>>>> >>>>>> Hi all, >>>>>> >>>>>> I have one job where back-pressure is significantly higher after >>>>>> resuming from a save point. >>>>>> >>>>>> Because that job makes heavy use of stateful functions with >>>>>> RocksDBStateBackend , >>>>>> >>>>>> I'm suspecting that this is the cause of performance degradation. >>>>>> >>>>>> Does anyone encounter simillar issues or have any tips for debugging ? >>>>>> >>>>>> >>>>>> I'm using Flink 1.3.2 with YARN in detached mode. >>>>>> >>>>>> >>>>>> Regards, >>>>>> >>>>>> Kien >>>>>> >>>>>> >>>>> >>> >