Hi Guowei,

Thanks for the quick response. Maybe I didn't express it clearly in my last email. The case above actually happened in practice; it is not something I imagined. When MAX_WATERMARK is received, the operator tries to fire all registered event-time timers. In the case above, however, new timers are continuously being registered, so the firing never finishes. I will try to reproduce the problem in an ITCase and provide the code once it is complete.
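Until the ITCase is ready, here is a minimal sketch in plain Java of the behavior I mean. It is only a simplified model of an event-time timer queue, not Flink's actual timer service: the class name `TimerLoopSketch`, the method `advanceWatermark`, the `guard` flag, and the `maxFirings` cap are all hypothetical and exist only for this illustration. The guard (do not re-register once the watermark equals Long.MAX_VALUE) is one possible workaround, stated here as an assumption rather than a recommended fix.

```java
import java.util.PriorityQueue;

public class TimerLoopSketch {
    // MAX_WATERMARK is modelled as Long.MAX_VALUE (2^63 - 1).
    static final long MAX_WATERMARK = Long.MAX_VALUE;
    // Hypothetical re-registration interval used by the callback.
    static final long INTERVAL = 60_000L;

    /**
     * Fires every timer whose timestamp is <= watermark and returns the
     * number of firings. Each firing re-registers a timer at ts + INTERVAL,
     * unless the guard is enabled and the watermark is MAX_WATERMARK.
     * maxFirings is a safety cap so the unguarded case stops in this sketch.
     */
    static long advanceWatermark(PriorityQueue<Long> timers, long watermark,
                                 boolean guard, long maxFirings) {
        long fired = 0;
        while (!timers.isEmpty() && timers.peek() <= watermark && fired < maxFirings) {
            long ts = timers.poll();
            fired++;
            // "onTimer" callback: register the next timer.
            boolean draining = (watermark == MAX_WATERMARK);
            if (!(guard && draining)) {
                // Any long is <= Long.MAX_VALUE, so under MAX_WATERMARK
                // this new timer is immediately eligible to fire again.
                timers.add(ts + INTERVAL);
            }
        }
        return fired;
    }

    public static void main(String[] args) {
        PriorityQueue<Long> normal = new PriorityQueue<>();
        normal.add(1_000L);
        System.out.println("normal watermark: "
                + advanceWatermark(normal, 130_000L, false, 1_000_000L));

        PriorityQueue<Long> unguarded = new PriorityQueue<>();
        unguarded.add(1_000L);
        System.out.println("MAX_WATERMARK, no guard (capped): "
                + advanceWatermark(unguarded, MAX_WATERMARK, false, 1_000_000L));

        PriorityQueue<Long> guarded = new PriorityQueue<>();
        guarded.add(1_000L);
        System.out.println("MAX_WATERMARK, with guard: "
                + advanceWatermark(guarded, MAX_WATERMARK, true, 1_000_000L));
    }
}
```

With an ordinary watermark the loop stops as soon as the next timer lies beyond the watermark. With MAX_WATERMARK and no guard, only the artificial cap ends the loop, which matches what I observed during the drain savepoint.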
Best,
JING ZHANG

Guowei Ma <guowei....@gmail.com> wrote on Fri, Sep 24, 2021 at 5:16 PM:

> Hi JING,
>
> Thanks for the case.
> But I am not sure this would happen. As far as I know, an event-time
> timer can only be triggered when there is a watermark (except in the
> "quiesce phase").
> I think no further watermarks can be advanced after MAX_WATERMARK is
> received.
>
> Best,
> Guowei
>
>
> On Fri, Sep 24, 2021 at 4:31 PM JING ZHANG <beyond1...@gmail.com> wrote:
>
>> Hi Guowei,
>> I can provide a case I have encountered in which timers fire
>> indefinitely when taking a savepoint with drain.
>> After an event-time timer is triggered, it registers another event-time
>> timer whose timestamp equals that of the triggered timer plus an interval.
>> If a MAX_WATERMARK arrives, the timer is triggered, then registers
>> another timer, and so on forever.
>> I'm not sure whether Marco has hit a similar problem.
>>
>> Best,
>> JING ZHANG
>>
>>
>>
>> Guowei Ma <guowei....@gmail.com> wrote on Fri, Sep 24, 2021 at 4:01 PM:
>>
>>> Hi Marco,
>>>
>>> Indeed, as JING mentioned, if you drain when triggering a savepoint,
>>> you will encounter this MAX_WATERMARK.
>>> But I have a question. In theory, even with MAX_WATERMARK, there should
>>> not be an infinite number of timers, and these timers would have to be
>>> generated by the application code.
>>> You can share your code if it is convenient for you.
>>>
>>> Best,
>>> Guowei
>>>
>>>
>>> On Fri, Sep 24, 2021 at 2:02 PM JING ZHANG <beyond1...@gmail.com> wrote:
>>>
>>>> Hi Marco,
>>>> Did you specify the --drain flag when stopping the job with a savepoint?
>>>> If the --drain flag is specified, a MAX_WATERMARK is emitted
>>>> before the last checkpoint barrier.
>>>>
>>>> [1]
>>>> https://ci.apache.org/projects/flink/flink-docs-master/docs/deployment/cli/#stopping-a-job-gracefully-creating-a-final-savepoint
>>>>
>>>> Best,
>>>> JING ZHANG
>>>>
>>>> Marco Villalobos <mvillalo...@kineteque.com> wrote on Fri, Sep 24, 2021 at 12:54 PM:
>>>>
>>>>> Something strange happened today.
>>>>> When we tried to shut down a job with a savepoint, the watermarks
>>>>> became equal to 2^63 - 1.
>>>>>
>>>>> This caused timers to fire indefinitely and crashed downstream systems
>>>>> by overloading them with untrue data.
>>>>>
>>>>> We are using event-time processing with Kafka as our source.
>>>>>
>>>>> It seems impossible for a watermark to be that large.
>>>>>
>>>>> I know it's possible for a stream in batch execution mode. But this
>>>>> was stream processing.
>>>>>
>>>>> What can cause this? Is this normal behavior when creating a
>>>>> savepoint?
>>>>>