Re: Checkpoints issue and job failing

2020-01-06 Thread Navneeth Krishnan
Thanks Vino & Piotr. Sure, will upgrade the Flink version and monitor it to see if the problem still exists. Thanks On Mon, Jan 6, 2020 at 12:39 AM Piotr Nowojski wrote: > Hi, > > Off the top of my head I don’t remember anything in particular; however, > release 1.4.0 came with quite a lot of deep

Re: Checkpoints issue and job failing

2020-01-06 Thread Piotr Nowojski
Hi, Off the top of my head I don’t remember anything in particular; however, release 1.4.0 came with quite a lot of deep changes, which had their fair share of bugs that were subsequently fixed in later releases. Because the 1.4.x tree is no longer supported, I would strongly recommend to first

Re: Checkpoints issue and job failing

2020-01-05 Thread vino yang
Hi Navneeth, Since the file still exists, this exception is very strange. Does it happen only occasionally, or frequently? Another concern is that, since version 1.4 is very old, maintenance and responses for it are not as timely as for recent versions. I personally recommend upgradi

Re: Checkpoints issue and job failing

2020-01-03 Thread Navneeth Krishnan
Thanks Congxian & Vino. Yes, the file does exist and I don't see any problem accessing it. Regarding Flink 1.9, we haven't migrated yet but we are planning to. Since we have to test, it might take some time. Thanks On Fri, Jan 3, 2020 at 2:14 AM Congxian Qiu wrote: > Hi > > Have you ever

Re: Checkpoints issue and job failing

2020-01-03 Thread Congxian Qiu
Hi, Have you ever checked whether this problem exists on Flink 1.9? Best, Congxian vino yang wrote on Fri, Jan 3, 2020 at 3:54 PM: > Hi Navneeth, > > Did you check whether the path contained in the exception really cannot be > found? > > Best, > Vino > > Navneeth Krishnan wrote on Fri, Jan 3, 2020 at 8:23 AM: > >> Hi All, >>

Re: Checkpoints issue and job failing

2020-01-02 Thread vino yang
Hi Navneeth, Did you check whether the path contained in the exception really cannot be found? Best, Vino Navneeth Krishnan wrote on Fri, Jan 3, 2020 at 8:23 AM: > Hi All, > > We are running into checkpoint timeout issues more frequently in production > and we also see the exception below. We are running Flink
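
A minimal sketch of the path check asked about above, assuming it is run on the same host (and as the same user) as the TaskManager so that it sees the same NFS mount. The class name `CheckpointPathCheck`, the helper `describe`, and the default path are hypothetical and not part of Flink; the path to test should be copied from the exception message.

```java
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

public class CheckpointPathCheck {

    /** Classifies a path as "missing", "unreadable", "directory" or "file". */
    static String describe(Path p) {
        if (!Files.exists(p)) {
            return "missing";
        }
        if (!Files.isReadable(p)) {
            return "unreadable";
        }
        return Files.isDirectory(p) ? "directory" : "file";
    }

    public static void main(String[] args) {
        // Hypothetical placeholder: pass the path from the exception as the first argument.
        Path p = Paths.get(args.length > 0 ? args[0] : "/path/from/exception");

        // Time the lookup as well; a slow answer can hint at NFS latency.
        long start = System.nanoTime();
        String status = describe(p);
        long elapsedMs = (System.nanoTime() - start) / 1_000_000;

        System.out.println(p + ": " + status + " (" + elapsedMs + " ms)");
    }
}
```

A single successful check only shows that the path is reachable at that moment; it does not rule out intermittent NFS problems while a checkpoint is being written or restored.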

Checkpoints issue and job failing

2020-01-02 Thread Navneeth Krishnan
Hi All, We are running into checkpoint timeout issues more frequently in production and we also see the exception below. We are running Flink 1.4.0 and the checkpoints are saved on NFS. Can someone suggest how to overcome this? [image: image.png] java.lang.IllegalStateException: Could not initial