Attached is a log file from a taskmanager.
Please take a look at it with the following events in mind:
- Around 01:10:47: the job is submitted to the job manager.
- Around 01:16:30: the source suddenly starts to read from Kafka and the sink starts to write data to Kafka.
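To narrow down where the silent period falls in the log, a small script along these lines could list the longest pauses between consecutive log lines. This is only a sketch: it assumes each log line starts with a timestamp of the form `yyyy-MM-dd HH:mm:ss,SSS` (a common log4j layout, not confirmed for this file), and `parse_timestamp` and `largest_gaps` are hypothetical helper names.

```python
import re
from datetime import datetime

# Assumption: log lines begin with "yyyy-MM-dd HH:mm:ss,SSS".
# Adjust the pattern if the taskmanager log uses a different layout.
TS_PATTERN = re.compile(r"^(\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}),(\d{3})")


def parse_timestamp(line):
    """Return the leading timestamp of a log line, or None if it has none."""
    match = TS_PATTERN.match(line)
    if match is None:
        return None
    base = datetime.strptime(match.group(1), "%Y-%m-%d %H:%M:%S")
    return base.replace(microsecond=int(match.group(2)) * 1000)


def largest_gaps(lines, top=3):
    """Return the `top` longest pauses as (seconds, line_before, line_after)."""
    gaps = []
    prev_ts, prev_line = None, None
    for line in lines:
        ts = parse_timestamp(line)
        if ts is None:
            # Skip continuation lines such as stack traces.
            continue
        if prev_ts is not None:
            seconds = (ts - prev_ts).total_seconds()
            gaps.append((seconds, prev_line.rstrip(), line.rstrip()))
        prev_ts, prev_line = ts, line
    # Longest pauses first.
    return sorted(gaps, key=lambda gap: gap[0], reverse=True)[:top]
```

Passing the open log file to `largest_gaps` should show which log messages bracket the pause between 01:10:47 and 01:16:30, if the timestamp assumption holds.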
Any help would be greatly appreciated! T.T

Best,

- Dongwon
tm.log
> On Apr 2, 2018, at 2:30 PM, Dongwon Kim <eastcirc...@gmail.com> wrote:
>
> Hi,
>
> While restoring from the latest checkpoint starts immediately after the job is restarted, restoring from a savepoint takes more than five minutes until the job makes progress.
> During the blackout, I cannot observe any resource usage across the cluster.
> After that period, I can see from various metrics, including flink_taskmanager_job_task_currentLowWatermark, that Flink tries to catch up with the progress in the source topic.
>
> FYI, I'm using
> - Flink-1.4.2
> - FsStateBackend configured with HDFS
> - EventTime with BoundedOutOfOrdernessTimestampExtractor
>
> The size of a single checkpoint/savepoint is ~50GB and we have 7 servers for taskmanagers.
>
> Best,
>
> - Dongwon