After digging further into the log, I think this is more likely a bug. I
grepped the log by job ID and found that, under normal circumstances, the
TM is supposed to delete the flink-io files. For some reason, it doesn't
delete the files that were listed.

2018-10-08 22:10:25,865 INFO
org.apache.flink.contrib.streaming.state.RocksDBKeyedStateBackend  -
Deleting existing instance base directory
2018-10-08 22:10:25,867 INFO
org.apache.flink.contrib.streaming.state.RocksDBKeyedStateBackend  -
Deleting existing instance base directory
2018-10-08 22:10:25,874 INFO
org.apache.flink.contrib.streaming.state.RocksDBKeyedStateBackend  -
Deleting existing instance base directory
2018-10-08 22:17:38,680 INFO
org.apache.flink.runtime.taskexecutor.TaskExecutor            - Close
JobManager connection for job a5b223c7aee89845f9aed24012e46b7e.
org.apache.flink.util.FlinkException: JobManager responsible for
a5b223c7aee89845f9aed24012e46b7e lost the leadership.
org.apache.flink.util.FlinkException: JobManager responsible for
a5b223c7aee89845f9aed24012e46b7e lost the leadership.
2018-10-08 22:17:38,686 INFO
org.apache.flink.contrib.streaming.state.RocksDBKeyedStateBackend  -
Deleting existing instance base directory
org.apache.flink.util.FlinkException: JobManager responsible for
a5b223c7aee89845f9aed24012e46b7e lost the leadership.
2018-10-08 22:17:38,691 INFO
org.apache.flink.contrib.streaming.state.RocksDBKeyedStateBackend  -
Deleting existing instance base directory
org.apache.flink.util.FlinkException: JobManager responsible for
a5b223c7aee89845f9aed24012e46b7e lost the leadership.
org.apache.flink.util.FlinkException: JobManager responsible for
a5b223c7aee89845f9aed24012e46b7e lost the leadership.
org.apache.flink.util.FlinkException: JobManager responsible for
a5b223c7aee89845f9aed24012e46b7e lost the leadership.
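
A rough sketch of this kind of check, assuming the TM log is wrapped as in
the excerpt above (a timestamp starts each entry and the message continues
on the following lines). The names join_records and summarize are
hypothetical helpers for illustration, not part of Flink:

```python
import re

# Assumption: every log entry starts with a timestamp like "2018-10-08 22:10:25,865".
TIMESTAMP = re.compile(r"^\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2},\d{3}\b")
DELETE_MARKER = "Deleting existing instance base directory"


def join_records(lines):
    """Merge wrapped lines into one string per timestamped log entry."""
    records = []
    for line in lines:
        line = line.strip()
        if not line:
            continue
        if TIMESTAMP.match(line) or not records:
            records.append(line)          # a new entry begins here
        else:
            records[-1] += " " + line     # continuation of the previous entry
    return records


def summarize(lines, job_id):
    """Count entries that mention the job and entries that log a directory deletion."""
    records = join_records(lines)
    job_entries = sum(1 for r in records if job_id in r)
    delete_entries = sum(1 for r in records if DELETE_MARKER in r)
    return job_entries, delete_entries
```

Comparing the number of deletion entries with the number of instance
directories still present under the temp folder would show whether some of
them were skipped. Note that untimestamped exception lines get attached to
the entry before them, so the job count is only approximate.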

On Tue, Oct 9, 2018 at 2:33 PM Sayat Satybaldiyev <> wrote:

> Dear all,
> While running Flink 1.6.1 with RocksDB as the state backend and HDFS as
> the checkpoint FS, I've noticed that after a job has moved to a different
> host, it leaves quite a large amount of state in the temp folder (1.2TB
> in total). The files are not used, as the TM is not running the job on
> the current host.
> The job a5b223c7aee89845f9aed24012e46b7e had been running on this host
> but was then moved to a different TM. I'm wondering whether this is
> intended behavior or a possible bug.
> I've attached a screenshot of the files that are left and not used by
> the job.
