Aaron Levin created FLINK-14110:
-----------------------------------

             Summary: Deleting state.backend.rocksdb.localdir causes silent failure
                 Key: FLINK-14110
                 URL: https://issues.apache.org/jira/browse/FLINK-14110
             Project: Flink
          Issue Type: Bug
          Components: Runtime / State Backends
    Affects Versions: 1.9.0, 1.8.1
         Environment: Flink {{1.8.1}} and {{1.9.0}}.
JVM 8
            Reporter: Aaron Levin


Suppose {{state.backend.rocksdb.localdir}} is configured as:

{noformat}
state.backend.rocksdb.localdir: /flink/tmp
{noformat}

If I then run {{rm -rf /flink/tmp/job_*}} on a host while a Flink application is running, I will observe the following:

* throughput of my operators running on that host will drop to zero
* the application will not fail or restart
* the task manager will not fail or restart
* in most cases there is nothing in the logs to indicate a failure (I've run this several times and only once seen an exception - I believe I was lucky and deleted those directories during a checkpoint or something)

The desired behaviour here would be to throw an exception and crash, instead of silently dropping throughput to zero. Restarting the Task Manager will resolve the issues.

I only tried this on Flink {{1.8.1}} and {{1.9.0}}.



--
This message was sent by Atlassian Jira
(v8.3.2#803003)