[ https://issues.apache.org/jira/browse/FLINK-12296?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16823988#comment-16823988 ]
Yu Li commented on FLINK-12296:
-------------------------------

Thanks for filing the JIRA [~klion26]. Is this a regression or a long-standing issue? Please fill in the Affected Version field once that is confirmed.

And since data loss is a critical issue, I'd suggest escalating the priority from Major to Critical. Thanks.

> Data loss silently in RocksDBStateBackend when more than one operator(has
> states) chained in a single task
> -----------------------------------------------------------------------------------------------------------
>
>                 Key: FLINK-12296
>                 URL: https://issues.apache.org/jira/browse/FLINK-12296
>             Project: Flink
>          Issue Type: Bug
>          Components: Runtime / State Backends
>            Reporter: Congxian Qiu(klion26)
>            Assignee: Congxian Qiu(klion26)
>            Priority: Major
>
> As reported on the mailing list [1], when more than one operator is chained
> in a single task and all of the operators have state, data can be lost
> silently.
> Currently, the local directory we use looks like this:
> ../local_state_root_1/allocation_id/job_id/vertex_id_subtask_idx/chk_1/(state)
>
> If more than one operator is chained in a single task and all of the
> operators have state, then all of them share the same local directory
> (because the vertex_id is the same), which leads to data loss.
>
> The path generation logic is below:
> {code:java}
> // LocalRecoveryDirectoryProviderImpl.java
> @Override
> public File subtaskSpecificCheckpointDirectory(long checkpointId) {
>     return new File(subtaskBaseDirectory(checkpointId), checkpointDirString(checkpointId));
> }
>
> @VisibleForTesting
> String subtaskDirString() {
>     return Paths.get("jid_" + jobID, "vtx_" + jobVertexID + "_sti_" + subtaskIndex).toString();
> }
>
> @VisibleForTesting
> String checkpointDirString(long checkpointId) {
>     return "chk_" + checkpointId;
> }
> {code}
> [1] [https://app.smartmailcloud.com/web-share/MDkE4DArUT2eoSv86xq772I1HDgMNTVhLEmsnbQ7]

--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
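
The collision described in the issue can be reproduced outside Flink with a small standalone sketch. It copies only the string-building logic from the quoted snippet and uses plain strings in place of Flink's job and vertex ID types. The class name `LocalStatePathCollision`, the `checkpointDirPerOperator` helper, and the `op_` path component are hypothetical and for illustration only; they are not Flink APIs and not necessarily how the issue was resolved.

```java
import java.io.File;
import java.nio.file.Paths;

// Standalone illustration; not part of Flink.
class LocalStatePathCollision {

    // Same string-building logic as subtaskDirString() in the quoted snippet,
    // with plain strings standing in for the job ID and job vertex ID.
    static String subtaskDirString(String jobId, String jobVertexId, int subtaskIndex) {
        return Paths.get("jid_" + jobId, "vtx_" + jobVertexId + "_sti_" + subtaskIndex).toString();
    }

    // Same logic as checkpointDirString() in the quoted snippet.
    static String checkpointDirString(long checkpointId) {
        return "chk_" + checkpointId;
    }

    // Directory as derived in the quoted snippet: no component identifies the operator.
    static File checkpointDir(String jobId, String jobVertexId, int subtaskIndex, long checkpointId) {
        return new File(subtaskDirString(jobId, jobVertexId, subtaskIndex),
                checkpointDirString(checkpointId));
    }

    // Hypothetical variant (an assumption, not Flink's code): add an
    // operator-specific component so chained operators no longer share a directory.
    static File checkpointDirPerOperator(String jobId, String jobVertexId, int subtaskIndex,
                                         long checkpointId, String operatorId) {
        return new File(checkpointDir(jobId, jobVertexId, subtaskIndex, checkpointId),
                "op_" + operatorId);
    }

    public static void main(String[] args) {
        // Two stateful operators chained into one task share job, vertex and subtask index.
        File first = checkpointDir("job", "vertex", 0, 1L);
        File second = checkpointDir("job", "vertex", 0, 1L);
        System.out.println(first.equals(second)); // prints true: both map to the same directory

        File opA = checkpointDirPerOperator("job", "vertex", 0, 1L, "opA");
        File opB = checkpointDirPerOperator("job", "vertex", 0, 1L, "opB");
        System.out.println(opA.equals(opB)); // prints false: directories are now distinct
    }
}
```

Because the job ID, vertex ID, subtask index and checkpoint ID are the only inputs to the path, any two operators in the same chain resolve to the same directory for a given checkpoint, which matches the behaviour described in the issue.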