Mukul Kumar Singh created HDFS-13024:
----------------------------------------
Summary: Ozone: ContainerStateMachine should synchronize operations between createContainer op and writeChunk
Key: HDFS-13024
URL: https://issues.apache.org/jira/browse/HDFS-13024
Project: Hadoop HDFS
Issue Type: Sub-task
Components: ozone
Affects Versions: HDFS-7240
Reporter: Mukul Kumar Singh
Assignee: Mukul Kumar Singh
Fix For: HDFS-7240

This issue happens after HDFS-12853. With HDFS-12853, the writeChunk op has been divided into two stages:
1) the log write phase (here the state machine data is written)
2) ApplyTransaction.

With a three-node Ratis ring, the Ratis leader appends the log entry to its log and forwards it to its followers. However, there is no guarantee on when the followers will apply the log entry to the state machine in {{applyTransaction}}.

The issue occurs in the following order:
1) Leader accepts the create container call.
2) Leader adds the entry to its log and forwards it to the followers.
3) Followers append the entry to their logs and ack to the Raft leader (please note that the transaction still hasn't been applied on the followers).
4) Leader applies the transaction and replies to the client.
5) A write chunk call is sent to the leader.
6) Leader forwards the call to the followers.
7) Followers try to apply the log entry by calling {{Dispatcher#dispatch}}; however, the create container call from step 3) still hasn't been applied.
8) The write chunk call on the followers fails.
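
One way to picture the synchronization the summary asks for is the minimal Java sketch below. It is a toy model, not the actual ContainerStateMachine code and not necessarily how this issue is resolved: the class {{ContainerOrderingSketch}} and its methods {{applyCreateContainer}} and {{writeChunk}} are hypothetical names introduced only for illustration. The assumption is that each container ID gets a future that completes once the create container transaction has been applied, and that any write chunk for that container is chained onto that future instead of being dispatched immediately.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;

/** Toy model of a follower; hypothetical, not the real ContainerStateMachine. */
class ContainerOrderingSketch {
    // One future per container ID, completed once createContainer has been applied.
    private final ConcurrentMap<Long, CompletableFuture<Void>> created =
        new ConcurrentHashMap<>();

    // Returns the (possibly still pending) "container created" future for an ID.
    private CompletableFuture<Void> createdFuture(long containerId) {
        return created.computeIfAbsent(containerId, id -> new CompletableFuture<>());
    }

    /** Stands in for applying the createContainer transaction. */
    void applyCreateContainer(long containerId) {
        createdFuture(containerId).complete(null);
    }

    /** Stands in for a writeChunk entry; it runs only after the container exists. */
    CompletableFuture<String> writeChunk(long containerId, String chunkName) {
        return createdFuture(containerId)
            .thenApply(ignored -> "wrote " + chunkName + " to container " + containerId);
    }

    public static void main(String[] args) {
        ContainerOrderingSketch follower = new ContainerOrderingSketch();

        // The write chunk arrives before createContainer has been applied.
        CompletableFuture<String> write = follower.writeChunk(1L, "chunk-0");
        System.out.println("done before create: " + write.isDone());

        // Applying createContainer releases the pending write.
        follower.applyCreateContainer(1L);
        System.out.println("done after create: " + write.isDone());
        System.out.println(write.join());
    }
}
```

In this sketch the write chunk in step 7) would wait for the create container transaction rather than fail. A real implementation would also have to handle a create container call that fails or never arrives, which the sketch leaves out.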