Mukul Kumar Singh created HDFS-13024:
----------------------------------------

             Summary: Ozone: ContainerStateMachine should synchronize 
operations between createContainer op and writeChunk
                 Key: HDFS-13024
                 URL: https://issues.apache.org/jira/browse/HDFS-13024
             Project: Hadoop HDFS
          Issue Type: Sub-task
          Components: ozone
    Affects Versions: HDFS-7240
            Reporter: Mukul Kumar Singh
            Assignee: Mukul Kumar Singh
             Fix For: HDFS-7240


This issue appears after HDFS-12853. With HDFS-12853, the writeChunk op has been 
divided into two stages: 1) the log write phase (where the state machine data is 
written) and 2) ApplyTransaction.

With a 3-node Ratis ring, the Ratis leader appends the log entry to its log and 
forwards it to its followers. However, there is no guarantee on when the 
followers will apply the log entry to the state machine in {{applyTransaction}}.

The issue occurs in the following order:

1) Leader accepts create container
2) Leader adds the entry to its log and forwards it to the followers
3) Followers append the entry to their logs and ack to the Raft leader (please 
note that the transaction still hasn't been applied)
4) Leader applies the transaction and now replies
5) write chunk call is sent to the Leader
6) Leader now forwards the call to the followers
7) Followers try to apply the log by calling {{Dispatcher#dispatch}}; however, 
the create container call from 3) still hasn't been applied
8) The write chunk call fails on the followers.
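
One possible way to order the two operations can be sketched as follows. This is only an illustration under stated assumptions, not the actual ContainerStateMachine code or the fix for this issue: the class {{CreateBeforeWriteSketch}}, its methods, and the string-based container and chunk names are all hypothetical. The idea is to keep one future per container that completes when the create container transaction is applied, and to chain every write chunk for that container behind it.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;

// Hypothetical sketch: not the real ContainerStateMachine.
public class CreateBeforeWriteSketch {
    // One future per container; completed once createContainer has been applied.
    private final ConcurrentMap<String, CompletableFuture<Void>> created =
        new ConcurrentHashMap<>();

    private CompletableFuture<Void> futureFor(String container) {
        return created.computeIfAbsent(container, k -> new CompletableFuture<>());
    }

    // Stands in for applying the createContainer transaction.
    public void applyCreateContainer(String container) {
        futureFor(container).complete(null);
    }

    // Stands in for the writeChunk log write phase. The (simulated) chunk
    // write runs only after the container's create future has completed.
    public CompletableFuture<String> writeChunk(String container, String chunk) {
        return futureFor(container).thenApply(v -> container + "/" + chunk);
    }

    public static void main(String[] args) {
        CreateBeforeWriteSketch sm = new CreateBeforeWriteSketch();

        // The write chunk arrives before the create container is applied.
        CompletableFuture<String> write = sm.writeChunk("c1", "chunk0");
        System.out.println("done before create applied: " + write.isDone());

        // Applying the create container releases the pending write.
        sm.applyCreateContainer("c1");
        System.out.println("written: " + write.join());
    }
}
```

With this arrangement a write chunk that reaches a follower early is deferred rather than failed. A real implementation would also need to clean up completed futures and handle a create container call that fails.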



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

