[ 
https://issues.apache.org/jira/browse/IGNITE-1697?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vladimir Ozerov updated IGNITE-1697:
------------------------------------
    Fix Version/s:     (was: 1.5)
                   1.6

> IGFS: implement reliable Igfs failover logic 
> ---------------------------------------------
>
>                 Key: IGNITE-1697
>                 URL: https://issues.apache.org/jira/browse/IGNITE-1697
>             Project: Ignite
>          Issue Type: Bug
>            Reporter: Ivan Veselovsky
>            Assignee: Vladimir Ozerov
>             Fix For: 1.6
>
>
> Problems to solve:
> 1) Currently a write lock for a file may stay taken forever if a node has
> taken the lock and then crashed.
> 2) Currently the blocks of file content are not written as plain
> dataCache.put() operations, but are sent using ad-hoc async messages. This
> was done earlier to improve performance. But in order to implement reliable
> failover we need to get rid of that and use simple put() or asyncPut() cache
> operations.
> Solution plan:
> 1) Use async put to write file data blocks.
> 2) Do the writing using the scheme "lock" -> "reserve space" -> "write" ->
> "commit" -> "release lock".
> 3) The id of the node that locked a file should be readable from the lock id.
> 4) Upon taking a file lock the following procedure should be performed:
> if the file is already locked, take the id of the node that locked it from
> the lock id. Then ask DiscoveryProcessor whether this node is alive. If it is
> not (the node has left the topology), perform the cleanup procedure: delete
> all the data blocks of the reserved data range, then delete the lock.
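The solution plan above can be illustrated with a small in-memory model. This is only a sketch under stated assumptions, not the Ignite implementation: the class IgfsLockFailoverSketch and all of its members are hypothetical, plain maps stand in for the IGFS meta and data caches, a set of node ids stands in for the DiscoveryProcessor liveness check, space is reserved in whole blocks, and writes are synchronous rather than async puts.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;
import java.util.UUID;

/** Hypothetical in-memory model of the plan; none of these names are Ignite APIs. */
class IgfsLockFailoverSketch {
    /** Lock id that carries the id of the node that took the lock (plan item 3). */
    static final class LockId {
        final UUID nodeId;
        final long seq;

        LockId(UUID nodeId, long seq) {
            this.nodeId = nodeId;
            this.seq = seq;
        }
    }

    /** Per-file metadata: current lock plus committed and reserved block counts. */
    private static final class FileMeta {
        LockId lock;
        long committedBlocks;
        long reservedBlocks;
    }

    private final Set<UUID> aliveNodes = new HashSet<>();            // stand-in for discovery
    private final Map<String, FileMeta> metaCache = new HashMap<>(); // stand-in for meta cache
    private final Map<String, byte[]> dataCache = new HashMap<>();   // key: path + ':' + block index
    private long lockSeq;

    synchronized void nodeJoined(UUID nodeId) { aliveNodes.add(nodeId); }

    synchronized void nodeLeft(UUID nodeId) { aliveNodes.remove(nodeId); }

    /** Takes the file lock; returns null if a node that is still alive holds it (plan item 4). */
    synchronized LockId lock(String path, UUID nodeId) {
        FileMeta m = metaCache.computeIfAbsent(path, p -> new FileMeta());
        if (m.lock != null) {
            if (aliveNodes.contains(m.lock.nodeId))
                return null;
            cleanup(path, m); // The owner has left the topology.
        }
        m.lock = new LockId(nodeId, ++lockSeq);
        return m.lock;
    }

    /** Reserves a range of blocks right after the committed part of the file. */
    synchronized void reserve(String path, LockId lock, long blocks) {
        owned(path, lock).reservedBlocks = blocks;
    }

    /** Writes one block; only blocks inside the reserved range are accepted. */
    synchronized void write(String path, LockId lock, long blockIdx, byte[] data) {
        FileMeta m = owned(path, lock);
        if (blockIdx < m.committedBlocks || blockIdx >= m.committedBlocks + m.reservedBlocks)
            throw new IllegalArgumentException("Block is outside the reserved range");
        dataCache.put(path + ':' + blockIdx, data);
    }

    /** Makes the reserved range part of the file. */
    synchronized void commit(String path, LockId lock) {
        FileMeta m = owned(path, lock);
        m.committedBlocks += m.reservedBlocks;
        m.reservedBlocks = 0;
    }

    synchronized void unlock(String path, LockId lock) { owned(path, lock).lock = null; }

    synchronized long committedBlocks(String path) {
        FileMeta m = metaCache.get(path);
        return m == null ? 0 : m.committedBlocks;
    }

    synchronized int blockCount() { return dataCache.size(); }

    /** Cleanup: delete the blocks of the reserved range first, then delete the lock. */
    private void cleanup(String path, FileMeta m) {
        for (long i = m.committedBlocks; i < m.committedBlocks + m.reservedBlocks; i++)
            dataCache.remove(path + ':' + i);
        m.reservedBlocks = 0;
        m.lock = null;
    }

    /** Returns the file metadata if the given lock is the one currently held. */
    private FileMeta owned(String path, LockId lock) {
        FileMeta m = metaCache.get(path);
        if (m == null || m.lock != lock)
            throw new IllegalStateException("Lock is not held: " + path);
        return m;
    }
}
```

In this model a writer calls lock(), reserve(), write(), commit() and unlock() in that order. If the writer's node leaves after reserve() or write() but before commit(), the next lock() call from another node sees that the owner recorded in the lock id is gone, drops the uncommitted blocks and takes the lock itself.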



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
