[ https://issues.apache.org/jira/browse/IGNITE-1697?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Vladimir Ozerov updated IGNITE-1697:
------------------------------------
    Fix Version/s:     (was: 1.5)
                   1.6

> IGFS: implement reliable Igfs failover logic
> ---------------------------------------------
>
>                 Key: IGNITE-1697
>                 URL: https://issues.apache.org/jira/browse/IGNITE-1697
>             Project: Ignite
>          Issue Type: Bug
>            Reporter: Ivan Veselovsky
>            Assignee: Vladimir Ozerov
>             Fix For: 1.6
>
>
> Problems to solve:
> 1) Currently a write lock for a file may stay taken forever if a node has
> taken the lock and then crashed.
> 2) Currently the blocks of file content are not written as plain
> dataCache.put() operations, but are sent using ad-hoc async messages. This
> was done earlier to improve performance. But in order to implement reliable
> failover we need to get rid of that and use simple put() or asyncPut() cache
> operations.
> Solution plan:
> 1) Use async put to write file data blocks.
> 2) Do the writing using the scheme "lock" -> "reserve space" -> "write" ->
> "commit" -> "release lock".
> 3) The id of the node that locked a file should be readable from the lock id.
> 4) Upon taking a file lock the following procedure should be performed:
> if the file is locked, take the node id of the node that locked the file.
> After that, ask DiscoveryProcessor whether this node is alive. If it is not
> (the node has left the topology), perform the cleanup procedure: delete all
> the data blocks of the reserved data range, then delete the lock.
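
For illustration only, the solution plan above could be sketched roughly as follows. This is a minimal in-memory model, not Ignite code: the class IgfsFailoverSketch, its methods, the "nodeId:random" lock id format and the "path#blockIndex" block keys are all assumptions made up for this example. Plain maps stand in for the meta and data caches, a set of node ids stands in for the DiscoveryProcessor check, and synchronized methods stand in for whatever transactional guarantees a real implementation would need.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;
import java.util.UUID;

/** In-memory sketch of the proposed write scheme. Every name here is hypothetical, not an Ignite API. */
public class IgfsFailoverSketch {
    /** Stand-in for the discovery layer: ids of nodes currently in the topology. */
    private final Set<UUID> aliveNodes = new HashSet<>();
    /** File path -> lock id. The lock id starts with the id of the node that took it. */
    private final Map<String, String> locks = new HashMap<>();
    /** File path -> reserved block range {from (inclusive), to (exclusive)}. */
    private final Map<String, long[]> reserved = new HashMap<>();
    /** Stand-in for the data cache: "path#blockIndex" -> block content. */
    private final Map<String, byte[]> blocks = new HashMap<>();

    public synchronized void nodeJoined(UUID nodeId) { aliveNodes.add(nodeId); }
    public synchronized void nodeLeft(UUID nodeId) { aliveNodes.remove(nodeId); }
    public synchronized boolean hasBlock(String path, long idx) { return blocks.containsKey(path + "#" + idx); }

    /** Plan item 3: the owner node id is readable from the lock id. */
    public static UUID ownerOf(String lockId) {
        return UUID.fromString(lockId.substring(0, lockId.indexOf(':')));
    }

    /** Plan item 4: take the lock, cleaning up after a dead owner. Returns null if a live node holds it. */
    public synchronized String lock(String path, UUID nodeId) {
        String newId = nodeId + ":" + UUID.randomUUID();
        while (true) {
            String current = locks.putIfAbsent(path, newId);
            if (current == null)
                return newId;                 // Lock was free (or has just been cleaned up).
            if (aliveNodes.contains(ownerOf(current)))
                return null;                  // Owner is alive: the file is busy.
            cleanup(path, current);           // Owner left the topology: roll back, then retry.
        }
    }

    /** "reserve space": remember which block range the lock holder is about to write. */
    public synchronized void reserve(String path, long from, long to) {
        reserved.put(path, new long[] {from, to});
    }

    /** "write": a plain put of one data block. */
    public synchronized void write(String path, long idx, byte[] data) {
        blocks.put(path + "#" + idx, data);
    }

    /** "commit": dropping the reservation makes the written range permanent. */
    public synchronized void commit(String path) { reserved.remove(path); }

    /** "release lock": only the holder of this exact lock id can release it. */
    public synchronized void unlock(String path, String lockId) { locks.remove(path, lockId); }

    /** Delete all blocks of the reserved range, then delete the stale lock. */
    private void cleanup(String path, String staleLockId) {
        long[] range = reserved.remove(path);
        if (range != null)
            for (long i = range[0]; i < range[1]; i++)
                blocks.remove(path + "#" + i);
        locks.remove(path, staleLockId);
    }

    public static void main(String[] args) {
        IgfsFailoverSketch fs = new IgfsFailoverSketch();
        UUID a = UUID.randomUUID(), b = UUID.randomUUID();
        fs.nodeJoined(a);
        fs.nodeJoined(b);

        fs.lock("/file", a);                  // lock
        fs.reserve("/file", 0, 2);            // reserve space
        fs.write("/file", 0, new byte[] {1}); // write; node a "crashes" before commit
        fs.nodeLeft(a);

        String lockB = fs.lock("/file", b);   // stale lock is detected and cleaned up
        System.out.println("taken over by " + ownerOf(lockB)
            + ", orphan block present: " + fs.hasBlock("/file", 0));
        fs.unlock("/file", lockB);
    }
}
```

In this sketch a commit simply drops the reservation, so a later cleanup never touches committed blocks. How the real fix tracks reserved ranges and makes the cleanup atomic is not specified in the issue and is left open here.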