Vladimir Ozerov wrote
> This is how we fixed it for now. But it requires locking paths starting
> from the root. Otherwise they can become parent-child a moment after a
> successful check.

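For illustration, the check-inside-the-transaction approach quoted above could be sketched in plain Java roughly as follows. This is a toy model, not IGFS code: the `ToyTree` class, its `move` method, and the use of a `synchronized` method as a stand-in for a pessimistic transaction that locks both paths from the root are all assumptions made for the example.

```java
import java.util.HashMap;
import java.util.Map;

/** Toy directory tree (hypothetical, not IGFS code): each directory ID maps to its parent ID. */
final class ToyTree {
    private final Map<String, String> parentOf = new HashMap<>();

    /** Builds the structure from the thread: root -> A -> {B -> C, D}. */
    static ToyTree sample() {
        ToyTree t = new ToyTree();
        t.parentOf.put("A", "root");
        t.parentOf.put("B", "A");
        t.parentOf.put("C", "B");
        t.parentOf.put("D", "A");
        return t;
    }

    /** True if 'ancestor' equals 'node' or lies on the path from 'node' up to the root. */
    private boolean isSameOrAncestor(String ancestor, String node) {
        for (String cur = node; cur != null; cur = parentOf.get(cur)) {
            if (cur.equals(ancestor))
                return true;
        }
        return false;
    }

    /**
     * Moves 'src' under 'newParent'. The synchronized method stands in for a
     * pessimistic transaction that has locked both paths starting from the root,
     * so the validity check and the re-parenting see the same structure.
     */
    synchronized boolean move(String src, String newParent) {
        boolean parentExists = newParent.equals("root") || parentOf.containsKey(newParent);
        if (!parentOf.containsKey(src) || !parentExists)
            return false; // one of the directories does not exist (any more)
        if (isSameOrAncestor(src, newParent))
            return false; // moving src under its own subtree would create a loop
        parentOf.put(src, newParent);
        return true;
    }

    /** Returns the parent ID of the given directory, or null if unknown. */
    String parent(String id) {
        return parentOf.get(id);
    }

    public static void main(String[] args) {
        ToyTree t = sample();
        System.out.println(t.move("B", "D")); // T1: move(/A/B, /A/D)
        System.out.println(t.move("D", "C")); // T2: move(/A/D, /A/B/C), validated against the new structure
    }
}
```

With the tree from the original post, the first call succeeds and the second is rejected, because by the time T2 runs its check D has already become an ancestor of C. The price is the one discussed in this thread: both the check and the update happen under a single lock, so structural operations are serialized.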
I am not sure we should be concerned about locking the root or path here. This lock is only held for the duration of the directory structure change. I don't think Hadoop/Spark jobs will be affected if we add an extra 10ms somewhere during the execution.

Vladimir Ozerov wrote
> On Thu, Sep 24, 2015 at 11:28 AM, Sergi Vladykin <sergi.vladykin@> wrote:
>
>> Maybe just check that they are not parent-child within the tx?
>>
>> Sergi
>>
>> Igniters,
>>
>> We revealed a concurrency problem in IGFS and I would like to discuss
>> possible solutions to it.
>>
>> Consider the following file system structure:
>> root
>> |-- A
>> |   |-- B
>> |   |   |-- C
>> |   |-- D
>>
>> ... two concurrent operations in different threads:
>> T1: move(/A/B, /A/D);
>> T2: move(/A/D, /A/B/C);
>>
>> ... and how IGFS handles it now:
>> T1: verify that "/A/B" and "/A/D" exist, that they are not child-parent
>> to each other, etc. -> OK.
>> T2: do the same for "/A/D" and "/A/B/C" -> OK.
>> T1: get IDs of "/A", "/A/B" and "/A/D" to lock them later inside tx.
>> T2: get IDs of "/A", "/A/D", "/A/B" and "/A/B/C" to lock them later
>> inside tx.
>>
>> T1: Start pessimistic tx, lock IDs of "/A", "/A/B", "/A/D", perform move
>> -> OK.
>> root
>> |-- A
>> |   |-- D
>> |   |   |-- B
>> |   |   |   |-- C
>>
>> T2: Start pessimistic tx, lock IDs of "/A", "/A/D", "/A/B" and "/A/B/C"
>> (*directory structure already changed at this time!*), perform move -> OK.
>> root
>> |-- A
>> B
>> |-- D
>> |   |-- C
>> |   |   |-- B (loop!)
>>
>> File system is corrupted. Folders B, C and D are not reachable from
>> root.
>>
>> To fix this, we now additionally check whether the directory structure is
>> still valid *inside transaction*. It works, no more corruptions. But it
>> requires taking locks on the whole paths *including root*. So move, delete
>> and mkdirs operations *can no longer be concurrent*.
>>
>> Probably there is a way to relax this while still ensuring consistency,
>> but I do not see how.
>> One idea is to store the real path inside each entry. This
>> way we will be able to ensure that it is still at a valid location
>> without blocking parents, so concurrency will be restored. But we will
>> have to propagate structural changes to children. E.g. a move of a folder
>> with 100 items will lead to an update of >100 cache entries. Not so good.
>>
>> Any other ideas?
>>
>> Vladimir.

--
View this message in context: http://apache-ignite-developers.2346864.n4.nabble.com/IGFS-concurrency-issue-tp3449p3734.html
Sent from the Apache Ignite Developers mailing list archive at Nabble.com.