Ah, that makes sense - sorry I jumped to obviously wrong conclusion based on existing HDFS secondary storage implementation.
Cos On Tue, Oct 13, 2015 at 09:41AM, Vladimir Ozerov wrote: > Cos, > > We are trying to keep IGFS decoupled from HDFS. Instead, we define IGFS as > an in-memory file system which could optionally be a cache some secondary > file system. And HDFS is one of these file systems. > > For this reason we cannot rely on any specific behavior of particular > secondary file system. > > > On Tue, Oct 13, 2015 at 8:14 AM, Konstantin Boudnik <c...@apache.org> wrote: > > > Thanks Vladimir. In the case of HDFS, the global lock for conflicting > > simultaneous changes ie > > > > mkdir /a/b/c > > together with > > mv /a/b /a/d > > > > will trigger the NN global lock on the file system so the latter operation > > won't start until the former is completed. Hence, the point of my > > confusion is > > why IGFS is adding an additional lock on top of what the secondary file > > system > > does anyway. Sorry, if I haven't expressed myself in a better way from the > > beginning. > > > > Cos > > > > On Thu, Oct 08, 2015 at 11:33AM, Vladimir Ozerov wrote: > > > Cos, > > > > > > Initially IGFS was designed to support concurrent structural updates. > > E.g., > > > creation of directories /A/C/D and /E/F/G can be performed concurrently. > > > Now we revealed that it might cause concurrency issues in case of > > > conflicting updates. > > > > > > To fix this problem we now perform every update holding a kind of global > > > file system lock. This doesn't affect file read/write operations - they > > > still can be performed concurrently of course. The question was whether > > > this could cause severe performance degradation or not. If we assume that > > > in average Hadoop jobs file operations dominate over structural > > operations > > > on directories, then it should not cause any significant performance > > issues. > > > > > > Vladimir. > > > > > > On Wed, Oct 7, 2015 at 10:38 PM, Konstantin Boudnik <c...@apache.org> > > wrote: > > > > > > > On Wed, Oct 07, 2015 at 09:11AM, Vladimir Ozerov wrote: > > > > > Cos, > > > > > Yes, no long-time locking is expected here. > > > > > > > > Sorry, I musta be still dense from the jet-lag: could you put in a bit > > more > > > > details? Thanks in advance! > > > > > > > > Cos > > > > > > > > > On Wed, Oct 7, 2015 at 6:57 AM, Konstantin Boudnik <c...@apache.org> > > > > wrote: > > > > > > > > > > > IIRC NN should be locking on these ops anyway, shouldn't it? The > > > > situation > > > > > > is > > > > > > no different if multiple clients are doing these operations > > > > > > near-simultaneously. Unless I missed something here... > > > > > > > > > > > > On Thu, Sep 24, 2015 at 11:28AM, Sergi Vladykin wrote: > > > > > > > May be just check that they are not parent-child within the tx? > > > > > > > > > > > > > > Sergi > > > > > > > Igniters, > > > > > > > > > > > > > > We revealed concurrency problem in IGFS and I would like to > > discuss > > > > > > > possible solutions to it. > > > > > > > > > > > > > > Consider the following file system structure: > > > > > > > root > > > > > > > |-- A > > > > > > > | |-- B > > > > > > > | | |-- C > > > > > > > | |-- D > > > > > > > > > > > > > > ... two concurrent operations in different threads: > > > > > > > T1: move(/A/B, /A/D); > > > > > > > T2: move(/A/D, /A/B/C); > > > > > > > > > > > > > > ... and how IGFS handles it now: > > > > > > > T1: verify that "/A/B" and "/A/D" exist, they are not > > child-parent to > > > > > > each > > > > > > > other, etc. -> OK. > > > > > > > T2: do the same for "A/D" and "A/B/C" -> OK. > > > > > > > T1: get IDs of "/A", "/A/B" and "/A/D" to lock them later inside > > tx. > > > > > > > T2: get IDs of "/A", "/A/D", "/A/B" and "/A/B/C" to lock them > > later > > > > > > inside > > > > > > > tx. > > > > > > > > > > > > > > T1: Start pessimistic tx, lock IDs of "/A", "/A/B", "/A/D", > > perform > > > > move > > > > > > -> > > > > > > > OK. > > > > > > > root > > > > > > > |-- A > > > > > > > | |-- D > > > > > > > | | |-- B > > > > > > > | | | |-- C > > > > > > > > > > > > > > T2: Start pessimistic tx, lock IDs of "/A", "/A/D", "/A/B" and > > > > > > > "/A/B/C" (*directory > > > > > > > structure already changed at this time!*), perform move -> OK. > > > > > > > root > > > > > > > |-- A > > > > > > > B > > > > > > > |-- D > > > > > > > | |-- C > > > > > > > | | |-- B (loop!) > > > > > > > > > > > > > > File system is corrupted. Folders B, C and D are not reacheable > > from > > > > > > root. > > > > > > > > > > > > > > To fix this now we additionaly check if directory structure is > > still > > > > > > > valid *inside > > > > > > > transaction*. It works, no more corruptions. But it requres > > taking > > > > locks > > > > > > on > > > > > > > the whole paths *including root*. So move, delete and mkdirs > > > > opeartions > > > > > > *can > > > > > > > no longer be concurrent*. > > > > > > > > > > > > > > Probably there is a way to relax this while still ensuring > > > > consistency, > > > > > > but > > > > > > > I do not see how. One idea is to store real path inside each > > entry. > > > > This > > > > > > > way we will be able to ensure that it is still at a valid > > location > > > > > > without > > > > > > > blocking parents, so concurrnecy will be restored. But we will > > have > > > > to > > > > > > > propagate strucutral changes to children. E.g. move of a folder > > with > > > > 100 > > > > > > > items will lead to update of >100 cache entries. Not so good. > > > > > > > > > > > > > > Any other ideas? > > > > > > > > > > > > > > Vladimir. > > > > > > > > > > > >
signature.asc
Description: Digital signature