> * Bare Naked Local File System v0.1.0 doesn't (yet) support symlinks
>   or the sticky bit.

OK to not support symlinks. HDFS symlinks are not being maintained, and I am not aware of anything relying on them, so I assume people don't need them.
Sticky bit would be useful, I guess. I suppose folks working at Microsoft would be more interested in this work? Last time I heard, Gautham and Inigo were revamping Hadoop's Windows support.

> * But the bigger issue is how to excise Winutils completely in the
>   existing Hadoop code. Winutils assumptions are hard-coded at a low
>   level across various classes, even in code that has nothing to do
>   with the file system. The startup configuration, for example, calls
>   `StringUtils.equalsIgnoreCase("true", valueString)`, which loads the
>   `StringUtils` class, which has a static reference to `Shell`, which
>   has a static block that checks for `WINUTILS_EXE`.
> * For the most part there should no longer even be a need for anything
>   but direct Java API access for the local file system. But muddling
>   things further, the existing `RawLocalFileSystem` implementation has
>   /four/ ways to access the local file system: Winutils, JNI calls,
>   shell access, and a "new" approach using "stat". The "stat" approach
>   has been switched off with a hard-coded
>   `useDeprecatedFileStatus = true` because of HADOOP-9652
>   <https://issues.apache.org/jira/browse/HADOOP-9652>.
> * Local file access is not contained within `RawLocalFileSystem` but
>   is scattered across other classes; `FileUtil.readLink()`, for example
>   (which `RawLocalFileSystem` calls because of the deprecation issue
>   above), uses the shell approach without any option to change it.
>   (This implementation-specific decision should have been contained
>   within the `FileSystem` implementation itself.)
>
> In short, it's a mess that has accumulated over years and is getting
> worse, charging high interest on what at first was a small,
> self-contained technical debt.
>
> I would welcome the opportunity to clean up this mess. I'm probably as
> qualified as anyone to make the changes.
> This is one of my areas of expertise: I was designing a full abstract
> file system interface (with pure-Java from-scratch implementations for
> the local file system, Subversion, and WebDAV; even the WebDAV HTTP
> implementation was from scratch) around the time Apache Nutch was
> getting off the ground. Most recently I've worked on the Hadoop
> `FileSystem` API while contracting for LinkedIn, discovering (what I
> consider to be) a huge bug in `ViewFileSystem`, HADOOP-18525
> <https://issues.apache.org/jira/browse/HADOOP-18525>.
>
> The cleanup should be done in several stages (e.g. consolidating
> Winutils access; replacing code with pure Java API calls; undeprecating
> the new Stat code and relegating it to a different class, etc.).
> Unfortunately it's not financially feasible for me to sit here for
> several months and revamp the Hadoop `FileSystem` subsystem for fun
> (even though I wish I could). Perhaps there is a job opening at a
> company related to Hadoop that would be interested in hiring me and
> devoting a certain percentage of my time to fixing local `FileSystem`
> access. If so, let me know where I should send my resume
> <https://www.garretwilson.com/about/resume>.
>
> Otherwise let me know if you have any ideas for a way forward. If
> there proves to be interest in GlobalMentor Hadoop Bare Naked Local
> FileSystem <https://github.com/globalmentor/hadoop-bare-naked-local-fs>
> on GitHub I'll try to maintain and improve it, but really what needs
> to be revamped is the Hadoop codebase itself. I'll be happy when
> Hadoop is fixed so that both Steve's code and my code are no longer
> needed.
>
> Garret
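
As a rough illustration of the "direct Java API access" point in the second bullet, here is a minimal sketch that reads local file metadata and resolves a link using only `java.nio.file`, with no shell, JNI, or Winutils involved. `LocalFileProbe`, `stat`, and `readLink` are hypothetical names made up for this example; this is not Hadoop code, and it is not how `RawLocalFileSystem` or Bare Naked Local FileSystem is actually implemented.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.LinkOption;
import java.nio.file.Path;
import java.nio.file.attribute.BasicFileAttributes;
import java.util.Optional;

/** Hypothetical helper (not part of Hadoop): local file metadata via java.nio.file only. */
final class LocalFileProbe {

  private LocalFileProbe() {}

  /** Reads basic attributes of the path itself, without following a symbolic link. */
  static BasicFileAttributes stat(Path path) throws IOException {
    return Files.readAttributes(path, BasicFileAttributes.class, LinkOption.NOFOLLOW_LINKS);
  }

  /** Returns the link target if the path is a symbolic link, otherwise an empty Optional. */
  static Optional<Path> readLink(Path path) throws IOException {
    if (!Files.isSymbolicLink(path)) {
      return Optional.empty();
    }
    return Optional.of(Files.readSymbolicLink(path));
  }

  public static void main(String[] args) throws IOException {
    Path file = Files.createTempFile("probe", ".txt");
    try {
      Files.write(file, new byte[] {1, 2, 3});
      BasicFileAttributes attrs = stat(file);
      System.out.println("size=" + attrs.size() + " dir=" + attrs.isDirectory());
      System.out.println("link=" + readLink(file));
    } finally {
      Files.deleteIfExists(file);
    }
  }
}
```

One caveat on the sticky bit: `java.nio.file.attribute.PosixFilePermission` only models the nine owner/group/other read/write/execute bits, so a sticky bit would need separate handling.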