On Sun, 20 Feb 2022 at 18:42, Gautham Banasandra <gaur...@apache.org> wrote:
> Hi all,
>
> I've been working on getting Hadoop to build on Windows for quite some time
> now. We're now at a stage where we can parallelize the effort and complete
> this sooner. I've outlined the parts that are remaining. Please get in
> touch with me if anyone wishes to join hands in realizing this goal.
>
> *Why do we need Hadoop to run on Windows?*
> Windows has a very large user base. The modern software alternatives to
> Hadoop (like Kubernetes) are cross-platform by design. We have to
> acknowledge the fact that it isn't easy to get Hadoop running on Windows.
> The reason why we haven't seen much adoption of Hadoop on Windows is
> probably because of issues like compilation, requiring work-arounds every
> step of the way, etc. If we were to nail these issues, I believe it would
> tremendously expand the usage of Hadoop.
>
>
> *Phase 3 : Resolving systemic issues*
> 1. [HADOOP-13223] winutils.exe is a bug nexus and should be killed with an
> axe. - ASF JIRA (apache.org)
> <https://issues.apache.org/jira/browse/HADOOP-13223>
> The Hadoop environment is modeled closer to that of Linux than Windows.
> Thus, we see a lot of functional gaps between running Hadoop on Linux vs.
> Windows, which have become the source of bugs when it comes to running
> Hadoop on Windows. One such issue is that of winutils.exe. We can aim to
> address issues like these in this phase. I plan to provide a JNI
> implementation for each platform and unify these under a common file system
> interface, so that we get stack traces for exceptions thrown in these
> layers and, mostly, so that we don't have any disparity between the
> platforms.
>
>

I for one endorse this JIRA. Given that a lot of it is for FS permissions,
maybe whatever you do can downgrade gracefully, so that running Spark in
local mode on a Windows laptop becomes easy. Those users do not need the
POSIX permissions model.
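
To make the "downgrade" idea concrete, here is a rough sketch of what I mean.
It is not Hadoop code: LocalPermissionOps, PosixPermissionOps,
NoOpPermissionOps and PermissionOpsFactory are all hypothetical names, and it
uses plain java.nio rather than the JNI layer proposed above. The only point
is that callers go through one interface, and on a platform without POSIX
attributes the permission call becomes a reported no-op instead of a failure.

```java
import java.io.IOException;
import java.nio.file.FileSystems;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.attribute.PosixFilePermission;
import java.util.Set;

// Hypothetical interface (not a Hadoop API): one entry point for all platforms.
interface LocalPermissionOps {
    // Returns true if the permissions were applied, false if the call was
    // downgraded to a no-op because the platform has no POSIX model.
    boolean setPermissions(Path path, Set<PosixFilePermission> perms) throws IOException;
}

// Used where the default file system exposes POSIX attributes (Linux, macOS).
final class PosixPermissionOps implements LocalPermissionOps {
    @Override
    public boolean setPermissions(Path path, Set<PosixFilePermission> perms) throws IOException {
        Files.setPosixFilePermissions(path, perms);
        return true;
    }
}

// Used where POSIX attributes are unavailable (e.g. a Windows laptop):
// the request is ignored rather than failing or shelling out to a helper binary.
final class NoOpPermissionOps implements LocalPermissionOps {
    @Override
    public boolean setPermissions(Path path, Set<PosixFilePermission> perms) {
        return false;
    }
}

// Hypothetical factory: picks an implementation from what the JVM reports.
final class PermissionOpsFactory {
    private PermissionOpsFactory() {
    }

    static LocalPermissionOps forCurrentPlatform() {
        boolean posix = FileSystems.getDefault()
                .supportedFileAttributeViews()
                .contains("posix");
        return posix ? new PosixPermissionOps() : new NoOpPermissionOps();
    }
}
```

A caller would do PermissionOpsFactory.forCurrentPlatform().setPermissions(path, perms)
and treat a false return as "not enforced here". Whether silently skipping
permissions is acceptable outside local, single-user runs is a separate
question.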