On 16 Jan 2017, at 11:06, Hyukjin Kwon <gurwls...@gmail.com> wrote:
Hi, I just looked through Jacek's page and I believe that is the correct way. This seems to be a Hadoop-library-specific issue [1]. To my knowledge, winutils and the binaries in the private repo are built by a Hadoop PMC member on a dedicated Windows VM, which I believe is pretty trustworthy. Thank you :)

I also check out and build the specific git commit SHA1 of the release, not any (movable) tag, so my builds come from sources identical to the matching releases.

They can also be compiled from source. If you think the prebuilt binaries are not reliable or not safe, you can go and build them yourself.

I agree it would be great if there were documentation about this, since we make only a weak promise for Windows support [2] and installing Spark on Windows always involves some overhead. FWIW, in the case of SparkR there is some documentation [3].

As for bundling it, it seems even Hadoop itself does not include winutils in its releases. I think documentation would be enough.

Really, Hadoop itself should be doing the release of the Windows binaries. It's just that it complicates the release process: the Linux build/test/release would have to be done, then somehow the Windows artifacts would need to be built on another machine and mixed in. That's the real barrier: extra work. That said, maybe it's time.

As for the many JIRAs, I am at least resolving them one by one.

I hope my answer is helpful and makes sense. Thanks.

[1] https://wiki.apache.org/hadoop/WindowsProblems
[2] https://github.com/apache/spark/blob/f3a3fed76cb74ecd0f46031f337576ce60f54fb2/docs/index.md
[3] https://github.com/apache/spark/blob/master/R/WINDOWS.md

2017-01-16 19:35 GMT+09:00 assaf.mendelson <assaf.mendel...@rsa.com>:

Hi,

The documentation says Spark is supported on Windows. The problem, however, is that the documentation on running it on Windows is lacking. There are sources (such as https://jaceklaskowski.gitbooks.io/mastering-apache-spark/content/spark-tips-and-tricks-running-spark-windows.html and many more) which explain how to make Spark run on Windows; however, they all involve downloading a third-party winutils.exe file. Since this file is downloaded from a repository belonging to a private person, this can be an issue (e.g. getting approval to install it on a company computer). There are tons of JIRA tickets on the subject (most marked as duplicate or "not a problem"); however, I believe that if we say Spark is supported on Windows, there should be a clear explanation of how to run it, and one shouldn't have to use an executable from a private person. If using winutils.exe is indeed the correct solution, I believe it should be bundled with the Spark binary distribution along with clear instructions on how to add it.

Assaf.

________________________________
View this message in context: spark support on windows <http://apache-spark-developers-list.1001551.n3.nabble.com/spark-support-on-windows-tp20614.html>
Sent from the Apache Spark Developers List mailing list archive <http://apache-spark-developers-list.1001551.n3.nabble.com/> at Nabble.com.
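For anyone following the thread, here is a minimal sketch of the kind of setup guides like Jacek's typically describe, assuming winutils.exe has already been downloaded and placed under C:\hadoop\bin. The path and the PySpark snippet are illustrative assumptions, not anything the Spark documentation itself mandates:

    # Minimal sketch, assuming winutils.exe sits in C:\hadoop\bin (example path).
    import os

    os.environ["HADOOP_HOME"] = r"C:\hadoop"             # directory containing bin\winutils.exe
    os.environ["PATH"] += os.pathsep + r"C:\hadoop\bin"  # make winutils.exe discoverable

    from pyspark.sql import SparkSession

    spark = (SparkSession.builder
             .appName("windows-smoke-test")
             .master("local[*]")
             .getOrCreate())

    print(spark.range(10).count())  # quick sanity check that the local session works
    spark.stop()

The same guides also typically suggest running "winutils.exe chmod -R 777 C:\tmp\hive" beforehand if you plan to enable Hive support, so that Spark has a writable scratch directory.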