On 16 Jan 2017, at 11:06, Hyukjin Kwon <gurwls...@gmail.com> wrote:

Hi,

I just looked through Jacek's page and I believe that is the correct way.

That seems to be a Hadoop-library-specific issue [1]. To my knowledge, winutils and the binaries in the private repo
are built by a Hadoop PMC member on a dedicated Windows VM, which I believe makes them pretty trustworthy.

thank you :)

I also check out and build from the specific git commit SHA-1 of each release, not a (movable) tag, so my builds use
exactly the same sources as the matching releases.

This can be compiled from source. If you think it is not reliable or not safe, you can go and build it yourself.
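
For example, once winutils.exe is in place (say under C:\hadoop\bin), pointing a local Spark session at it is roughly
like the sketch below. I have not verified this exact snippet, and the path is only a placeholder:

    // Sketch only: tell Hadoop's shell layer where winutils.exe lives.
    // Assumes it was copied to C:\hadoop\bin\winutils.exe; adjust the path as needed.
    System.setProperty("hadoop.home.dir", "C:\\hadoop")

    import org.apache.spark.sql.SparkSession

    val spark = SparkSession.builder()
      .master("local[*]")
      .appName("windows-smoke-test")
      .getOrCreate()

    spark.range(10).count()  // quick sanity check that the session works
    spark.stop()

Setting the HADOOP_HOME environment variable to C:\hadoop should work just as well; the system property is only
convenient when you cannot change the environment.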

I agree it would be great if there were documentation about this, as we make only a weak promise for Windows [2],
and I believe installing Spark on Windows always requires some overhead. FWIW, in the case of SparkR there is some
documentation [3].

As for bundling it, it seems even Hadoop itself does not include winutils in its releases. I think documentation would
be enough.

Really, Hadoop itself should be doing the release of the Windows binaries. It's just that it complicates the release
process: the Linux build/test/release would have to be done, then somehow the Windows pieces would need to be built on
another machine and mixed in. That's the real barrier: extra work. That said, maybe it's time.




As for the many JIRA tickets, I am at least resolving them one by one.

I hope my answer is helpful and makes sense.

Thanks.


[1] https://wiki.apache.org/hadoop/WindowsProblems
[2] https://github.com/apache/spark/blob/f3a3fed76cb74ecd0f46031f337576ce60f54fb2/docs/index.md
[3] https://github.com/apache/spark/blob/master/R/WINDOWS.md


2017-01-16 19:35 GMT+09:00 assaf.mendelson <assaf.mendel...@rsa.com>:
Hi,
In the documentation it says Spark is supported on Windows.
The problem, however, is that the documentation for Windows is lacking. There are sources (such as
https://jaceklaskowski.gitbooks.io/mastering-apache-spark/content/spark-tips-and-tricks-running-spark-windows.html
and many more) which explain how to make Spark run on Windows; however, they all involve downloading a third-party
winutils.exe file. Since this file is downloaded from a repository belonging to a private person, this can be an issue
(e.g. getting approval to install it on a company computer can be a problem).
There are tons of JIRA tickets on the subject (most are marked as duplicates or "not a problem"); however, I believe
that if we say Spark is supported on Windows, there should be a clear explanation of how to run it, and one shouldn't
have to use an executable from a private person.

If using winutils.exe is indeed the correct solution, I believe it should be bundled with the Spark binary
distribution, along with clear instructions on how to add it.
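
For example, the "clear instructions" could be as small as: copy winutils.exe into %HADOOP_HOME%\bin, plus a fail-fast
check like the rough sketch below (the paths are only placeholders, not something that exists today):

    import java.nio.file.{Files, Paths}

    // Sketch: fail fast with a clear message if winutils.exe is missing.
    // HADOOP_HOME and the C:\hadoop fallback are placeholders.
    val hadoopHome = sys.env.getOrElse("HADOOP_HOME", "C:\\hadoop")
    val winutils = Paths.get(hadoopHome, "bin", "winutils.exe")
    require(Files.exists(winutils),
      s"winutils.exe not found at $winutils; see the Hadoop wiki page on Windows problems")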
Assaf.
