Thanks! Will take a look at this later today. HTH!
On Jul 3, 2014, at 11:09 AM, Kostiantyn Kudriavtsev <kudryavtsev.konstan...@gmail.com> wrote:

Hi Denny,

just created https://issues.apache.org/jira/browse/SPARK-2356

On Jul 3, 2014, at 7:06 PM, Denny Lee <denny.g....@gmail.com> wrote:

Hi Konstantin,

Could you please create a jira item at https://issues.apache.org/jira/browse/SPARK/ so this issue can be tracked?

Thanks,
Denny

On July 2, 2014 at 11:45:24 PM, Konstantin Kudryavtsev (kudryavtsev.konstan...@gmail.com) wrote:

It sounds really strange... I guess it is a bug, a critical one, and it must be fixed... at the very least some flag should be added (unable.hadoop).

I found the following workaround:
1) download a compiled winutils.exe from http://social.msdn.microsoft.com/Forums/windowsazure/en-US/28a57efb-082b-424b-8d9e-731b1fe135de/please-read-if-experiencing-job-failures?forum=hdinsight
2) put this file into d:\winutil\bin
3) add in my test: System.setProperty("hadoop.home.dir", "d:\\winutil\\")

After that, the test runs.

Thank you,
Konstantin Kudryavtsev
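For reference, a minimal, self-contained sketch of that workaround applied in a JUnit test (the class and test names here are illustrative, not from the thread). The property has to be set before the first SparkContext is created, because that is the point at which Spark loads Hadoop's Shell class:

import org.apache.spark.{SparkConf, SparkContext}
import org.junit.{Assert, Test}

class WinutilsWorkaroundTest {
  @Test
  def testWithWinutils(): Unit = {
    // Must run before the first SparkContext is constructed: per the stack
    // trace below, SparkContext initializes SparkHadoopUtil, which loads
    // Hadoop's Shell class, which looks for winutils.exe under
    // hadoop.home.dir\bin on Windows.
    System.setProperty("hadoop.home.dir", "d:\\winutil\\")

    val sc = new SparkContext("local", "test", new SparkConf())
    try {
      val data = sc.parallelize(List("in1", "in2", "in3"))
      Assert.assertEquals(3L, data.count())
    } finally {
      sc.stop()
    }
  }
}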
On Wed, Jul 2, 2014 at 10:24 PM, Denny Lee <denny.g....@gmail.com> wrote:

You don't actually need it per se; it's just that some of the Spark libraries reference Hadoop libraries even if they ultimately do not call them. When I was doing some early builds of Spark on Windows, I admittedly had Hadoop running on Windows as well, so I had not run into this particular issue.

On Wed, Jul 2, 2014 at 12:04 PM, Kostiantyn Kudriavtsev <kudryavtsev.konstan...@gmail.com> wrote:

No, I don't.

Why would I need HDP installed? I don't use Hadoop at all, and I'd like to read data from the local filesystem.

On Jul 2, 2014, at 9:10 PM, Denny Lee <denny.g....@gmail.com> wrote:

By any chance do you have HDP 2.1 installed? You may need to install the utils and update the env variables per http://stackoverflow.com/questions/18630019/running-apache-hadoop-2-1-0-on-windows

On Jul 2, 2014, at 10:20 AM, Konstantin Kudryavtsev <kudryavtsev.konstan...@gmail.com> wrote:

Hi Andrew,

it's Windows 7, and I haven't set up any env variables here.

The full stack trace:

14/07/02 19:59:31 WARN NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
14/07/02 19:59:31 ERROR Shell: Failed to locate the winutils binary in the hadoop binary path
java.io.IOException: Could not locate executable null\bin\winutils.exe in the Hadoop binaries.
    at org.apache.hadoop.util.Shell.getQualifiedBinPath(Shell.java:318)
    at org.apache.hadoop.util.Shell.getWinUtilsPath(Shell.java:333)
    at org.apache.hadoop.util.Shell.<clinit>(Shell.java:326)
    at org.apache.hadoop.util.StringUtils.<clinit>(StringUtils.java:76)
    at org.apache.hadoop.security.Groups.parseStaticMapping(Groups.java:93)
    at org.apache.hadoop.security.Groups.<init>(Groups.java:77)
    at org.apache.hadoop.security.Groups.getUserToGroupsMappingService(Groups.java:240)
    at org.apache.hadoop.security.UserGroupInformation.initialize(UserGroupInformation.java:255)
    at org.apache.hadoop.security.UserGroupInformation.setConfiguration(UserGroupInformation.java:283)
    at org.apache.spark.deploy.SparkHadoopUtil.<init>(SparkHadoopUtil.scala:36)
    at org.apache.spark.deploy.SparkHadoopUtil$.<init>(SparkHadoopUtil.scala:109)
    at org.apache.spark.deploy.SparkHadoopUtil$.<clinit>(SparkHadoopUtil.scala)
    at org.apache.spark.SparkContext.<init>(SparkContext.scala:228)
    at org.apache.spark.SparkContext.<init>(SparkContext.scala:97)
    at my.example.EtlTest.testETL(IxtoolsDailyAggTest.scala:13)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:606)
    at junit.framework.TestCase.runTest(TestCase.java:168)
    at junit.framework.TestCase.runBare(TestCase.java:134)
    at junit.framework.TestResult$1.protect(TestResult.java:110)
    at junit.framework.TestResult.runProtected(TestResult.java:128)
    at junit.framework.TestResult.run(TestResult.java:113)
    at junit.framework.TestCase.run(TestCase.java:124)
    at junit.framework.TestSuite.runTest(TestSuite.java:232)
    at junit.framework.TestSuite.run(TestSuite.java:227)
    at org.junit.internal.runners.JUnit38ClassRunner.run(JUnit38ClassRunner.java:81)
    at org.junit.runner.JUnitCore.run(JUnitCore.java:130)
    at com.intellij.junit4.JUnit4IdeaTestRunner.startRunnerWithArgs(JUnit4IdeaTestRunner.java:74)
    at com.intellij.rt.execution.junit.JUnitStarter.prepareStreamsAndStart(JUnitStarter.java:211)
    at com.intellij.rt.execution.junit.JUnitStarter.main(JUnitStarter.java:67)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:606)
    at com.intellij.rt.execution.application.AppMain.main(AppMain.java:120)

Thank you,
Konstantin Kudryavtsev
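As for the "null" in that path (which Andrew asks about below): in Hadoop 2.x, Shell resolves winutils.exe against the hadoop.home.dir system property, falling back to the HADOOP_HOME environment variable; when neither is set, the prefix is concatenated as the literal string "null", producing null\bin\winutils.exe. A quick two-line check, as a sketch:

// If both of these print null, Shell.getQualifiedBinPath will go
// looking for the "null\bin\winutils.exe" seen in the trace above.
println("hadoop.home.dir = " + System.getProperty("hadoop.home.dir"))
println("HADOOP_HOME     = " + System.getenv("HADOOP_HOME"))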
On Wed, Jul 2, 2014 at 8:15 PM, Andrew Or <and...@databricks.com> wrote:

Hi Konstantin,

We use Hadoop as a library in a few places in Spark. I wonder why the path includes "null", though.

Could you provide the full stack trace?

Andrew

2014-07-02 9:38 GMT-07:00 Konstantin Kudryavtsev <kudryavtsev.konstan...@gmail.com>:

Hi all,

I'm trying to run some transformations on Spark. Everything works fine on the cluster (YARN, Linux machines). However, when I try to run it on a local machine (Windows 7) from a unit test, I get errors:

java.io.IOException: Could not locate executable null\bin\winutils.exe in the Hadoop binaries.
    at org.apache.hadoop.util.Shell.getQualifiedBinPath(Shell.java:318)
    at org.apache.hadoop.util.Shell.getWinUtilsPath(Shell.java:333)
    at org.apache.hadoop.util.Shell.<clinit>(Shell.java:326)
    at org.apache.hadoop.util.StringUtils.<clinit>(StringUtils.java:76)
    at org.apache.hadoop.security.Groups.parseStaticMapping(Groups.java:93)

My code is the following:

@Test
def testETL() = {
  val conf = new SparkConf()
  val sc = new SparkContext("local", "test", conf)
  try {
    val etl = new IxtoolsDailyAgg() // empty constructor

    val data = sc.parallelize(List("in1", "in2", "in3"))

    etl.etl(data) // RDD transformation, no access to SparkContext or Hadoop
    Assert.assertTrue(true)
  } finally {
    if (sc != null)
      sc.stop()
  }
}

Why is it trying to access Hadoop at all, and how can I fix it? Thank you in advance.

Thank you,
Konstantin Kudryavtsev
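Since the same tests also run on the Linux/YARN cluster, the workaround from earlier in the thread can be applied conditionally so that nothing changes outside Windows. A sketch of a small base class for such tests, assuming the d:\winutil layout from above (the class and method names are illustrative):

import org.junit.Before

abstract class WindowsFriendlySparkTest {
  @Before
  def setHadoopHomeOnWindows(): Unit = {
    // winutils.exe is only needed on Windows; elsewhere this is a no-op.
    // An explicitly configured hadoop.home.dir is left untouched.
    if (System.getProperty("os.name").toLowerCase.contains("windows") &&
        System.getProperty("hadoop.home.dir") == null) {
      System.setProperty("hadoop.home.dir", "d:\\winutil\\")
    }
  }
}

A test class like the one above could then extend this base class, since JUnit runs inherited @Before methods before each test.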