Thanks! Will take a look at this later today. HTH!
On Jul 3, 2014, at 11:09 AM, Kostiantyn Kudriavtsev <kudryavtsev.konstan...@gmail.com> wrote:

Hi Denny,

just created https://issues.apache.org/jira/browse/SPARK-2356

On Jul 3, 2014, at 7:06 PM, Denny Lee <denny.g....@gmail.com> wrote:

Hi Konstantin,

Could you please create a jira item at https://issues.apache.org/jira/browse/SPARK/ so this issue can be tracked?

Thanks,
Denny

On July 2, 2014 at 11:45:24 PM, Konstantin Kudryavtsev (kudryavtsev.konstan...@gmail.com) wrote:

It sounds really strange... I guess it is a bug, a critical one, and it must be fixed... at the very least some flag should be added (unable.hadoop).

I found the following workaround:
1) download a compiled winutils.exe from http://social.msdn.microsoft.com/Forums/windowsazure/en-US/28a57efb-082b-424b-8d9e-731b1fe135de/please-read-if-experiencing-job-failures?forum=hdinsight
2) put this file into d:\winutil\bin
3) add in my test: System.setProperty("hadoop.home.dir", "d:\\winutil\\")

After that, the test runs.

Thank you,
Konstantin Kudryavtsev
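For reference, a minimal, self-contained sketch of that workaround applied in a JUnit test (the class and test names here are illustrative, not from the thread). The property has to be set before the first SparkContext is created, because that is the point at which Spark loads Hadoop's Shell class:

import org.apache.spark.{SparkConf, SparkContext}
import org.junit.{Assert, Test}

class WinutilsWorkaroundTest {
  @Test
  def testWithWinutils(): Unit = {
    // Must run before the first SparkContext is constructed: per the stack
    // trace below, SparkContext initializes SparkHadoopUtil, which loads
    // Hadoop's Shell class, which looks for winutils.exe under
    // hadoop.home.dir\bin on Windows.
    System.setProperty("hadoop.home.dir", "d:\\winutil\\")

    val sc = new SparkContext("local", "test", new SparkConf())
    try {
      val data = sc.parallelize(List("in1", "in2", "in3"))
      Assert.assertEquals(3L, data.count())
    } finally {
      sc.stop()
    }
  }
}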
On Wed, Jul 2, 2014 at 10:24 PM, Denny Lee <denny.g....@gmail.com> wrote:

You don't actually need it per se; it's just that some of the Spark libraries reference Hadoop libraries even if they ultimately do not call them. When I was doing some early builds of Spark on Windows, I admittedly had Hadoop running on Windows as well, so I had not run into this particular issue.

On Wed, Jul 2, 2014 at 12:04 PM, Kostiantyn Kudriavtsev <kudryavtsev.konstan...@gmail.com> wrote:

No, I don't.

Why would I need HDP installed? I don't use Hadoop at all, and I'd like to read data from the local filesystem.

On Jul 2, 2014, at 9:10 PM, Denny Lee <denny.g....@gmail.com> wrote:

By any chance do you have HDP 2.1 installed? You may need to install the utils and update the env variables per http://stackoverflow.com/questions/18630019/running-apache-hadoop-2-1-0-on-windows

On Jul 2, 2014, at 10:20 AM, Konstantin Kudryavtsev <kudryavtsev.konstan...@gmail.com> wrote:

Hi Andrew,

it's Windows 7, and I haven't set up any env variables here.

The full stack trace:

14/07/02 19:59:31 WARN NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
14/07/02 19:59:31 ERROR Shell: Failed to locate the winutils binary in the hadoop binary path
java.io.IOException: Could not locate executable null\bin\winutils.exe in the Hadoop binaries.
    at org.apache.hadoop.util.Shell.getQualifiedBinPath(Shell.java:318)
    at org.apache.hadoop.util.Shell.getWinUtilsPath(Shell.java:333)
    at org.apache.hadoop.util.Shell.<clinit>(Shell.java:326)
    at org.apache.hadoop.util.StringUtils.<clinit>(StringUtils.java:76)
    at org.apache.hadoop.security.Groups.parseStaticMapping(Groups.java:93)
    at org.apache.hadoop.security.Groups.<init>(Groups.java:77)
    at org.apache.hadoop.security.Groups.getUserToGroupsMappingService(Groups.java:240)
    at org.apache.hadoop.security.UserGroupInformation.initialize(UserGroupInformation.java:255)
    at org.apache.hadoop.security.UserGroupInformation.setConfiguration(UserGroupInformation.java:283)
    at org.apache.spark.deploy.SparkHadoopUtil.<init>(SparkHadoopUtil.scala:36)
    at org.apache.spark.deploy.SparkHadoopUtil$.<init>(SparkHadoopUtil.scala:109)
    at org.apache.spark.deploy.SparkHadoopUtil$.<clinit>(SparkHadoopUtil.scala)
    at org.apache.spark.SparkContext.<init>(SparkContext.scala:228)
    at org.apache.spark.SparkContext.<init>(SparkContext.scala:97)
    at my.example.EtlTest.testETL(IxtoolsDailyAggTest.scala:13)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:606)
    at junit.framework.TestCase.runTest(TestCase.java:168)
    at junit.framework.TestCase.runBare(TestCase.java:134)
    at junit.framework.TestResult$1.protect(TestResult.java:110)
    at junit.framework.TestResult.runProtected(TestResult.java:128)
    at junit.framework.TestResult.run(TestResult.java:113)
    at junit.framework.TestCase.run(TestCase.java:124)
    at junit.framework.TestSuite.runTest(TestSuite.java:232)
    at junit.framework.TestSuite.run(TestSuite.java:227)
    at org.junit.internal.runners.JUnit38ClassRunner.run(JUnit38ClassRunner.java:81)
    at org.junit.runner.JUnitCore.run(JUnitCore.java:130)
    at com.intellij.junit4.JUnit4IdeaTestRunner.startRunnerWithArgs(JUnit4IdeaTestRunner.java:74)
    at com.intellij.rt.execution.junit.JUnitStarter.prepareStreamsAndStart(JUnitStarter.java:211)
    at com.intellij.rt.execution.junit.JUnitStarter.main(JUnitStarter.java:67)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:606)
    at com.intellij.rt.execution.application.AppMain.main(AppMain.java:120)

Thank you,
Konstantin Kudryavtsev
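As for the "null" in that path (which Andrew asks about below): in Hadoop 2.x, Shell resolves winutils.exe against the hadoop.home.dir system property, falling back to the HADOOP_HOME environment variable; when neither is set, the prefix is concatenated as the literal string "null", producing null\bin\winutils.exe. A quick two-line check, as a sketch:

// If both of these print null, Shell.getQualifiedBinPath will go
// looking for the "null\bin\winutils.exe" seen in the trace above.
println("hadoop.home.dir = " + System.getProperty("hadoop.home.dir"))
println("HADOOP_HOME     = " + System.getenv("HADOOP_HOME"))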
On Wed, Jul 2, 2014 at 8:15 PM, Andrew Or <and...@databricks.com> wrote:

Hi Konstantin,

We use Hadoop as a library in a few places in Spark. I wonder why the path includes "null", though.

Could you provide the full stack trace?

Andrew

2014-07-02 9:38 GMT-07:00 Konstantin Kudryavtsev <kudryavtsev.konstan...@gmail.com>:

Hi all,

I'm trying to run some transformations on Spark. Everything works fine on the cluster (YARN, Linux machines). However, when I try to run it on a local machine (Windows 7) from a unit test, I get errors:

java.io.IOException: Could not locate executable null\bin\winutils.exe in the Hadoop binaries.
    at org.apache.hadoop.util.Shell.getQualifiedBinPath(Shell.java:318)
    at org.apache.hadoop.util.Shell.getWinUtilsPath(Shell.java:333)
    at org.apache.hadoop.util.Shell.<clinit>(Shell.java:326)
    at org.apache.hadoop.util.StringUtils.<clinit>(StringUtils.java:76)
    at org.apache.hadoop.security.Groups.parseStaticMapping(Groups.java:93)

My code is the following:

@Test
def testETL() = {
  val conf = new SparkConf()
  val sc = new SparkContext("local", "test", conf)
  try {
    val etl = new IxtoolsDailyAgg() // empty constructor

    val data = sc.parallelize(List("in1", "in2", "in3"))

    etl.etl(data) // RDD transformation, no access to SparkContext or Hadoop
    Assert.assertTrue(true)
  } finally {
    if (sc != null)
      sc.stop()
  }
}

Why is it trying to access Hadoop at all, and how can I fix it? Thank you in advance.

Thank you,
Konstantin Kudryavtsev
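Since the same tests also run on the Linux/YARN cluster, the workaround from earlier in the thread can be applied conditionally so that nothing changes outside Windows. A sketch of a small base class for such tests, assuming the d:\winutil layout from above (the class and method names are illustrative):

import org.junit.Before

abstract class WindowsFriendlySparkTest {
  @Before
  def setHadoopHomeOnWindows(): Unit = {
    // winutils.exe is only needed on Windows; elsewhere this is a no-op.
    // An explicitly configured hadoop.home.dir is left untouched.
    if (System.getProperty("os.name").toLowerCase.contains("windows") &&
        System.getProperty("hadoop.home.dir") == null) {
      System.setProperty("hadoop.home.dir", "d:\\winutil\\")
    }
  }
}

A test class like the one above could then extend this base class, since JUnit runs inherited @Before methods before each test.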