I can say from my experience that getting Spark to work with Hadoop 2
is not for the beginner; after solving one problem after another
(dependencies, scripts, etc.), I went back to Hadoop 1.

Spark's Maven build, EC2 scripts, and other tooling all default to
Hadoop 1. I'm not sure why, but given that, Hadoop 2 has too many bumps.
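
For anyone who still wants to try Hadoop 2, one cheap check is to confirm
which Hadoop version a prebuilt Spark jar targets. The jar name in the
quoted command (spark-examples-1.0.0-hadoop2.2.0.jar) encodes it. Below is
a minimal sketch, assuming the file name follows that
"-hadoop<version>.jar" pattern; the function name hadoop_version_of_jar is
made up for illustration and is not part of Spark.

```shell
#!/bin/sh
# Hypothetical helper (not part of Spark): extract the Hadoop version
# embedded in a Spark jar file name such as
# spark-examples-1.0.0-hadoop2.2.0.jar.
hadoop_version_of_jar() {
    # Drop the directory part, then print only what sits between
    # "-hadoop" and ".jar". Prints nothing if the name does not match.
    basename "$1" | sed -n 's/.*-hadoop\([0-9][0-9.]*\)\.jar$/\1/p'
}

# Jar path taken from the spark-submit command quoted in this thread.
hadoop_version_of_jar ./lib/spark-examples-1.0.0-hadoop2.2.0.jar
```

The printed version can then be compared with what `hadoop version`
reports on the cluster. A mismatch would not explain every failure, but it
is quick to rule out.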

On 7/6/14, Marco Shaw <[email protected]> wrote:
> That is confusing based on the context you provided.
>
> This might take more time than I can spare to try to understand.
>
> For sure, you need to add Spark to run it in/on the HDP 2.1 express VM.
>
> Cloudera's CDH 5 express VM includes Spark, but the service isn't running by
> default.
>
> I can't remember for MapR...
>
> Marco
>
>> On Jul 6, 2014, at 6:33 PM, Konstantin Kudryavtsev
>> <[email protected]> wrote:
>>
>> Marco,
>>
>> Hortonworks provides a Tech Preview of Spark 0.9.1 with HDP 2.1 that you
>> can try from
>> http://hortonworks.com/wp-content/uploads/2014/05/SparkTechnicalPreview.pdf
>> HDP 2.1 means YARN, yet at the same time they propose installing an RPM.
>>
>> On the other hand, http://spark.apache.org/ says: "
>> Integrated with Hadoop
>> Spark can run on Hadoop 2's YARN cluster manager, and can read any
>> existing Hadoop data.
>>
>> If you have a Hadoop 2 cluster, you can run Spark without any installation
>> needed. "
>>
>> This is confusing to me... do I need the RPM installation or not?
>>
>>
>> Thank you,
>> Konstantin Kudryavtsev
>>
>>
>>> On Sun, Jul 6, 2014 at 10:56 PM, Marco Shaw <[email protected]>
>>> wrote:
>>> Can you provide links to the sections that are confusing?
>>>
>>> My understanding is that the HDP1 binaries do not need YARN, while the
>>> HDP2 binaries do.
>>>
>>> Now, you can also install Hortonworks Spark RPM...
>>>
>>> For production, in my opinion, RPMs are better for manageability.
>>>
>>>> On Jul 6, 2014, at 5:39 PM, Konstantin Kudryavtsev
>>>> <[email protected]> wrote:
>>>>
>>>> Hello, thanks for your message... I'm confused: Hortonworks suggests
>>>> installing the Spark RPM on each node, but the Spark main page says
>>>> that YARN is enough and I don't need to install it... What's the
>>>> difference?
>>>>
>>>> sent from my HTC
>>>>
>>>>> On Jul 6, 2014 8:34 PM, "vs" <[email protected]> wrote:
>>>>> Konstantin,
>>>>>
>>>>> HWRK provides a Tech Preview of Spark 0.9.1 with HDP 2.1 that you can
>>>>> try
>>>>> from
>>>>> http://hortonworks.com/wp-content/uploads/2014/05/SparkTechnicalPreview.pdf
>>>>>
>>>>> Let me know if you see issues with the tech preview.
>>>>>
>>>>> "spark PI example on HDP 2.0
>>>>>
>>>>> I downloaded the Spark 1.0 pre-built package from
>>>>> http://spark.apache.org/downloads.html
>>>>> (for HDP2)
>>>>> Then I ran the example from the Spark website:
>>>>> ./bin/spark-submit --class org.apache.spark.examples.SparkPi \
>>>>>   --master yarn-cluster --num-executors 3 --driver-memory 2g \
>>>>>   --executor-memory 2g --executor-cores 1 \
>>>>>   ./lib/spark-examples-1.0.0-hadoop2.2.0.jar 2
>>>>>
>>>>> I got error:
>>>>> Application application_1404470405736_0044 failed 3 times due to AM
>>>>> Container for appattempt_1404470405736_0044_000003 exited with
>>>>> exitCode: 1 due to: Exception from container-launch:
>>>>> org.apache.hadoop.util.Shell$ExitCodeException:
>>>>> at org.apache.hadoop.util.Shell.runCommand(Shell.java:464)
>>>>> at org.apache.hadoop.util.Shell.run(Shell.java:379)
>>>>> at org.apache.hadoop.util.Shell$ShellCommandExecutor.execute(Shell.java:589)
>>>>> at org.apache.hadoop.yarn.server.nodemanager.DefaultContainerExecutor.launchContainer(DefaultContainerExecutor.java:195)
>>>>> at org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.ContainerLaunch.call(ContainerLaunch.java:283)
>>>>> at org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.ContainerLaunch.call(ContainerLaunch.java:79)
>>>>> at java.util.concurrent.FutureTask.run(FutureTask.java:262)
>>>>> at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
>>>>> at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
>>>>> at java.lang.Thread.run(Thread.java:744)
>>>>> .Failing this attempt.. Failing the application.
>>>>>
>>>>> Unknown/unsupported param List(--executor-memory, 2048, --executor-cores, 1, --num-executors, 3)
>>>>> Usage: org.apache.spark.deploy.yarn.ApplicationMaster [options]
>>>>> Options:
>>>>>   --jar JAR_PATH       Path to your application's JAR file (required)
>>>>>   --class CLASS_NAME   Name of your application's main class (required)
>>>>> ...bla-bla-bla
>>>>> "
>>>>>
>>>>>
>>>>>
>>>>> --
>>>>> View this message in context:
>>>>> http://apache-spark-user-list.1001560.n3.nabble.com/Unable-to-run-Spark-1-0-SparkPi-on-HDP-2-0-tp8802p8873.html
>>>>> Sent from the Apache Spark User List mailing list archive at
>>>>> Nabble.com.
>>
>
