Hi Sujit,

Thanks for your response. So I opened a new notebook using the command "ipython notebook --profile spark" and tried the sequence of commands, but I am getting errors. Attached is a screenshot of the same. I am also attaching my 00-pyspark-setup.py for your reference - it looks like I have written something wrong in it, but I cannot seem to figure out what. (A reference sketch of that setup script follows below.)
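For comparison, a startup script along the lines of the District Data Labs post would look roughly like this - a sketch, not the exact attached file; the py4j zip name varies by Spark release, so check the python\lib folder under SPARK_HOME:

    import os
    import sys

    # Pick up the Spark installation from the environment
    spark_home = os.environ.get('SPARK_HOME')
    if not spark_home:
        raise ValueError("SPARK_HOME environment variable is not set")

    # Put the PySpark libraries on the Python path; the py4j zip name
    # below matches Spark 1.3 - adjust it to whatever your install ships
    sys.path.insert(0, os.path.join(spark_home, 'python'))
    sys.path.insert(0, os.path.join(spark_home, 'python', 'lib', 'py4j-0.8.2.1-src.zip'))

    # Run the PySpark shell bootstrap, which creates the sc SparkContext
    execfile(os.path.join(spark_home, 'python', 'pyspark', 'shell.py'))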
Thank you for your help.

Sincerely,
Ashish Dutt

On Thu, Jul 9, 2015 at 11:53 AM, Sujit Pal <sujitatgt...@gmail.com> wrote:

> Hi Ashish,
>
> >> Nice post.
> Agreed, kudos to the author of the post, Benjamin Bengfort of District
> Data Labs.
>
> >> Following your post, I get this problem;
> Again, not my post.
>
> I did try setting up IPython with the Spark profile for the edX Intro to
> Spark course (because I didn't want to use the Vagrant container) and it
> worked flawlessly with the instructions provided (on OSX). I haven't used
> the IPython/PySpark environment beyond very basic tasks since then,
> though, because my employer has a Databricks license which we were
> already using for other stuff, and we ended up doing the labs on
> Databricks.
>
> Looking at your screenshot, though, I don't see why you think it's
> picking up the default profile. One simple way to check whether things
> are working is to open a new notebook and try this sequence of commands:
>
> from pyspark import SparkContext
> sc = SparkContext("local", "pyspark")
> sc
>
> You should see something like this after a little while:
>
> <pyspark.context.SparkContext at 0x1093c9b10>
>
> While the context is being instantiated, you should also see lots of log
> lines scroll by on the terminal where you started the "ipython notebook
> --profile spark" command - these log lines are from Spark.
>
> Hope this helps,
> Sujit
>
> On Wed, Jul 8, 2015 at 6:04 PM, Ashish Dutt <ashish.du...@gmail.com> wrote:
>
>> Hi Sujit,
>> Nice post. Exactly what I had been looking for.
>> I am relatively a beginner with Spark and real-time data processing. We
>> have a server with CDH5.4 and 4 nodes; the Spark version on our server
>> is 1.3.0.
>> On my laptop I also have Spark 1.3.0, in a Windows 7 environment. As per
>> point 5 of your post, I am able to invoke pyspark locally in standalone
>> mode.
>>
>> Following your post, I get this problem:
>>
>> 1. In the section "Using IPython Notebook with Spark" I cannot
>> understand why it is picking up the default profile and not the pyspark
>> profile. I am sure it is because of the path variables. Attached is the
>> screenshot. Can you suggest how to solve this?
>>
>> Currently the path variables on my laptop are:
>> SPARK_HOME="C:\SPARK-1.3.0\BIN", JAVA_HOME="C:\PROGRAM
>> FILES\JAVA\JDK1.7.0_79", HADOOP_HOME="D:\WINUTILS", M2_HOME="D:\MAVEN\BIN",
>> MAVEN_HOME="D:\MAVEN\BIN", PYTHON_HOME="C:\PYTHON27\", SBT_HOME="C:\SBT\"
>>
>> Sincerely,
>> Ashish Dutt
>> PhD Candidate
>> Department of Information Systems
>> University of Malaya, Lembah Pantai,
>> 50603 Kuala Lumpur, Malaysia
>>
>> On Thu, Jul 9, 2015 at 4:56 AM, Sujit Pal <sujitatgt...@gmail.com> wrote:
>>
>>> You are welcome, Davies. Just to clarify, I didn't write the post (not
>>> sure if my earlier post gave that impression; apologies if so),
>>> although I agree it's great :-).
>>>
>>> -sujit
>>>
>>> On Wed, Jul 8, 2015 at 10:36 AM, Davies Liu <dav...@databricks.com> wrote:
>>>
>>>> Great post, thanks for sharing with us!
>>>>
>>>> On Wed, Jul 8, 2015 at 9:59 AM, Sujit Pal <sujitatgt...@gmail.com> wrote:
>>>> > Hi Julian,
>>>> >
>>>> > I recently built a Python+Spark application to do search relevance
>>>> > analytics. I use spark-submit to submit PySpark jobs to a Spark
>>>> > cluster on EC2 (so I don't use the PySpark shell; hopefully that's
>>>> > what you are looking for).
>>>> > Can't share the code, but the basic approach is covered in this blog
>>>> > post - scroll down to the section "Writing a Spark Application":
>>>> >
>>>> > https://districtdatalabs.silvrback.com/getting-started-with-spark-in-python
>>>> >
>>>> > Hope this helps,
>>>> >
>>>> > -sujit
>>>> >
>>>> > On Wed, Jul 8, 2015 at 7:46 AM, Julian <julian+sp...@magnetic.com> wrote:
>>>> >>
>>>> >> Hey.
>>>> >>
>>>> >> Is there a resource that has written up what the necessary steps are
>>>> >> for running PySpark without using the PySpark shell?
>>>> >>
>>>> >> I can reverse-engineer (by following the tracebacks and reading the
>>>> >> shell source) what the relevant Java imports are, but I would assume
>>>> >> someone has attempted this before and just published something I can
>>>> >> either follow or install. If not, I have something that pretty much
>>>> >> works and can publish it, but I'm not a heavy Spark user, so there
>>>> >> may be some things I've left out that I haven't hit because of how
>>>> >> little of pyspark I'm playing with.
>>>> >>
>>>> >> Thanks,
>>>> >> Julian
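For anyone finding this thread later: per Sujit's description, running PySpark without the PySpark shell just means a standalone script that creates its own SparkContext and is launched via spark-submit. A minimal sketch - the file name simple_app.py and the input path data.txt are illustrative, not from the thread:

    # simple_app.py - run with: spark-submit --master local[*] simple_app.py
    from pyspark import SparkConf, SparkContext

    conf = SparkConf().setAppName("simple_app")
    sc = SparkContext(conf=conf)

    # Classic word count: read a text file, split into words, count each word
    counts = (sc.textFile("data.txt")
                .flatMap(lambda line: line.split())
                .map(lambda word: (word, 1))
                .reduceByKey(lambda a, b: a + b))

    for word, count in counts.take(10):
        print word, count

    sc.stop()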