Hi Sujit,

Thanks for your response. So I opened a new notebook using the command "ipython notebook --profile spark" and tried the sequence of commands, but I am getting errors. Attached is a screenshot of the same. I am also attaching my 00-pyspark-setup.py for your reference - it looks like I have written something wrong in it, but I cannot seem to figure out what. (A reference sketch of that setup script follows below.)
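For comparison, a startup script along the lines of the District Data Labs post would look roughly like this - a sketch, not the exact attached file; the py4j zip name varies by Spark release, so check the python\lib folder under SPARK_HOME:

    import os
    import sys

    # Pick up the Spark installation from the environment
    spark_home = os.environ.get('SPARK_HOME')
    if not spark_home:
        raise ValueError("SPARK_HOME environment variable is not set")

    # Put the PySpark libraries on the Python path; the py4j zip name
    # below matches Spark 1.3 - adjust it to whatever your install ships
    sys.path.insert(0, os.path.join(spark_home, 'python'))
    sys.path.insert(0, os.path.join(spark_home, 'python', 'lib', 'py4j-0.8.2.1-src.zip'))

    # Run the PySpark shell bootstrap, which creates the sc SparkContext
    execfile(os.path.join(spark_home, 'python', 'pyspark', 'shell.py'))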
Thank you for your help.

Sincerely,
Ashish Dutt

On Thu, Jul 9, 2015 at 11:53 AM, Sujit Pal <sujitatgt...@gmail.com> wrote:

> Hi Ashish,
>
> >> Nice post.
> Agreed, kudos to the author of the post, Benjamin Bengfort of District
> Data Labs.
>
> >> Following your post, I get this problem;
> Again, not my post.
>
> I did try setting up IPython with the Spark profile for the edX Intro to
> Spark course (because I didn't want to use the Vagrant container) and it
> worked flawlessly with the instructions provided (on OSX). I haven't used
> the IPython/PySpark environment beyond very basic tasks since then,
> though, because my employer has a Databricks license which we were
> already using for other stuff, and we ended up doing the labs on
> Databricks.
>
> Looking at your screenshot, though, I don't see why you think it's
> picking up the default profile. One simple way to check whether things
> are working is to open a new notebook and try this sequence of commands:
>
> from pyspark import SparkContext
> sc = SparkContext("local", "pyspark")
> sc
>
> You should see something like this after a little while:
>
> <pyspark.context.SparkContext at 0x1093c9b10>
>
> While the context is being instantiated, you should also see lots of log
> lines scroll by on the terminal where you started the "ipython notebook
> --profile spark" command - these log lines are from Spark.
>
> Hope this helps,
> Sujit
>
> On Wed, Jul 8, 2015 at 6:04 PM, Ashish Dutt <ashish.du...@gmail.com> wrote:
>
>> Hi Sujit,
>> Nice post. Exactly what I had been looking for.
>> I am relatively a beginner with Spark and real-time data processing. We
>> have a server with CDH5.4 and 4 nodes; the Spark version on our server
>> is 1.3.0.
>> On my laptop I also have Spark 1.3.0, in a Windows 7 environment. As per
>> point 5 of your post, I am able to invoke pyspark locally in standalone
>> mode.
>>
>> Following your post, I get this problem:
>>
>> 1. In the section "Using IPython Notebook with Spark" I cannot
>> understand why it is picking up the default profile and not the pyspark
>> profile. I am sure it is because of the path variables. Attached is the
>> screenshot. Can you suggest how to solve this?
>>
>> Currently the path variables on my laptop are:
>> SPARK_HOME="C:\SPARK-1.3.0\BIN", JAVA_HOME="C:\PROGRAM
>> FILES\JAVA\JDK1.7.0_79", HADOOP_HOME="D:\WINUTILS", M2_HOME="D:\MAVEN\BIN",
>> MAVEN_HOME="D:\MAVEN\BIN", PYTHON_HOME="C:\PYTHON27\", SBT_HOME="C:\SBT\"
>>
>> Sincerely,
>> Ashish Dutt
>> PhD Candidate
>> Department of Information Systems
>> University of Malaya, Lembah Pantai,
>> 50603 Kuala Lumpur, Malaysia
>>
>> On Thu, Jul 9, 2015 at 4:56 AM, Sujit Pal <sujitatgt...@gmail.com> wrote:
>>
>>> You are welcome, Davies. Just to clarify, I didn't write the post (not
>>> sure if my earlier post gave that impression; apologies if so),
>>> although I agree it's great :-).
>>>
>>> -sujit
>>>
>>> On Wed, Jul 8, 2015 at 10:36 AM, Davies Liu <dav...@databricks.com> wrote:
>>>
>>>> Great post, thanks for sharing with us!
>>>>
>>>> On Wed, Jul 8, 2015 at 9:59 AM, Sujit Pal <sujitatgt...@gmail.com> wrote:
>>>> > Hi Julian,
>>>> >
>>>> > I recently built a Python+Spark application to do search relevance
>>>> > analytics. I use spark-submit to submit PySpark jobs to a Spark
>>>> > cluster on EC2 (so I don't use the PySpark shell; hopefully that's
>>>> > what you are looking for).
>>>> > Can't share the code, but the basic approach is covered in this blog
>>>> > post - scroll down to the section "Writing a Spark Application":
>>>> >
>>>> > https://districtdatalabs.silvrback.com/getting-started-with-spark-in-python
>>>> >
>>>> > Hope this helps,
>>>> >
>>>> > -sujit
>>>> >
>>>> > On Wed, Jul 8, 2015 at 7:46 AM, Julian <julian+sp...@magnetic.com> wrote:
>>>> >>
>>>> >> Hey.
>>>> >>
>>>> >> Is there a resource that has written up what the necessary steps are
>>>> >> for running PySpark without using the PySpark shell?
>>>> >>
>>>> >> I can reverse-engineer (by following the tracebacks and reading the
>>>> >> shell source) what the relevant Java imports are, but I would assume
>>>> >> someone has attempted this before and just published something I can
>>>> >> either follow or install. If not, I have something that pretty much
>>>> >> works and can publish it, but I'm not a heavy Spark user, so there
>>>> >> may be some things I've left out that I haven't hit because of how
>>>> >> little of pyspark I'm playing with.
>>>> >>
>>>> >> Thanks,
>>>> >> Julian
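For anyone finding this thread later: per Sujit's description, running PySpark without the PySpark shell just means a standalone script that creates its own SparkContext and is launched via spark-submit. A minimal sketch - the file name simple_app.py and the input path data.txt are illustrative, not from the thread:

    # simple_app.py - run with: spark-submit --master local[*] simple_app.py
    from pyspark import SparkConf, SparkContext

    conf = SparkConf().setAppName("simple_app")
    sc = SparkContext(conf=conf)

    # Classic word count: read a text file, split into words, count each word
    counts = (sc.textFile("data.txt")
                .flatMap(lambda line: line.split())
                .map(lambda word: (word, 1))
                .reduceByKey(lambda a, b: a + b))

    for word, count in counts.take(10):
        print word, count

    sc.stop()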