Hi Ashish,

>> Nice post.

Agreed, kudos to the author of the post, Benjamin Bengfort of District Data Labs.
>> Following your post, I get this problem;

Again, not my post. I did try setting up IPython with the Spark profile for
the edX Intro to Spark course (because I didn't want to use the Vagrant
container) and it worked flawlessly with the instructions provided (on OS X).
I haven't used the IPython/PySpark environment beyond very basic tasks since
then, though, because my employer has a Databricks license which we were
already using for other stuff, so we ended up doing the labs on Databricks.

Looking at your screenshot, though, I don't see why you think it's picking up
the default profile. One simple way of checking whether things are working is
to open a new notebook and try this sequence of commands:

from pyspark import SparkContext
sc = SparkContext("local", "pyspark")
sc

You should see something like this after a little while:

<pyspark.context.SparkContext at 0x1093c9b10>

While the context is being instantiated, you should also see lots of log
lines scroll by on the terminal where you started the
"ipython notebook --profile spark" command - these log lines are from Spark.
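If the profile itself turns out to be the problem, it may also be worth
checking the startup file the post has you create under the spark profile
(something like 00-pyspark-setup.py in the profile's startup directory).
Roughly, it should do something like the following - the paths and the py4j
version here are just placeholders, so adjust them to your own install (and
note that SPARK_HOME should normally point at the root of the Spark
directory, e.g. C:\spark-1.3.0, rather than its bin subdirectory):

    import os
    import sys

    # Locate the Spark installation from the environment
    spark_home = os.environ.get("SPARK_HOME")
    if not spark_home:
        raise ValueError("SPARK_HOME environment variable is not set")

    # Put PySpark and its bundled py4j on the Python path
    # (the py4j zip name depends on your Spark version)
    sys.path.insert(0, os.path.join(spark_home, "python"))
    sys.path.insert(0, os.path.join(spark_home, "python", "lib",
                                    "py4j-0.8.2.1-src.zip"))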
Hope this helps,
Sujit

On Wed, Jul 8, 2015 at 6:04 PM, Ashish Dutt <ashish.du...@gmail.com> wrote:
> Hi Sujit,
> Nice post.. Exactly what I had been looking for.
> I am relatively new to Spark and real-time data processing.
> We have a server with CDH 5.4 and 4 nodes. The Spark version on our server
> is 1.3.0.
> On my laptop I have Spark 1.3.0 too, and it's running in a Windows 7
> environment.
> As per point 5 of your post, I am able to invoke pyspark locally in
> standalone mode.
>
> Following your post, I get this problem:
>
> 1. In the section "Using IPython notebook with Spark" I cannot understand
> why it is picking up the default profile and not the pyspark profile. I am
> sure it is because of the path variables. Attached is the screenshot. Can
> you suggest how to solve this?
>
> Currently the path variables on my laptop are:
> SPARK_HOME="C:\SPARK-1.3.0\BIN", JAVA_HOME="C:\PROGRAM
> FILES\JAVA\JDK1.7.0_79", HADOOP_HOME="D:\WINUTILS", M2_HOME="D:\MAVEN\BIN",
> MAVEN_HOME="D:\MAVEN\BIN", PYTHON_HOME="C:\PYTHON27\", SBT_HOME="C:\SBT\"
>
> Sincerely,
> Ashish Dutt
> PhD Candidate
> Department of Information Systems
> University of Malaya, Lembah Pantai,
> 50603 Kuala Lumpur, Malaysia
>
> On Thu, Jul 9, 2015 at 4:56 AM, Sujit Pal <sujitatgt...@gmail.com> wrote:
>
>> You are welcome, Davies. Just to clarify, I didn't write the post (not
>> sure if my earlier post gave that impression, apologize if so), although
>> I agree it's great :-).
>>
>> -sujit
>>
>> On Wed, Jul 8, 2015 at 10:36 AM, Davies Liu <dav...@databricks.com>
>> wrote:
>>
>>> Great post, thanks for sharing with us!
>>>
>>> On Wed, Jul 8, 2015 at 9:59 AM, Sujit Pal <sujitatgt...@gmail.com>
>>> wrote:
>>> > Hi Julian,
>>> >
>>> > I recently built a Python+Spark application to do search relevance
>>> > analytics. I use spark-submit to submit PySpark jobs to a Spark
>>> > cluster on EC2 (so I don't use the PySpark shell, hopefully that's
>>> > what you are looking for). Can't share the code, but the basic
>>> > approach is covered in this blog post - scroll down to the section
>>> > "Writing a Spark Application".
>>> >
>>> > https://districtdatalabs.silvrback.com/getting-started-with-spark-in-python
>>> >
>>> > Hope this helps,
>>> >
>>> > -sujit
>>> >
>>> > On Wed, Jul 8, 2015 at 7:46 AM, Julian <julian+sp...@magnetic.com>
>>> > wrote:
>>> >>
>>> >> Hey.
>>> >>
>>> >> Is there a resource that has written up what the necessary steps are
>>> >> for running PySpark without using the PySpark shell?
>>> >>
>>> >> I can reverse engineer (by following the tracebacks and reading the
>>> >> shell source) what the relevant Java imports needed are, but I would
>>> >> assume someone has attempted this before and just published something
>>> >> I can either follow or install? If not, I have something that pretty
>>> >> much works and can publish it, but I'm not a heavy Spark user, so
>>> >> there may be some things I've left out that I haven't hit because of
>>> >> how little of pyspark I'm playing with.
>>> >>
>>> >> Thanks,
>>> >> Julian
>>> >>
>>> >> --
>>> >> View this message in context:
>>> >> http://apache-spark-user-list.1001560.n3.nabble.com/PySpark-without-PySpark-tp23719.html
>>> >> Sent from the Apache Spark User List mailing list archive at
>>> >> Nabble.com.
>>> >>
>>> >> ---------------------------------------------------------------------
>>> >> To unsubscribe, e-mail: user-unsubscr...@spark.apache.org
>>> >> For additional commands, e-mail: user-h...@spark.apache.org
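A minimal sketch of the standalone, spark-submit-driven approach Sujit
describes in his reply to Julian above - this is illustrative only (the file
name, app name, input path, and master URL are placeholders, not from the
thread), but it shows a PySpark job running without the PySpark shell:

    # wordcount_app.py - illustrative sketch, not the application from the thread
    from pyspark import SparkConf, SparkContext

    def run(sc):
        # Trivial job: word counts over a text file
        counts = (sc.textFile("hdfs:///path/to/input.txt")
                    .flatMap(lambda line: line.split())
                    .map(lambda word: (word, 1))
                    .reduceByKey(lambda a, b: a + b))
        for word, count in counts.take(10):
            print("%s\t%d" % (word, count))

    if __name__ == "__main__":
        # Building the SparkContext yourself replaces what the shell normally does
        conf = SparkConf().setAppName("WordCountSketch")
        sc = SparkContext(conf=conf)
        run(sc)
        sc.stop()

Submitted to a cluster with something like:

    $SPARK_HOME/bin/spark-submit --master spark://<master-host>:7077 wordcount_app.py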