Hi Sujit,
Nice post. Exactly what I had been looking for.
I am relatively new to Spark and real-time data processing.
We have a CDH 5.4 cluster with 4 nodes; the Spark version on the cluster is
1.3.0.
On my laptop I also have Spark 1.3.0, running on Windows 7. As per point 5
of your post, I am able to invoke pyspark locally in standalone mode.

Following your post, I run into the following problem:

1. In the section "Using Ipython notebook with spark", I cannot understand
why it picks up the default profile rather than the pyspark profile. I am
sure it is because of the path variables; a screenshot is attached. Can you
suggest how to solve this? My rough understanding of the profile setup is
sketched below.
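
For reference, this is roughly how I understood the profile setup from your
post (the profile name, the startup file name, and the py4j version below
are my assumptions, not something I have verified):

# Create the profile once with:  ipython profile create pyspark
# Then place this in the profile's startup folder as 00-pyspark-setup.py
import os
import sys

spark_home = os.environ.get("SPARK_HOME")
if not spark_home:
    raise ValueError("SPARK_HOME is not set")

# Make PySpark and its bundled Py4J importable
sys.path.insert(0, os.path.join(spark_home, "python"))
sys.path.insert(0, os.path.join(spark_home, "python", "lib",
                                "py4j-0.8.2.1-src.zip"))  # zip name may differ by version

# Run the PySpark shell bootstrap so that `sc` is defined in the notebook
execfile(os.path.join(spark_home, "python", "pyspark", "shell.py"))

The notebook would then be started with something like
"ipython notebook --profile=pyspark", which is the step where it seems to
fall back to the default profile on my machine.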

Currently, the path variables on my laptop are:
SPARK_HOME="C:\SPARK-1.3.0\BIN"
JAVA_HOME="C:\PROGRAM FILES\JAVA\JDK1.7.0_79"
HADOOP_HOME="D:\WINUTILS"
M2_HOME="D:\MAVEN\BIN"
MAVEN_HOME="D:\MAVEN\BIN"
PYTHON_HOME="C:\PYTHON27\"
SBT_HOME="C:\SBT\"
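
A quick way to sanity-check these from Python would be something like the
following (just a rough check; nothing here beyond the variable names
listed above):

import os

for var in ["SPARK_HOME", "JAVA_HOME", "HADOOP_HOME", "PYTHON_HOME"]:
    value = os.environ.get(var)
    # report whether the variable is set and points to an existing directory
    print var, "->", value, ("OK" if value and os.path.isdir(value)
                             else "missing or not a directory")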


Sincerely,
Ashish Dutt
PhD Candidate
Department of Information Systems
University of Malaya, Lembah Pantai,
50603 Kuala Lumpur, Malaysia

On Thu, Jul 9, 2015 at 4:56 AM, Sujit Pal <sujitatgt...@gmail.com> wrote:

> You are welcome Davies. Just to clarify, I didn't write the post (not sure
> if my earlier post gave that impression, apologies if so), although I agree
> it's great :-).
>
> -sujit
>
>
> On Wed, Jul 8, 2015 at 10:36 AM, Davies Liu <dav...@databricks.com> wrote:
>
>> Great post, thanks for sharing with us!
>>
>> On Wed, Jul 8, 2015 at 9:59 AM, Sujit Pal <sujitatgt...@gmail.com> wrote:
>> > Hi Julian,
>> >
>> > I recently built a Python+Spark application to do search relevance
>> > analytics. I use spark-submit to submit PySpark jobs to a Spark cluster on
>> > EC2 (so I don't use the PySpark shell, hopefully that's what you are
>> > looking for). Can't share the code, but the basic approach is covered in
>> > this blog post - scroll down to the section "Writing a Spark Application".
>> >
>> > https://districtdatalabs.silvrback.com/getting-started-with-spark-in-python
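>> >
>> > Very roughly, such an application looks like this (just a sketch; the
>> > app name, input path and master URL below are placeholders, not from my
>> > actual code):
>> >
>> > from pyspark import SparkConf, SparkContext
>> >
>> > if __name__ == "__main__":
>> >     conf = SparkConf().setAppName("relevance-analytics")
>> >     sc = SparkContext(conf=conf)
>> >     # toy computation standing in for the real analytics
>> >     print sc.textFile("hdfs:///path/to/input.txt").count()
>> >     sc.stop()
>> >
>> > # submitted to the cluster with something like:
>> > # spark-submit --master spark://<ec2-master>:7077 my_job.py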
>> >
>> > Hope this helps,
>> >
>> > -sujit
>> >
>> >
>> > On Wed, Jul 8, 2015 at 7:46 AM, Julian <julian+sp...@magnetic.com> wrote:
>> >>
>> >> Hey.
>> >>
>> >> Is there a resource that has written up what the necessary steps are for
>> >> running PySpark without using the PySpark shell?
>> >>
>> >> I can reverse engineer (by following the tracebacks and reading the shell
>> >> source) what the relevant Java imports needed are, but I would assume
>> >> someone has attempted this before and just published something I can
>> >> either follow or install? If not, I have something that pretty much works
>> >> and can publish it, but I'm not a heavy Spark user, so there may be some
>> >> things I've left out that I haven't hit because of how little of pyspark
>> >> I'm playing with.
>> >>
>> >> Thanks,
>> >> Julian
>> >>
>> >>
>> >>
>> >
>>
>
>
---------------------------------------------------------------------
To unsubscribe, e-mail: user-unsubscr...@spark.apache.org
For additional commands, e-mail: user-h...@spark.apache.org
