Hi Ashish,
Cool, glad it worked out. I have only used Spark clusters on EC2, which I
spin up using the spark-ec2 scripts (part of the Spark downloads), so I don't
have any experience setting up in-house clusters like you want to do. But I
found some documentation here that may be helpful.
https://doc
Hi Ashish,
Julian's approach is probably better, but a few observations:
1) Your SPARK_HOME should be C:\spark-1.3.0 (not C:\spark-1.3.0\bin).
2) If you have Anaconda Python installed (I saw that you had set this up in
a separate thread), py4j should be part of the package - at least I think
so. To
Hi Ashish,
Your 00-pyspark-setup file looks very different from mine (and from the one
described in the blog post). Questions:
1) Do you have SPARK_HOME set up in your environment? Because if not, it
sets it to None in your code. You should provide the path to your Spark
installation. In my case
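
The SPARK_HOME points raised in this thread (the variable being unset and coming back as None, or pointing at the bin directory instead of the installation root) can be checked with a small helper before touching PySpark. This is only a sketch under assumptions: resolve_spark_home and pyspark_path are hypothetical names, not part of Spark or of the setup file from the blog post, and the C:\spark-1.3.0 path is the one mentioned in the thread.

```python
import os
import ntpath  # Windows path rules, so the sketch behaves the same on any OS


def resolve_spark_home(environ=None):
    """Return the Spark installation root, or raise ValueError (hypothetical helper)."""
    environ = os.environ if environ is None else environ
    spark_home = environ.get("SPARK_HOME")
    if not spark_home:
        # .get() returns None when the variable is unset, which is how a
        # setup script can silently end up with SPARK_HOME = None
        raise ValueError("SPARK_HOME is not set; point it at your Spark installation")
    spark_home = spark_home.rstrip("\\/")
    # Pointing at the bin directory is an easy slip; step back up to the root
    if ntpath.basename(spark_home).lower() == "bin":
        spark_home = ntpath.dirname(spark_home)
    return spark_home


def pyspark_path(spark_home):
    """Directory inside a Spark download that holds the pyspark package."""
    return ntpath.join(spark_home, "python")
```

For example, resolve_spark_home({"SPARK_HOME": "C:\\spark-1.3.0\\bin"}) gives back the installation root, which can then be passed to pyspark_path and added to sys.path in a startup script.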
Hi Sujit,
Thanks for your response.
So I opened a new notebook using the command ipython notebook --profile
spark and tried the sequence of commands. I am getting errors. Attached is
a screenshot of them.
Also, I am attaching the 00-pyspark-setup.py for your reference. Looks
like, I have wri
Very interesting and well organized post. Thanks for sharing
On Wed, Jul 8, 2015 at 10:29 PM, Sujit Pal wrote:
> Hi Julian,
>
> I recently built a Python+Spark application to do search relevance
> analytics. I use spark-submit to submit PySpark jobs to a Spark cluster on
> EC2 (so I don't use th
Hi Ashish,
>> Nice post.
Agreed, kudos to the author of the post, Benjamin Bengfort of District Data Labs.
>> Following your post, I get this problem;
Again, not my post.
I did try setting up IPython with the Spark profile for the edX Intro to
Spark course (because I didn't want to use the Vagrant con
Hi Sujit,
Nice post. Exactly what I had been looking for.
I am a relative beginner with Spark and real-time data processing.
We have a server with CDH5.4 with 4 nodes. The Spark version on our server
is 1.3.0.
On my laptop I also have Spark 1.3.0, running on Windows 7. As
per point 5
You are welcome, Davies. Just to clarify, I didn't write the post (not sure
if my earlier post gave that impression, apologies if so), although I agree
it's great :-).
-sujit
On Wed, Jul 8, 2015 at 10:36 AM, Davies Liu wrote:
> Great post, thanks for sharing with us!
>
> On Wed, Jul 8, 2015 at 9
Great post, thanks for sharing with us!
On Wed, Jul 8, 2015 at 9:59 AM, Sujit Pal wrote:
> Hi Julian,
>
> I recently built a Python+Spark application to do search relevance
> analytics. I use spark-submit to submit PySpark jobs to a Spark cluster on
> EC2 (so I don't use the PySpark shell, hopefu
Hi Julian,
I recently built a Python+Spark application to do search relevance
analytics. I use spark-submit to submit PySpark jobs to a Spark cluster on
EC2 (so I don't use the PySpark shell, hopefully that's what you are looking
for). Can't share the code, but the basic approach is covered in this