Congrats to the Spark community!
On Friday, July 11, 2014, Patrick Wendell wrote:
I am happy to announce the availability of Spark 1.0.1! This release
includes contributions from 70 developers. Spark 1.0.1 includes fixes
across several areas of Spark, including the core API, PySpark, and
MLlib. It also includes new features in Spark's (alpha) SQL library,
including support for J…
Hi Jai,
Your suspicion is correct. In general, Python RDDs are pickled into byte
arrays and stored in Java land as RDDs of byte arrays. union/zip operate
on the byte arrays directly without deserializing. Currently, Python byte
arrays only get unpickled into Java objects in special cases, like SQL
functions…
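A minimal pyspark sketch of what that means in practice (the local context and values are illustrative, not from this thread):

    # Each partition of a Python RDD is pickled and stored in the JVM as an
    # RDD of byte arrays; union() concatenates those byte-array RDDs without
    # unpickling anything.
    from pyspark import SparkContext

    sc = SparkContext("local", "pickle-demo")
    a = sc.parallelize([1, 2, 3])
    b = sc.parallelize([4, 5, 6])

    c = a.union(b)      # operates on the serialized byte arrays directly
    print(c.collect())  # unpickled back in Python: [1, 2, 3, 4, 5, 6]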
Also take a look at this:
https://cwiki.apache.org/confluence/display/SPARK/PySpark+Internals
On Fri, Jul 11, 2014 at 10:29 AM, Andrew Or wrote:
Hi Egor,
Here are a few answers to your questions:
1) Python needs to be installed on all machines, but not pyspark. The way
the executors get the pyspark code depends on which cluster manager you
use. In standalone mode, your executors need to have the actual Python
files in their working directory…
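A hedged sketch of shipping Python code to executors explicitly instead (the master URL and module are hypothetical):

    # Distribute a Python module to executors at runtime rather than relying
    # on it already sitting in each worker's working directory.
    from pyspark import SparkContext

    sc = SparkContext("spark://master:7077", "ship-code-demo")
    sc.addPyFile("mylib.py")  # hypothetical module; copied to every executor

    import mylib  # importable in tasks once shipped
    print(sc.parallelize(range(4)).map(mylib.transform).collect())  # transform is hypothetical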
This vote has passed with 9 +1 votes (5 binding) and 1 -1 vote (0 binding).
+1:
Patrick Wendell*
Mark Hamstra*
DB Tsai
Krishna Sankar
Soren Macbeth
Andrew Or
Matei Zaharia*
Xiangrui Meng*
Tom Graves*
0:
-1:
Gary Malouf
Okay just FYI - I'm closing this vote since many people are waiting on
the release and I was hoping to package it today. If we find a
reproducible Mesos issue here, we can definitely spin the fix into a
subsequent release.
On Fri, Jul 11, 2014 at 9:37 AM, Patrick Wendell wrote:
Hey Gary,
Why do you think the akka frame size changed? It didn't change - we
added some fixes for cases where users were setting non-default
values.
On Fri, Jul 11, 2014 at 9:31 AM, Gary Malouf wrote:
Hi Matei,
We have not had time to re-deploy the RC today, but one thing that jumps
out is the shrinking of the default akka frame size from 10MB to around
128KB. That is my first suspicion for our issue - I could imagine it
biting others as well.
I'll try to re-test that today - either…
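For anyone following along, a minimal sketch of pinning the frame size explicitly rather than relying on the default (a Spark 1.x setting; the value is in MB):

    # Set spark.akka.frameSize up front; in Spark 1.x the value is in megabytes.
    from pyspark import SparkConf, SparkContext

    conf = SparkConf().set("spark.akka.frameSize", "10")  # 10 MB
    sc = SparkContext(conf=conf)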
Unless you can diagnose the problem quickly, Gary, I think we need to go ahead
with this release as is. This release didn't touch the Mesos support as far as
I know, so the problem might be a nondeterministic issue with your application.
But on the other hand, the release does fix some critical bugs…
Hi,
I want to write some common utility functions in Scala and call
them from the Java/Python Spark APIs (maybe adding some wrapper code around
the Scala calls). Calling Scala functions from Java works fine. I was reading
the pyspark RDD code and found that pyspark is able to call JavaRDD functions…
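A hedged sketch of the same mechanism pyspark itself uses - the Py4J gateway. Note that _jvm is an internal handle, and com.example.Utils.fmt is a hypothetical Scala object/method whose jar would need to be on the driver classpath:

    # Call a JVM-side (Scala) utility from Python through pyspark's Py4J gateway.
    from pyspark import SparkContext

    sc = SparkContext("local", "jvm-call-demo")
    utils = sc._jvm.com.example.Utils  # sc._jvm is pyspark's gateway to the JVM (internal API)
    print(utils.fmt("hello"))          # invokes the Scala method and returns the result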
Great. Then one question remains:
what would you recommend for implementation?
2014-07-11 17:43 GMT+04:00 Chester At Work:
Sung Chung from Alpine Data Labs presented the random forest implementation at
Spark Summit 2014. The work will be open-sourced and contributed back to MLlib.
Stay tuned.
On Jul 11, 2014, at 6:02 AM, Egor Pahomov wrote:
Hi, I have an intern who wants to implement some ML algorithm for Spark.
Which algorithm would be a good idea to implement (it should not be very
difficult)? I heard someone is already working on random forest, but couldn't
find proof of that.
I'm aware of the new policy, where we should implement stable, g…
Hi, I want to use pySpark, but can't understand how it works. The documentation
doesn't provide enough information.
1) How is Python shipped to the cluster? Should machines in the cluster already
have Python?
2) What happens when I write some Python code in a "map" function - is it
shipped to the cluster and just executed…
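On question 2, a minimal sketch of the behavior (setup illustrative): the function passed to map, plus anything it captures, is pickled on the driver and run on the executors.

    # The lambda and the variables it closes over are serialized on the driver,
    # shipped to the executors, and executed there against each partition.
    from pyspark import SparkContext

    sc = SparkContext("local", "closure-demo")
    offset = 100  # captured by the closure and serialized with it
    print(sc.parallelize([1, 2, 3]).map(lambda x: x + offset).collect())
    # -> [101, 102, 103]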