Technically you only need to change the build file, and change part of a
line in SparkEnv so you don't have to break your oath :)
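(For anyone curious, a rough sketch of the two pieces being described; the chill
version and the exact SparkEnv wiring below are assumptions from memory, so check
project/SparkBuild.scala and SparkEnv.scala for the real lines.)

// Build-file side (sbt syntax): pull in chill-scala alongside the existing
// chill-java dependency. The version here is a guess for the 1.0-era build.
libraryDependencies += "com.twitter" %% "chill" % "0.3.6"

// SparkEnv side (sketch, not the exact upstream line): the closure serializer
// is instantiated from a config key with a Java-serializer fallback, roughly:
//   instantiateClass[Serializer](
//     "spark.closure.serializer", "org.apache.spark.serializer.JavaSerializer")
// Pointing that fallback (or the key's value) at the Kryo-based serializer is
// the "part of a line" referred to above.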
On Sun, May 4, 2014 at 10:22 PM, Soren Macbeth wrote:
> that would violate my personal oath of never writing a single line of
> scala, but I might be able to do tha
that would violate my personal oath of never writing a single line of
Scala, but I might be able to do that if I can get past the issue I'm
struggling with in this thread.
On Sunday, May 4, 2014, Reynold Xin wrote:
> Thanks. Do you mind playing with chill-scala a little bit and see if
Thanks. Do you mind playing with chill-scala a little bit and see if it
actually works well for closures? One way to try is to hard code the
serializer to use Kryo with chill-scala, and then run through all the unit
tests.
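(If someone does pick this up, a minimal, self-contained sketch of the
experiment; ScalaKryoInstantiator is chill-scala's entry point, and the round
trip below is the kind of thing the closure-heavy unit tests would exercise.
Whether it actually succeeds for real Spark closures is exactly what we'd be
finding out.)

import com.esotericsoftware.kryo.io.{Input, Output}
import com.twitter.chill.ScalaKryoInstantiator

object ChillClosureRoundTrip {
  def main(args: Array[String]): Unit = {
    // chill-scala configures a Kryo instance with serializers for Scala
    // collections, tuples, and other runtime classes.
    val kryo = (new ScalaKryoInstantiator).newKryo()

    val increment: Int => Int = x => x + 1          // a Scala closure

    val out = new Output(4096, -1)
    kryo.writeClassAndObject(out, increment)

    val in = new Input(out.toBytes)
    val restored = kryo.readClassAndObject(in).asInstanceOf[Int => Int]
    println(restored(41))                           // 42 if the round trip worked
  }
}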
If it works well, we can incorporate that in the next release (probably not
fwiw, it seems like it wouldn't be very difficult to integrate chill-scala,
since you're already using chill-java, and you'd probably get Kryo serialization
of closures and all sorts of other Scala stuff for free. All that would be
needed would be to include the dependency and then update KryoSerializer to
regi
Good idea. I submitted a pull request for the doc update here:
https://github.com/apache/spark/pull/642
On Sun, May 4, 2014 at 3:54 PM, Soren Macbeth wrote:
> Thanks for the reply!
>
> Ok, if that's the case, I'd recommend a note to that effect in the docs at
> least.
>
> Just to give some more
Kryo does generate code for serialization, so the CPU overhead is quite a bit
lower than with Java serialization (which I think just uses reflection). As I understand, they
also have a new implementation that uses unsafe intrinsics, which should
lead to even higher performance.
The generated byte[] size was a lot smaller
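(A quick way to see the size difference for yourself; this uses plain Kryo with
an explicit registration, so the numbers are only indicative and will vary with
the payload and settings.)

import java.io.{ByteArrayOutputStream, ObjectOutputStream}
import com.esotericsoftware.kryo.Kryo
import com.esotericsoftware.kryo.io.Output

object SerializedSizeComparison {
  def main(args: Array[String]): Unit = {
    val payload: Array[Int] = (1 to 1000).toArray

    // Java serialization: ObjectOutputStream into an in-memory buffer.
    val bos = new ByteArrayOutputStream()
    val oos = new ObjectOutputStream(bos)
    oos.writeObject(payload)
    oos.close()
    val javaBytes = bos.toByteArray

    // Kryo: register the class explicitly so this works across Kryo versions.
    val kryo = new Kryo()
    kryo.register(classOf[Array[Int]])
    val out = new Output(1024, -1)
    kryo.writeClassAndObject(out, payload)
    val kryoBytes = out.toBytes

    println(s"Java serialized size: ${javaBytes.length} bytes")
    println(s"Kryo serialized size: ${kryoBytes.length} bytes")
  }
}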
Hey AJ,
I haven't tried to run on a cluster yet, only in local mode.
I'll try to get something running on a cluster soon and keep you posted.
Nicolas Garneau
> On May 4, 2014, at 6:23 PM, Ajay Nair wrote:
>
> Now I got it to work .. well almost. However I needed to copy the project/
> folder to t
Thanks for the reply!
Ok, if that's the case, I'd recommend a note to that effect in the docs at
least.
Just to give some more context here, I'm working on a Clojure DSL for Spark
called Flambo, which I plan to open source shortly. If I could I'd like to
focus on the initial bug that I hit.
Exce
Now I got it to work... well almost. However, I needed to copy the project/
folder to the spark-standalone folder as the package build was failing
because it could not find build properties. After the copy, the build was
successful. However, when I run it I get errors, but it still gives me the
output.
On a slightly related note (apologies Soren for hijacking the thread),
Reynold, how much better is Kryo from Spark's usage point of view
compared to the default Java serialization (in general, not for
closures)?
The numbers on the Kryo site are interesting, but since you have played
the most with Kryo
I added the config option to use the non-default serializer. However, at
the time, Kryo failed to serialize pretty much any closures, so that option
was never really used / recommended.
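(For reference, flipping that option from user code looks like the sketch
below; the closure serializer and the data serializer are separate settings,
and whether Kryo then copes with your closures is the open question in this
thread.)

import org.apache.spark.{SparkConf, SparkContext}

object KryoClosureSerializerTest {
  def main(args: Array[String]): Unit = {
    val conf = new SparkConf()
      .setAppName("kryo-closure-serializer-test")
      .setMaster("local[2]")
      .set("spark.serializer", "org.apache.spark.serializer.KryoSerializer")
      .set("spark.closure.serializer", "org.apache.spark.serializer.KryoSerializer")

    val sc = new SparkContext(conf)
    // A trivially small job whose closure must pass through the configured
    // closure serializer.
    val sum = sc.parallelize(1 to 100).map(_ * 2).reduce(_ + _)
    println(s"sum = $sum")
    sc.stop()
  }
}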
Since then the Scala ecosystem has developed, and some other projects are
starting to use Kryo to serialize more Sc
Hey AJ,
If you plan to launch your job on a cluster, consider using the spark-submit
command.
Running this in Spark's home directory gives you help on how to use it:
$ ./bin/spark-submit
I haven't tried it yet, but judging from this post it will be the preferred way
to launch jobs:
http
Thank you. I am trying this now
apologies for the cross-list posts, but I've gotten zero response in the
user list and I guess this list is probably more appropriate.
According to the documentation, using the KryoSerializer for closures is
supported. However, when I try to set `spark.closure.serializer` to
`org.apache.spark.seri
Thanks DB. I will work with mapPartition for now.
Question to the community in general: should we consider adding such an
operation to RDDs especially as a developer API?
On Sun, May 4, 2014 at 1:41 AM, DB Tsai wrote:
> You could easily achieve this by mapPartition. However, it seems that it
On May 4, 2014, at 06:30, Matei Zaharia wrote:
> Hi Nicolas,
>
> Good catches on these things.
>
>> Your website seems a little bit incomplete. I have found this page [1] which
>> lists the two main mailing lists, users and dev. But I see a reference to a
>> mailing list about "issues" which t
You could easily achieve this with mapPartition. However, it seems that it
cannot be done using an aggregate-type operation. I can see that it's a
generally useful operation. For now, you could use mapPartition.
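(A sketch of the mapPartition approach for the original question's fold /
transform / combine pattern; seqOp, transform, and combOp stand in for the
user's own functions.)

import scala.reflect.ClassTag
import org.apache.spark.rdd.RDD

object PartitionwiseAggregate {
  // Fold each partition with seqOp, apply transform to every per-partition
  // result, then combine the transformed results with combOp. Empty
  // partitions contribute transform(zeroValue).
  def aggregateWithTransform[T, U: ClassTag](rdd: RDD[T], zeroValue: U)(
      seqOp: (U, T) => U,
      transform: U => U,
      combOp: (U, U) => U): U =
    rdd
      .mapPartitions(iter => Iterator(transform(iter.foldLeft(zeroValue)(seqOp))))
      .reduce(combOp)
}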
Sincerely,
DB Tsai
---
My Blog:
I am currently using the RDD aggregate operation to reduce (fold) per
partition and then combine the per-partition results.
def aggregate[U: ClassTag](zeroValue: U)(seqOp: (U, T) => U, combOp: (U, U)
=> U): U
I need to perform a transform operation after the seqOp and before the
combOp. Th