Long story short, regarding the performance issue: it appeared with a
recompiled version of the source TGZ downloaded from the Spark website.
The problem disappears with 1.6.2-SNAPSHOT (branch-1.6).
Guillaume
Do you have code which can reproduce this performance drop in
treeReduce? It would be helpful t
eByKey_, let's
say partitions are numbered like this:
(0,p0),(1,p1),(2,p2),(3,p3)
Then after the modulo, (0,p0),(1,p1),(0,p2),(1,p3)
As a consequence, W1 will shuffle p2 to W0 and W0 will shuffle p1 to W1.
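
To make the example above concrete, here is a small pure-Python simulation (not Spark code). It assumes partitions are laid out in contiguous blocks (W0 holds p0 and p1, W1 holds p2 and p3) and that key k is reduced on worker k; both are assumptions for illustration, and the function name is hypothetical.

```python
def modulo_shuffle_moves(num_partitions, num_workers):
    """List the (partition, source worker, target worker) moves caused by
    re-keying partition i with i % num_workers."""
    # Assumption: contiguous layout, e.g. 4 partitions on 2 workers means
    # W0 holds p0,p1 and W1 holds p2,p3.
    per_worker = num_partitions // num_workers
    moves = []
    for i in range(num_partitions):
        source = i // per_worker   # worker currently holding partition i
        target = i % num_workers   # assumption: key k is reduced on worker k
        if source != target:
            moves.append((f"p{i}", source, target))
    return moves


moves = modulo_shuffle_moves(4, 2)
print(moves)  # [('p1', 0, 1), ('p2', 1, 0)]
```

With 4 partitions and 2 workers, half of the partitions cross the network, which matches the p1 and p2 moves described above.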
Guillaume
On Wed, Apr 27, 2016 at 4:46 AM, Guillaume Pitel
mailto:guillaume.pi
seems suboptimal to me. There is a huge shuffle cost,
while a simple coalesce followed by a partition-level aggregation would
probably do the job perfectly.
Have I missed something that requires this reshuffle?
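
As a rough illustration of the alternative I have in mind, here is a pure-Python sketch (not Spark code, and not the actual treeReduce implementation). The layout dictionary, the use of `sum` as the aggregation, and the function name are all assumptions made for the example.

```python
def coalesce_then_aggregate(partitions_by_worker):
    """Merge the partitions already sitting on each worker, then aggregate
    them locally, so that no partition has to move before the final combine."""
    partials = {}
    for worker, partitions in partitions_by_worker.items():
        # "coalesce": concatenate the partitions local to this worker
        merged = [x for part in partitions for x in part]
        # partition-level aggregation: one partial result per worker
        partials[worker] = sum(merged)
    return partials


# Hypothetical layout: worker id -> list of partitions (lists of numbers)
layout = {0: [[1, 2], [3]], 1: [[4], [5, 6]]}
partials = coalesce_then_aggregate(layout)
total = sum(partials.values())
print(partials, total)  # {0: 6, 1: 15} 21
```

Only the per-worker partial results need to travel for the final combine, instead of whole partitions.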
Best regards
Guillaume Pitel
his for map joins in Hive on Spark
and Spark SQL.
-Sandy
On Sat, Mar 21, 2015 at 3:11 AM, Guillaume Pitel
wrote:
Hi,
Thanks for your answer. This is precisely the use case I'm interested in,
but I already knew about it; I should have mentioned it. Unfortunately this
implementation of BlockM
/ml-matrix
Another lib:
https://github.com/PasaLab/marlin/blob/master/README.md
On Sat, Mar 21, 2015 at 12:24 AM, Guillaume Pitel
wrote:
Hi,
I have an idea that I would like to discuss with the Spark devs. The
idea comes from a very real problem that I have stru
Guillaume
--
eXenSa
*Guillaume PITEL, Président*
+33(0)626 222 431
eXenSa S.A.S. <http://www.exensa.com/>
41, rue Périer - 92120 Montrouge - FRANCE
Tel +33(0)184 163 677 / Fax +33(0)972 283 705
and nodes?
Matei
On Oct 7, 2014, at 11:42 AM, Davies Liu wrote:
Could you create a JIRA for it? Maybe it's a regression after
https://issues.apache.org/jira/browse/SPARK-3119.
We would appreciate it if you could tell us how to reproduce it.
On Mon, Oct 6, 2014 at 1:27 AM, Guillaume Pitel
ation error from our side, but are unable to pin it down. Does someone
have any idea of the origin of the problem?
For now we're sticking with the HttpBroadcast workaround.
Guillaume