reduceLocality.enabled is a Spark configuration, not a
>> Spark SQL one.
>>
>>
>>
>> From: Todd [mailto:bit1...@163.com]
>> Sent: Friday, September 11, 2015 3:39 PM
>> To: Todd
>> Cc: Cheng, Hao; Jesse F Chen; Michael Armbrust; user@spark.apache.org
>> S
> Davies Liu ---09/11/2015 10:41:23 AM---On Fri, Sep 11, 2015 at 10:31 AM,
> Jesse F Chen wrote: >
>
> From: Davies Liu
> To: Jesse F Chen/San Francisco/IBM@IBMUS
> Cc: "Cheng, Hao" , Todd , Michael
> Armbrust , "user@spark.apache.org"
>
ancisco/IBM@IBMUS
Cc: "Cheng, Hao" , Todd ,
Michael Armbrust ,
"user@spark.apache.org"
Date: 09/11/2015 10:41 AM
Subject: Re: Re:Re:RE: Re:RE: spark 1.5 SQL slows down dramatically by
50%+ compared with spark 1.4.1 SQL
On Fri, Se
n/San Francisco/IBM@IBMUS, Michael Armbrust
> , "user@spark.apache.org"
> Date: 09/11/2015 01:00 AM
> Subject: RE: Re:Re:RE: Re:RE: spark 1.5 SQL slows down dramatically by 50%+
> compared with spark 1.4.1 SQL
>
> From: Todd [mailto:bit1...@163.com]
> Sent: Friday, September 11, 2015 3:39 PM
> To: Todd
> Cc: Cheng, Hao; Jesse F Chen; Michael Armbrust; user@spark.apache.org
> Subject: Re:Re:RE: Re:RE: spark 1.5 SQL slows down dramatically by 50%+
> compared with spark 1.4.1 SQL
>
>
l Armbrust
, "user@spark.apache.org"
Date: 09/11/2015 01:00 AM
Subject:RE: Re:Re:RE: Re:RE: spark 1.5 SQL slows down dramatically by
50%+ compared with spark 1.4.1 SQL
Can you confirm whether the query really runs in cluster mode? Not the local
1.5 SQL slows down dramatically by 50%+
compared with spark 1.4.1 SQL
I added the following two options:
spark.sql.planner.sortMergeJoin=false
spark.shuffle.reduceLocality.enabled=false
But the performance is still the same as without setting them.
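For reference, a minimal sketch of one way these two options could be supplied, assuming they are given at launch time rather than changed in a running session. The choice of conf/spark-defaults.conf is an assumption for illustration; the thread does not say how the options were actually set. The same key/value pairs can also be passed as --conf key=value to spark-submit or spark-shell.

```
# conf/spark-defaults.conf (assumed location; adjust to your deployment)
# Spark SQL planner option: fall back from sort-merge join to the 1.4-style hash join
spark.sql.planner.sortMergeJoin        false
# Core Spark (not Spark SQL) option: disable locality preference for reduce tasks
spark.shuffle.reduceLocality.enabled   false
```

Whether the values took effect can be checked on the Environment tab of the Spark UI, which lists the Spark properties of the running application.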
One thing is that on the spark ui, when I click the
om]
Sent: Friday, September 11, 2015 2:17 PM
To: Cheng, Hao
Cc: Jesse F Chen; Michael Armbrust; user@spark.apache.org
Subject: Re:RE: spark 1.5 SQL slows down dramatically by 50%+ compared with
spark 1.4.1 SQL
Thanks Hao for the reply.
I turned sort-merge join off; the physical plan is
rg
Subject: Re:RE: spark 1.5 SQL slows down dramatically by 50%+ compared with
spark 1.4.1 SQL
Thanks Hao for the reply.
I turned sort-merge join off; the physical plan is below, but the performance
is roughly the same as with it on...
== Physical Plan ==
TungstenProject
[ss_quantity#10,ss_lis
brust
Cc: Todd; user@spark.apache.org
Subject: Re: spark 1.5 SQL slows down dramatically by 50%+ compared with spark
1.4.1 SQL
Could this be a build issue (i.e., sbt package)?
If I run the same jar built for 1.4.1 on 1.5, I see a large regression too
in queries (all other things identical)...
I am c
@spark.apache.org
Subject: Re: spark 1.5 SQL slows down dramatically by 50%+ compared with spark
1.4.1 SQL
Could this be a build issue (i.e., sbt package)?
If I run the same jar built for 1.4.1 on 1.5, I see a large regression too
in queries (all other things identical)...
I am curious, to build 1.5 (when
ecial parameters I should be using to make sure I load the latest
Hive dependencies?
From: Michael Armbrust
To: Todd
Cc: "user@spark.apache.org"
Date: 09/10/2015 11:07 AM
Subject: Re: spark 1.5 SQL slows down dramatically by 50%+ compared with
spark 1.
Thanks Michael for the reply.
Below are the SQL plans for 1.5 and 1.4.1. 1.5 is using SortMergeJoin, while 1.4.1
is using shuffled hash join.
In this case, it seems hash join performs better than sort-merge join.
I've been running TPC-DS SF=1500 daily on Spark 1.4.1 and Spark 1.5 on S3,
so this is surprising. In my experiments Spark 1.5 is either the same or
faster than 1.4 with only small exceptions. A few thoughts,
- 600 partitions is probably way too many for 6G of data.
- Providing the output of ex
Hi,
I am using data generated with
spark-sql-perf (https://github.com/databricks/spark-sql-perf) to test Spark
SQL performance (Spark on YARN, with 10 nodes) with the following code (the
table store_sales has about 90 million records and is 6G in size)
val outputDir="hdfs://tmp/spark_perf/scaleFact