[ 
https://issues.apache.org/jira/browse/HIVE-14919?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ferdinand Xu updated HIVE-14919:
--------------------------------
    Description: 
In HIVE-14029, we updated the Spark dependency to 2.0.0. We used Intel 
BigBench[1] to benchmark Spark 2.0 against Spark 1.6 over a 1 TB data set. We 
saw a performance improvement of about 5.4% in general and 45% in the best 
case. However, some queries do not show significant performance improvements. 
This JIRA is the umbrella ticket for addressing those performance issues.

[1] https://github.com/intel-hadoop/Big-Data-Benchmark-for-Big-Bench

  was:
In HIVE-14029, we updated the Spark dependency to 2.0.0. We used Intel 
BigBench[1] to benchmark Spark 2.0 against Spark 1.6 over a 10 GB data set. We 
saw considerable performance degradation for most of the BigBench queries. For 
detailed information, please see the attached file. This JIRA is the umbrella 
ticket for addressing those performance issues.

[1] https://github.com/intel-hadoop/Big-Data-Benchmark-for-Big-Bench


> Improve the performance of Hive on Spark 2.0.0
> ----------------------------------------------
>
>                 Key: HIVE-14919
>                 URL: https://issues.apache.org/jira/browse/HIVE-14919
>             Project: Hive
>          Issue Type: Improvement
>            Reporter: Ferdinand Xu
>            Assignee: Ferdinand Xu
>
> In HIVE-14029, we updated the Spark dependency to 2.0.0. We used Intel 
> BigBench[1] to benchmark Spark 2.0 against Spark 1.6 over a 1 TB data set. We 
> saw a performance improvement of about 5.4% in general and 45% in the best 
> case. However, some queries do not show significant performance improvements. 
> This JIRA is the umbrella ticket for addressing those performance issues.
> [1] https://github.com/intel-hadoop/Big-Data-Benchmark-for-Big-Bench



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
