Ferdinand Xu created HIVE-14919:
-----------------------------------

             Summary: Improve the performance of Hive on Spark 2.0.0
                 Key: HIVE-14919
                 URL: https://issues.apache.org/jira/browse/HIVE-14919
             Project: Hive
          Issue Type: Improvement
            Reporter: Ferdinand Xu
            Assignee: Ferdinand Xu
         Attachments: benchmark.xlsx

In HIVE-14029, we updated the Spark dependency to 2.0.0. We ran the Intel BigBench [1] benchmark over a 10 GB data set and compared the results with those from Spark 1.6. Performance degraded considerably for all of the BigBench queries. For detailed information, please see the attached benchmark.xlsx. This JIRA is the umbrella ticket for addressing those performance issues.

[1] https://github.com/intel-hadoop/Big-Data-Benchmark-for-Big-Bench

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)