Hi Hive users,

I am evaluating the performance of Hive 4.0.1 and 4.1, and I wonder if any team has deployed Hive-LLAP in production (using vanilla Hive) and observed a similar level of performance improvement when switching from Hive-Tez to Hive-LLAP.
In the attached Excel document, I report the results of running the 10TB TPC-DS benchmark on a 13-node cluster using the following systems, including two reference systems:

1. Hive 4.0.1 on Tez, Java 8
2. Hive-LLAP 4.0.1, Java 8
3. Hive 4.1 on Tez, Java 22
4. Hive-LLAP 4.1, Java 22
5. HDP 3.1.0, Java 8 (Hortonworks Data Platform, based on Hive 3.1 with lots of patches backported)
6. Spark 4.0.0, Java 22

Comparing Hive 4.0.1 on Tez with Hive-LLAP 4.0.1, the total running time decreases from 12706s to 10019s (about 20%), which is not very impressive, especially considering the overhead of creating worker containers for each query in Hive on Tez. On the other hand, the geometric mean decreases from 56s to 31s, so the result seems reasonable.

For comparison, HDP 3.1.0, which was released more than 5 years ago and is based on Hive 3.1, finishes in 9158s. (You can add a few hundred seconds because it uses a slight variant of the TPC-DS benchmark.)

A similar observation can be made for Hive 4.1. The total running time decreases from 10912s to 8262s (about 25%), and the geometric mean decreases from 50s to 26.9s. Note that Hive-LLAP 4.1 is faster than HDP 3.1.0, but it uses Java 22 instead of Java 8. (Query 72 fails due to MapJoinMemoryExhaustionError.)

So, do you think that this result aligns with your expectations for Hive-Tez vs Hive-LLAP?

Thanks,
Sungwoo
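
P.S. In case anyone wants to double-check the percentages quoted above, here is a minimal Python sketch that recomputes the relative reductions from the numbers in this message. The function name `pct_reduction` and the dictionary labels are only illustrative; the timings are the ones reported above, not new measurements.

```python
def pct_reduction(before, after):
    """Relative decrease from `before` to `after`, in percent."""
    return (before - after) / before * 100.0


# (Hive on Tez, Hive-LLAP) timings in seconds, copied from the message above.
timings = {
    "Hive 4.0.1 total": (12706, 10019),
    "Hive 4.0.1 geometric mean": (56, 31),
    "Hive 4.1 total": (10912, 8262),
    "Hive 4.1 geometric mean": (50, 26.9),
}

reductions = {name: pct_reduction(tez, llap) for name, (tez, llap) in timings.items()}

for name, pct in reductions.items():
    print(f"{name}: {pct:.1f}% lower with LLAP")
```

The geometric mean drops noticeably more than the total in both releases, which would be consistent with LLAP helping short queries the most, although the spreadsheet is needed to confirm that per query.
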
Attachment: tpcds.tez.xlsx (MS-Excel 2007 spreadsheet)