From the average response time analysis: Spark performs better than its total execution time suggests, with an average response time significantly lower than that of Hive on Tez.
For long-running complex queries (like query 24) on large datasets, Hive on Tez can be a better choice than Spark, even with its initial overhead of starting YARN containers.

--- Sungwoo

On Tue, Apr 22, 2025 at 2:52 PM ypeng <yp...@t-online.de> wrote:
>
> Thanks for the doc.
> I am surprised to see spark 4 is even slower than hive on Tez.
>
> [Total Execution Time (Sequential). Trino is the fastest, followed
> closely by Hive on MR3, which significantly outperformed Hive on Tez.
> Spark is the slowest, skewed by a few outlier queries.]
>
> Sungwoo Park:
> > We published a blog that reports the performance evaluation of Trino
> > 468, Spark 4.0.0-RC2, and Hive 4 on Tez/MR3 2.0 using the TPC-DS
> > Benchmark, 10TB scale factor. Hope you find it useful.