From the average response time analysis:

Spark performs better than its total execution time suggests: its average
response time is significantly lower than that of Hive on Tez.
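
As a rough illustration of how the two metrics can disagree, here is a
minimal sketch. It assumes that average response time is computed as the
geometric mean of per-query times, and every number in it is hypothetical,
not taken from the benchmark. The names engine_a and engine_b are
placeholders, not real systems.

```python
import math

def total_time(times):
    """Sum of per-query times: dominated by the slowest queries."""
    return sum(times)

def geometric_mean(times):
    """Geometric mean of per-query times: damps the effect of outliers."""
    return math.exp(sum(math.log(t) for t in times) / len(times))

# Hypothetical per-query times in seconds (NOT benchmark results).
# engine_a is fast on most queries but has two extreme outliers.
engine_a = [10.0] * 98 + [5000.0, 5000.0]
# engine_b is uniformly slower but has no outliers.
engine_b = [40.0] * 100

for name, times in (("engine_a", engine_a), ("engine_b", engine_b)):
    print(f"{name}: total={total_time(times):.0f}s "
          f"geomean={geometric_mean(times):.1f}s")
```

With these made-up inputs, engine_a has the larger total but the smaller
geometric mean, which is the same pattern as an engine whose total is
skewed by a few outlier queries.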

For long-running complex queries (like query 24) on large datasets, Hive on
Tez can be a better choice than Spark, even with its initial overhead of
starting YARN containers.

--- Sungwoo

On Tue, Apr 22, 2025 at 2:52 PM ypeng <yp...@t-online.de> wrote:

>
> Thanks for the doc.
> I am surprised to see that Spark 4 is even slower than Hive on Tez.
>
>
> [Total Execution Time (Sequential). Trino is the fastest, followed
> closely by Hive on MR3, which significantly outperformed Hive on Tez.
> Spark is the slowest, skewed by a few outlier queries.]
>
>
> Sungwoo Park:
> > We published a blog that reports the performance evaluation of Trino
> > 468, Spark 4.0.0-RC2, and Hive 4 on Tez/MR3 2.0 using the TPC-DS
> > Benchmark, 10TB scale factor. Hope you find it useful.
>
>
