Re: Hive 3 has big performance improvement from my test

2023-01-07 Thread Sungwoo Park
> [image: image.png]
> from your posting, the result is amazing. glad to know hive on mr3 has that nice performance.

Hive on MR3 is similar to Hive-LLAP in performance, so we can interpret the above result as Hive being much faster than SparkSQL. For executing concurrent queries, the per

Re: Hive 3 has big performance improvement from my test

2023-01-07 Thread David
I spent some time over the past couple of years making micro-optimizations within Avro, Parquet, and ORC. Curious to know if there's a way for you all to get timings at different levels of the stack to compare, and not just look at the top-line numbers. A further breakdown could also help identify area

Re: Hive 3 has big performance improvement from my test

2023-01-07 Thread ypeng
[image: image.png]

from your posting, the result is amazing. glad to know hive on mr3 has that nice performance.

regards.

On Sat, Jan 7, 2023 at 11:29 PM Sungwoo Park wrote:
> In fact, Hive 3 has been much faster than Spark for a long time. For complex queries, Hive 3 is much faster than Pr

Re: Hive 3 has big performance improvement from my test

2023-01-07 Thread Mich Talebzadeh
Thanks for this insight, guys. On your point below, and I quote: ... "It's even as fast as Spark by using the default mr engine". OK, as we are all experimentalists, are we stating that the classic MapReduce computation can outdo Spark's in-memory computation? I would be curious to know this. Than

Re: Hive 3 has big performance improvement from my test

2023-01-07 Thread Sungwoo Park
In fact, Hive 3 has been much faster than Spark for a long time. For complex queries, Hive 3 is much faster than Presto (or Trino) as well. The reality is different from common beliefs about Hive, Spark, and Presto. If interested, see the results of a performance comparison using the TPC-DS benchmark. P