> Correct hence the question as I have done some preliminary tests on Hive
> 2.
> I want to share insights with other people who have performed the same
If you have feedback on Hive-2.0, I'm all ears. I'm building up 2.1 features & fixes, so now would be a good time to bring stuff up.

Speed mostly depends on whether you're using Hive-2.0 with LLAP or not. If you're using the old engines, the plans still get much better (even for MR). Tez does get some extra stuff out of it, like the new shuffle join vertex manager (hive.optimize.dynamic.partition.hashjoin).

LLAP will still win out for <10s queries, because it takes approx 10 mins for all the auto-generated vectorized classes to get JIT'd into tight SIMD loops, and the long-running LLAP daemons keep that JIT'd code warm across queries.

For something like TPC-H Q1, you can slowly see it turning all the null checks into UncommonTrapBlob as the JIT learns about the data & finds .noNulls is always true.

Cheers,
Gopal
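
For anyone who wants to try the settings mentioned in this reply, here is a minimal sketch of the session-level config. It assumes a Hive 2.0 session on Tez. Only hive.optimize.dynamic.partition.hashjoin is named in the reply; the other properties are assumptions about a typical setup, and the LLAP line only applies if LLAP daemons are actually deployed on the cluster.

```sql
-- Assumption: Tez is the execution engine (the new vertex manager is a Tez feature).
set hive.execution.engine=tez;

-- Assumption: vectorized execution is on, since the JIT discussion is about vectorized classes.
set hive.vectorized.execution.enabled=true;

-- Named in the reply: enables the dynamically partitioned (shuffle) hash join.
set hive.optimize.dynamic.partition.hashjoin=true;

-- Assumption: LLAP daemons are running; omit this line on a plain Tez setup.
set hive.llap.execution.mode=all;
```

When comparing timings, it is probably worth running the same query several times after the daemons have been up for a while, since the first runs will include the JIT warm-up described above.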