Re: Hive on Tez much slower than MR

2015-08-06 Thread William Slacum
in my cluster. The documentation regarding this is odd as it says ORC is required, but none of my tables are using ORC. On Aug 5, 2015, at 3:48 PM, William Slacum wrote: Hi all, I'm using Hive 0.14, Tez

Hive on Tez much slower than MR

2015-08-05 Thread William Slacum
Hi all, I'm using Hive 0.14, Tez 0.5.2, and Hadoop 2.6.0. I have a very simple query of the form `select count(*) from my_table where x > 0 and x < 1500`. The table has ~50 columns in it and not all are populated. My total dataset size is ~20TB. When I run with MapReduce, I can generally see a m

Origin of hive.auto.convert.sortmerge.join.noconditionaltask

2015-08-04 Thread William Slacum
Hi all, I've had some questions from users regarding setting `hive.auto.convert.sortmerge.join.noconditionaltask`. I see, in some documentation from users and vendors, that it is recommended to set this parameter. In neither Hive 0.12 nor 0.14 can I find in HiveConf where this is actually defined