Re: TPC-H queries on Hive 0.12

Yin Huai Fri, 22 Nov 2013 11:45:17 -0800

As I remember, those scripts store the tables as text files. With 0.12, I
think ORC should be used instead. Also, I think the sub-queries should be
merged into a single query. Within a single query, if a reduce-side join is
converted to a map join, that map join can be merged into its child job.
But if the join is evaluated as a separate query, Hive has to run it as a
standalone map-only job, because it does not know that the job only
produces intermediate results. For query 17 and query 18, once each is
written as a single query, the Correlation Optimizer should be able to
optimize them (set hive.optimize.correlation=true).
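
A rough sketch of what that could look like for query 17, not taken from
the HIVE-600 scripts. It assumes text-format tables named lineitem and part
already exist with the standard TPC-H columns; lineitem_orc, the aliases,
and the 'Brand#23' / 'MED BOX' parameter values are illustrative choices.
The join settings are the usual map-join conversion switches and may
already be enabled in your configuration.

```sql
-- Assumption: `lineitem` is an existing text-format TPC-H table.
-- Copy it into an ORC-backed table (hypothetical name lineitem_orc).
CREATE TABLE lineitem_orc STORED AS ORC AS
SELECT * FROM lineitem;

-- Let Hive convert eligible reduce-side joins to map joins and
-- fold them into neighbouring jobs.
SET hive.auto.convert.join=true;
SET hive.auto.convert.join.noconditionaltask=true;

-- Correlation Optimizer, as suggested above.
SET hive.optimize.correlation=true;

-- Query 17 as one statement: the per-part average is a derived table
-- in the FROM clause instead of a separately materialized result.
SELECT SUM(l.l_extendedprice) / 7.0 AS avg_yearly
FROM lineitem_orc l
JOIN part p
  ON p.p_partkey = l.l_partkey
JOIN (
  SELECT l_partkey AS t_partkey,
         0.2 * AVG(l_quantity) AS t_avg_quantity
  FROM lineitem_orc
  GROUP BY l_partkey
) t
  ON t.t_partkey = l.l_partkey
WHERE p.p_brand = 'Brand#23'
  AND p.p_container = 'MED BOX'
  AND l.l_quantity < t.t_avg_quantity;
```

Both the aggregation and the join key on l_partkey, which is the kind of
shared key the Correlation Optimizer looks for. Comparing the EXPLAIN
output with the setting on and off should show whether jobs were merged.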


Thanks,

Yin


On Fri, Nov 22, 2013 at 1:31 PM, Avrilia Floratou <
avrilia.flora...@gmail.com> wrote:

> Hello,
>
> I'd like to run a few TPC-H queries on Hive 0.12. I've found the TPC-H
> scripts here:
>
> https://issues.apache.org/jira/browse/HIVE-600.
>
> but noticed that these scripts were generated a long time ago. Since Hive
> did not support the full SQL-92 specification at the time, some queries
> were split into smaller sub-queries whose results were materialized. Is
> there any change in HiveQL (in Hive 0.12) that would affect the way the
> TPC-H queries are written?
>
> Thanks,
> Avrilia
>
