It would help to state the Hive version up front, because it matters here (for
example, TEZ-3709 doesn't apply to hive-1.2).
> I’m trying to simply change the format of a very large partitioned table from
> JSON to ORC. I’m finding that it is unexpectedly resource intensive,
> primarily due to a shu
Hi Everyone,
I am trying to insert data from two tables into one table as separate columns.
Example:
Table1 as A:
Id | Data | time_stamp
1  | 0.1  | 2018-01-01
2  | 0.2  | 2018-01-01
3  | 0.3  | 2018-01-02
Table2 as B:
Id | Data | time_stamp
1  | 1.1  | 2018-01-01
2  | 2.2  | 2018-01-01
3  | 1.3  | 2018-01-02
Now I a
Hi Elliot,
From your description of the problem, I'm assuming that you are doing an
INSERT OVERWRITE table PARTITION(p1, p2) SELECT * FROM table
or something close, like a CREATE TABLE AS ... maybe.
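For concreteness, a statement of that shape might look like the sketch below. The table and column names (events_json, events_orc, col_a, col_b) are hypothetical and only stand in for the real ones. The two SET lines are the usual Hive switches for dynamic partitioning and may already be enabled in your setup.

```sql
-- Hypothetical names throughout; a sketch, not the original poster's query.
SET hive.exec.dynamic.partition=true;
SET hive.exec.dynamic.partition.mode=nonstrict;

-- Rewrite every partition of the JSON-backed table into the ORC-backed table.
-- With dynamic partitioning, the partition columns (p1, p2) must come last
-- in the SELECT list, in the same order as in the PARTITION clause.
INSERT OVERWRITE TABLE events_orc PARTITION (p1, p2)
SELECT col_a, col_b, p1, p2
FROM events_json;
```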
If this is the case, I suspect that your shuffle phase comes from dynamic
partitioning, and in pa