> Or a simple insert will be automatically sorted as the table DDL mention ?

A simple insert should do the sorting. Older versions of Hive had the ability
to disable that (which is a bad thing, and therefore these settings are now
just hard-coded to true in Hive 3.x):

-- set hive.enforce.bucketing=true;
-- set hive.enforce.sorting=true; 

It will pick 8 reducers as the default count, which might not work for the
number of partitions you have.

set hive.optimize.sort.dynamic.partition=true;

is what was used to fix these sorts of reducer count issues when you are
using bucketing + partitioning on a table (using bucketing without
partitioning doesn't need that).
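
As a rough sketch of how these pieces fit together (the table and column
names here are hypothetical, and the bucket count of 8 is only an assumption
chosen to match the reducer count mentioned above), a bucketed + partitioned
load could look like this:

```sql
-- Hypothetical target table: partitioned by day, 8 buckets, sorted per bucket.
CREATE TABLE sales_bucketed (
  customer_id BIGINT,
  amount      DECIMAL(10,2)
)
PARTITIONED BY (sale_date STRING)
CLUSTERED BY (customer_id) SORTED BY (customer_id) INTO 8 BUCKETS
STORED AS ORC;

-- Let every partition value come from the data (fully dynamic partitioning).
SET hive.exec.dynamic.partition.mode=nonstrict;

-- The setting discussed above, for bucketing + partitioning together.
SET hive.optimize.sort.dynamic.partition=true;

-- A plain insert: no explicit DISTRIBUTE BY / SORT BY is written here,
-- the bucketing and sort order are taken from the table DDL.
-- sales_staging is a hypothetical source table with matching columns.
INSERT OVERWRITE TABLE sales_bucketed PARTITION (sale_date)
SELECT customer_id, amount, sale_date
FROM sales_staging;
```

The dynamic partition column goes last in the SELECT list, which is how Hive
maps it to the PARTITION (sale_date) clause.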

With every test run, I end up inserting 3 TB or so into 2500 partitions using
these settings.

https://github.com/hortonworks/hive-testbench/blob/hdp3/settings/load-partitioned.sql

Cheers,
Gopal


