> Or a simple insert will be automatically sorted as the table DDL mention ?
A simple insert should do the sorting. Older versions of Hive had the ability
to disable that (which is a bad thing, and therefore these settings are now
just hard-coded to true in Hive 3.x):
-- set hive.enforce.bucketing=true;
-- set hive.enforce.sorting=true;
It will pick 8 reducers as the default count, which might not work for the
number of partitions you have.
set hive.optimize.sort.dynamic.partition=true;
is what was used to fix these sorts of reducer count issues when you are using
bucketing + partitioning on a table (using bucketing without partitioning
doesn't need that).
With every test run I end up inserting 3 TB or so into 2500 partitions using
these settings.
https://github.com/hortonworks/hive-testbench/blob/hdp3/settings/load-partitioned.sql
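For illustration, here is a minimal sketch of how the settings above fit
together for a table that is both bucketed and partitioned. The table and
column names (sales_bucketed, sales_staging, customer_id, sale_date) are
made up for this example, and the two hive.exec.dynamic.partition settings
are an assumption on my part about what a dynamic-partition insert usually
needs; they are not taken from the linked script, so check them against your
own setup.

```sql
-- Hypothetical target table: partitioned by date, bucketed and sorted by customer.
CREATE TABLE sales_bucketed (
  order_id    BIGINT,
  customer_id BIGINT,
  amount      DECIMAL(10,2)
)
PARTITIONED BY (sale_date STRING)
CLUSTERED BY (customer_id) SORTED BY (customer_id) INTO 8 BUCKETS
STORED AS ORC;

-- Only relevant on older Hive versions; hard-coded to true in Hive 3.x.
-- set hive.enforce.bucketing=true;
-- set hive.enforce.sorting=true;

-- Assumed here so the insert can create partitions dynamically.
set hive.exec.dynamic.partition=true;
set hive.exec.dynamic.partition.mode=nonstrict;

-- The setting discussed above, for bucketing + partitioning together.
set hive.optimize.sort.dynamic.partition=true;

-- A plain insert; the dynamic partition column goes last in the SELECT.
-- sales_staging is a hypothetical unbucketed source table.
INSERT OVERWRITE TABLE sales_bucketed PARTITION (sale_date)
SELECT order_id, customer_id, amount, sale_date
FROM sales_staging;
```

No explicit SORT BY or DISTRIBUTE BY is written in the insert; the sorting
comes from the table DDL, which is the point of the answer above.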
Cheers,
Gopal