I'm really confused and could use some help understanding this. The Hive
documentation at
https://cwiki.apache.org/confluence/display/Hive/LanguageManual+DDL+BucketedTables
says:
"Bucketed tables are fantastic in that they allow much more efficient sampling
than do non-bucketed tables, and they may l
Gopal,
Thanks for taking the time to try and help. A few things in relation to your
response:
* Yes, the 'epoch' column is an hourly timestamp. Clustering by a column with
high cardinality would make little sense.
* I'm interested in your statement that CLUSTERED BY does not CLUSTER BY. My