Hi All,

My raw data looks like this:

    DateTime,OtherData
    01-01-2000-01:00:00,blablabla1
    01-01-2000-04:00:00,blablabla2
    01-02-2000-02:00:00,blablabla3

I would like to partition on the date part of DateTime. What does *not* work, unfortunately, is this:

    Create table mytable (DateTime string, OtherData string)
    Partitioned by (substr(DateTime,1,10) string);

I *wish* my raw data instead looked like this, with Time as a distinct field:

    Date,Time,OtherData
    01-01-2000,01:00:00,blablabla1
    01-01-2000,04:00:00,blablabla2
    01-02-2000,02:00:00,blablabla3

Then I could use:

    Create table mytable (Time string, OtherData string)
    Partitioned by (Date string);

Any ideas for the best way to load my raw data into a Hive table partitioned by the date part of DateTime?

The files are ginormous, so manipulating the raw data outside of Hive is not feasible for this problem. I would also like to avoid using Select in the solution, since my Hive table will refer to zipped data and the Select would therefore come with a big runtime cost.

Thanks!!

Dan