Hi All,

My raw data looks like this:
   DateTime,OtherData
   01-01-2000-01:00:00,blablabla1
   01-01-2000-04:00:00,blablabla2
   01-02-2000-02:00:00,blablabla3

I would like to partition on the date part of DateTime.  What does *not* work,
unfortunately, is this:
   Create table mytable (DateTime string, OtherData string)
   Partitioned by (*substr(DateTime,1,10)* string);

I *wish* my raw data instead looked like:
   Date*,Time*,OtherData
   01-01-2000*,01:00:00*,blablabla1
   01-01-2000*,04:00:00*,blablabla2
   01-02-2000*,02:00:00*,blablabla3

...with Time as a distinct field.  Then I could use:
   Create table mytable (Time string, OtherData string)
   Partitioned by (Date string);

Any ideas for the best way to load my raw data into a Hive table
partitioned by the date part of DateTime?  The files are ginormous, so
manipulating the raw data outside of Hive is not feasible for this problem.
I would also like to avoid using Select in the solution, since my Hive
table will refer to zipped data (and the Select would therefore come with a
big runtime cost).
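
For reference, the Select-based route I am hoping to avoid would look
roughly like the sketch below.  It assumes a Hive version that supports
dynamic-partition inserts; mytable_raw and the LOCATION path are made-up
names for illustration, and Date is backquoted in case it is a reserved
word in the Hive version at hand.

```sql
-- Staging table over the raw (zipped) files, left exactly as they are.
-- mytable_raw and the path are hypothetical.
CREATE EXTERNAL TABLE mytable_raw (DateTime STRING, OtherData STRING)
ROW FORMAT DELIMITED FIELDS TERMINATED BY ','
LOCATION '/path/to/raw';

-- Target table, partitioned by the date part only.
CREATE TABLE mytable (Time STRING, OtherData STRING)
PARTITIONED BY (`Date` STRING);

-- Let Hive derive the partition value from the data itself.
SET hive.exec.dynamic.partition=true;
SET hive.exec.dynamic.partition.mode=nonstrict;

-- The partition column must come last in the select list.
-- Characters 1-10 are the date, character 11 is the separator,
-- and the time starts at character 12.
INSERT OVERWRITE TABLE mytable PARTITION (`Date`)
SELECT substr(DateTime, 12)    AS Time,
       OtherData,
       substr(DateTime, 1, 10) AS `Date`
FROM mytable_raw;
```

This reads and rewrites every row once, which is exactly the cost I am
worried about on files of this size.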

Thanks!!
Dan
