I believe many users like us want to expose the output from Camus as a Hive
external table. But the directory structure Camus writes looks like
/YYYY/MM/DD/xxxxxx

while Hive generally expects /year=YYYY/month=MM/day=DD/xxxxxx if you
define the table to be partitioned by (year, month, day). Otherwise you
have to add the partitions created by Camus through a separate command.
But in that case, can a single Camus job create more than one partition?
And how would we find out the YYYY/MM/DD values from outside? You could
always run hadoop dfs -ls and grep the output, but that is not very clean.
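
To make the "separate command" option concrete, here is a rough sketch in
Python of what I have in mind. It only does the string work: it takes
directory paths (however you obtain them, e.g. from a listing) and builds one
ALTER TABLE ... ADD PARTITION statement per date directory. The table name
"events" and the /camus/topic prefix are made up for illustration, and the
/YYYY/MM/DD layout is assumed from what I described above, so treat this as a
sketch rather than a tested solution.

```python
import re

# Assumption: date directories end in /YYYY/MM/DD, as described in this message.
DATE_DIR = re.compile(r"/(\d{4})/(\d{2})/(\d{2})$")


def add_partition_statement(table, directory):
    """Build a Hive ADD PARTITION statement for one date directory.

    Returns None when the path does not end in /YYYY/MM/DD.
    """
    directory = directory.rstrip("/")
    match = DATE_DIR.search(directory)
    if match is None:
        return None
    year, month, day = match.groups()
    return (
        f"ALTER TABLE {table} ADD IF NOT EXISTS "
        f"PARTITION (year='{year}', month='{month}', day='{day}') "
        f"LOCATION '{directory}';"
    )


def statements_for(table, directories):
    """Return statements for every path that looks like a date directory."""
    statements = (add_partition_statement(table, d) for d in directories)
    return [s for s in statements if s is not None]


if __name__ == "__main__":
    # Hypothetical paths; a job that spans midnight would yield two partitions.
    for stmt in statements_for("events", ["/camus/topic/2014/03/07",
                                          "/camus/topic/2014/03/08/"]):
        print(stmt)
```

The IF NOT EXISTS clause makes it safe to rerun after every job, but it still
depends on listing the directories from outside, which is the part I was
hoping to avoid.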


thanks
yang
