Hey Sean, it's because you are appending a file to the same partition with the same name (which is not possible). You must change the file name before appending it to the same partition.
AFAIK, I don't think there is any other way to do that; you can change either the partition name or the file name.

Thanks
Vikas Srivastava

On Tue, Mar 20, 2012 at 6:45 AM, Sean McNamara <sean.mcnam...@webtrends.com> wrote:

> Is there a way to prevent LOAD DATA LOCAL INPATH from appending _copy_1
> to logs that already exist in a partition? If the log is already in
> hdfs/hive I'd rather it fail and give me a return code or output saying
> that the log already exists.
>
> For example, if I run these queries:
> /usr/local/hive/bin/hive -e "LOAD DATA LOCAL INPATH 'test_a.bz2' INTO
> TABLE logs PARTITION(ds='2012-03-19', hr='23')"
> /usr/local/hive/bin/hive -e "LOAD DATA LOCAL INPATH 'test_b.bz2' INTO
> TABLE logs PARTITION(ds='2012-03-19', hr='23')"
> /usr/local/hive/bin/hive -e "LOAD DATA LOCAL INPATH 'test_b.bz2' INTO
> TABLE logs PARTITION(ds='2012-03-19', hr='23')"
> /usr/local/hive/bin/hive -e "LOAD DATA LOCAL INPATH 'test_b.bz2' INTO
> TABLE logs PARTITION(ds='2012-03-19', hr='23')"
>
> I end up with:
> test_a.bz2
> test_b.bz2
> test_b_copy_1.bz2
> test_b_copy_2.bz2
>
> However, if I use OVERWRITE it will nuke all the data in the partition
> (including test_a.bz2) and I end up with just:
> test_b.bz2
>
> I recall that older versions of hive would not do this. How do I handle
> this case? Is there a safe atomic way to do this?
>
> Sean
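
The fail-if-it-already-exists behaviour asked about above could be approximated outside Hive by checking the partition directory before loading. What follows is only a minimal sketch under stated assumptions, not a Hive feature: `load_if_absent` is a hypothetical helper, `WAREHOUSE_DIR` assumes the table is a managed table stored under the default-style warehouse layout (`<warehouse>/<table>/ds=.../hr=...`), and `hive` and `hadoop` are assumed to be on the PATH. The check and the load are two separate steps, so this is not atomic; two loaders racing on the same file could both pass the check.

```shell
#!/bin/sh
# Sketch: refuse to load a log whose file name already exists in the partition.
# WAREHOUSE_DIR is an assumption; check hive.metastore.warehouse.dir for your setup.
WAREHOUSE_DIR="${WAREHOUSE_DIR:-/user/hive/warehouse}"

# load_if_absent FILE TABLE DS HR   (hypothetical helper, not part of Hive)
# Returns 0 after a successful load, 1 if a file with the same name is already
# in the partition directory, or hive's own exit status if the load fails.
load_if_absent() {
    file="$1"; table="$2"; ds="$3"; hr="$4"
    target="$WAREHOUSE_DIR/$table/ds=$ds/hr=$hr/$(basename "$file")"

    # `hadoop fs -test -e PATH` exits 0 when PATH exists in HDFS.
    if hadoop fs -test -e "$target"; then
        echo "already loaded: $target" >&2
        return 1
    fi

    hive -e "LOAD DATA LOCAL INPATH '$file' INTO TABLE $table PARTITION(ds='$ds', hr='$hr')"
}
```

With the example from the thread, `load_if_absent test_b.bz2 logs 2012-03-19 23` would load the file the first time and return 1 on later attempts instead of creating `test_b_copy_1.bz2`, provided the assumed directory layout matches the actual table location.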