[ https://issues.apache.org/jira/browse/HIVE-8371?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14161227#comment-14161227 ]
Thiruvel Thirumoolan commented on HIVE-8371:
--------------------------------------------

[~sushanth] Let me know what you think about this.

> HCatStorer should fail by default when publishing to an existing partition
> --------------------------------------------------------------------------
>
>                 Key: HIVE-8371
>                 URL: https://issues.apache.org/jira/browse/HIVE-8371
>             Project: Hive
>          Issue Type: Bug
>          Components: HCatalog
>    Affects Versions: 0.13.0, 0.14.0, 0.13.1
>            Reporter: Thiruvel Thirumoolan
>            Assignee: Thiruvel Thirumoolan
>              Labels: hcatalog, partition
>
> In Hive 0.12 and before (or in previous HCatalog releases), HCatStorer would
> fail if the partition already existed (either before launching the job or
> during commit, depending on the partitioning). HIVE-6406 changed that behavior
> so that HCatStorer appends by default. This causes data quality issues, since
> a rerun (or duplicate run) no longer fails as it used to and instead silently
> appends to the partition.
> A preferable approach would be to keep the earlier HCatStorer behavior (fail
> on a duplicate publish) as the default and support append through an option.
> Overwrite could be implemented in a similar fashion. E.g.:
> store A into 'db.table' using
> org.apache.hive.hcatalog.pig.HCatStorer('partspec', '', ' -append');

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)