[ https://issues.apache.org/jira/browse/HIVE-8367?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Alan Gates updated HIVE-8367:
-----------------------------
    Attachment: HIVE-8367.patch

The issue comes out when input sizes are large enough that they exceed one map task. This patch fixes it by turning on reduce deduplication in the optimizer (which was being turned off before) and dropping the minimum number of reducers to 1 (instead of 4). This has the side effect of halving the time it takes to do an update or delete.

> delete writes records in wrong order in some cases
> --------------------------------------------------
>
>                 Key: HIVE-8367
>                 URL: https://issues.apache.org/jira/browse/HIVE-8367
>             Project: Hive
>          Issue Type: Bug
>          Components: Query Processor
>    Affects Versions: 0.14.0
>            Reporter: Alan Gates
>            Assignee: Alan Gates
>            Priority: Blocker
>             Fix For: 0.14.0
>
>         Attachments: HIVE-8367.patch
>
>
> I have found one query with 10k records where you do:
> create table
> insert into table -- 10k records
> delete from table -- just some records
> The records in the delete delta are not ordered properly by rowid.
> I assume this applies to updates as well, but I haven't tested it yet.

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)