[ 
https://issues.apache.org/jira/browse/HIVE-8367?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Alan Gates updated HIVE-8367:
-----------------------------
    Attachment: HIVE-8367.patch

The problem shows up when the input is large enough to need more than one map 
task.
This patch fixes it by turning on reduce deduplication in the optimizer (it 
was previously being turned off) and lowering the minimum number of reducers 
from 4 to 1. As a side effect, this halves the time it takes to do an update 
or delete.
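
For illustration only: the patch makes this change inside the planner for 
update and delete statements, not through user configuration. The message 
above does not name any properties. Assuming the two settings it describes map 
to the standard Hive properties below, a session-level sketch would look 
roughly like this:

```sql
-- Assumption: these property names are mapped from the description above,
-- they are not quoted from the patch.
-- Allow the optimizer to merge redundant reduce stages.
set hive.optimize.reducededuplication=true;
-- Apply reduce deduplication even when only one reducer is planned
-- (the stock default for this threshold is 4).
set hive.optimize.reducededuplication.min.reducer=1;
```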

> delete writes records in wrong order in some cases
> --------------------------------------------------
>
>                 Key: HIVE-8367
>                 URL: https://issues.apache.org/jira/browse/HIVE-8367
>             Project: Hive
>          Issue Type: Bug
>          Components: Query Processor
>    Affects Versions: 0.14.0
>            Reporter: Alan Gates
>            Assignee: Alan Gates
>            Priority: Blocker
>             Fix For: 0.14.0
>
>         Attachments: HIVE-8367.patch
>
>
> I have found one query with 10k records where you do:
> create table
> insert into table -- 10k records
> delete from table -- just some records
> The records in the delete delta are not ordered properly by rowid.
> I assume this applies to updates as well, but I haven't tested it yet.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
