[ 
https://issues.apache.org/jira/browse/HIVE-8368?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Alan Gates updated HIVE-8368:
-----------------------------
    Attachment: HIVE-8367.patch

The issue appears when the input is large enough that it requires more than one 
map task.  

This patch fixes it by turning on reduce deduplication in the optimizer (which 
was previously being turned off) and lowering the minimum number of reducers 
from 4 to 1.  As a side effect, this halves the time it takes to do an update 
or delete.
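
For reference, the two optimizer changes described above look like they map 
onto the standard Hive properties shown below.  This is only a sketch of the 
equivalent session-level settings, on the assumption that these are the 
properties involved; the patch may well apply the change differently.

```
-- Assumption: reduce deduplication is controlled by this optimizer flag.
SET hive.optimize.reducededuplication=true;

-- Assumption: this is the "minimum number of reducers" threshold
-- (stock default 4), lowered to 1 as described above.
SET hive.optimize.reducededuplication.min.reducer=1;
```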

> compactor is improperly writing delete records in base file
> -----------------------------------------------------------
>
>                 Key: HIVE-8368
>                 URL: https://issues.apache.org/jira/browse/HIVE-8368
>             Project: Hive
>          Issue Type: Bug
>          Components: Transactions
>    Affects Versions: 0.14.0
>            Reporter: Alan Gates
>            Assignee: Alan Gates
>            Priority: Critical
>             Fix For: 0.14.0
>
>         Attachments: HIVE-8367.patch
>
>
> When the compactor reads records from the base and deltas, it is not properly 
> dropping delete records.  This leads to oversized base files, and possibly to 
> wrong query results.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
