[ https://issues.apache.org/jira/browse/HIVE-8401?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Ashutosh Chauhan updated HIVE-8401:
-----------------------------------
    Fix Version/s:     (was: 0.15.0)

> OrcFileMergeOperator only close last orc file it opened, which resulted in 
> stale data in table directory
> --------------------------------------------------------------------------------------------------------
>
>                 Key: HIVE-8401
>                 URL: https://issues.apache.org/jira/browse/HIVE-8401
>             Project: Hive
>          Issue Type: Bug
>    Affects Versions: 0.14.0
>         Environment: Windows Server
>            Reporter: Xiaobing Zhou
>            Assignee: Xiaobing Zhou
>            Priority: Critical
>             Fix For: 0.14.0
>
>         Attachments: HIVE-8401.1.patch, alter_merge_2_orc.q.out
>
>
> Run the test
> {noformat}
> mvn -Phadoop-2  test -Dtest=TestCliDriver -Dqfile=alter_merge_2_orc.q
> {noformat}
> to reproduce it. In short, the query performs three data loads, which 
> generate three ORC files; ALTER TABLE CONCATENATE then tries to merge these 
> files into a single one, which is the final file to be queried.
> The output 
> \hive\itests\qtest\target\qfile-results\clientpositive\alter_merge_2_orc.q.out
>  shows the record count as 600, which is wrong; 610 is expected.
> Because OrcFileMergeOperator only closes the last ORC file it opened, the 
> 1st and 2nd ORC files are still open when MoveTask tries to copy the merged 
> ORC file from the scratch dir to the table dir. The cleanup of old data 
> cannot delete the unclosed files, so they remain in the table directory. As 
> a result, the query reads the old data (the 1st and 2nd ORC files).
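
The attached patch is not reproduced in this message. As a rough illustration of the failure mode described above, here is a minimal Java sketch of the general remedy: remember every reader that gets opened and close all of them, not only the most recent one. The class name TrackedReaders and its methods are hypothetical and are not part of Hive or of HIVE-8401.1.patch.

```java
import java.io.Closeable;
import java.io.IOException;
import java.util.ArrayList;
import java.util.List;

/** Hypothetical helper (not Hive code): tracks every opened reader so all can be closed. */
final class TrackedReaders implements Closeable {
    private final List<Closeable> opened = new ArrayList<>();

    /** Registers a newly opened reader and hands it back to the caller. */
    <T extends Closeable> T track(T reader) {
        opened.add(reader);
        return reader;
    }

    /** Closes every tracked reader, not just the last one that was opened. */
    @Override
    public void close() throws IOException {
        IOException first = null;
        for (Closeable c : opened) {
            try {
                c.close();
            } catch (IOException e) {
                // Keep closing the remaining readers; report the first failure at the end.
                if (first == null) {
                    first = e;
                } else {
                    first.addSuppressed(e);
                }
            }
        }
        opened.clear();
        if (first != null) {
            throw first;
        }
    }
}
```

With a structure like this, a merge step that opens three input files would release all three handles before the later move and cleanup step tries to delete them.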



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
