[
https://issues.apache.org/jira/browse/HIVE-2296?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13068794#comment-13068794
]
[email protected] commented on HIVE-2296:
-----------------------------------------------------
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/1155/
-----------------------------------------------------------
Review request for hive and Siying Dong.
Summary
-------
Fixes the problem of bad compressed file names by stripping off the file
extension (e.g. ".gz") and re-appending it to the path later.
This addresses bug HIVE-2296.
https://issues.apache.org/jira/browse/HIVE-2296
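For readers who do not want to open the diff, the renaming idea described above can be sketched roughly as follows. This is only an illustration, not the code in Hive.java: the class name CopyNameSketch, the helper copyName, and the assumption that the extension is everything after the last dot are all hypothetical.

```java
public class CopyNameSketch {
    // Hypothetical helper (not taken from the patch): builds the name for the
    // Nth copy of a file while keeping a trailing extension such as ".gz".
    static String copyName(String fileName, int copyIndex) {
        int dot = fileName.lastIndexOf('.');
        // No extension, or a name that only starts with a dot:
        // append the copy suffix at the very end.
        if (dot <= 0) {
            return fileName + "_copy_" + copyIndex;
        }
        String base = fileName.substring(0, dot); // e.g. "000000_0"
        String ext = fileName.substring(dot);     // includes the dot, e.g. ".gz"
        // Insert the copy suffix before the extension, then re-append it.
        return base + "_copy_" + copyIndex + ext;
    }

    public static void main(String[] args) {
        System.out.println(copyName("000000_0.gz", 1)); // 000000_0_copy_1.gz
        System.out.println(copyName("000000_0", 1));    // 000000_0_copy_1
    }
}
```

Keeping the extension last matters because the compression codec is normally chosen from the file suffix, so a name ending in ".gz_copy_1" would likely be read as uncompressed data.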
Diffs
-----
trunk/ql/src/java/org/apache/hadoop/hive/ql/metadata/Hive.java 1148973
trunk/ql/src/test/queries/clientpositive/insert_compressed.q PRE-CREATION
trunk/ql/src/test/results/clientpositive/insert_compressed.q.out PRE-CREATION
Diff: https://reviews.apache.org/r/1155/diff
Testing
-------
Unit tests pass
Thanks,
Franklin
> bad compressed file names from insert into
> ------------------------------------------
>
> Key: HIVE-2296
> URL: https://issues.apache.org/jira/browse/HIVE-2296
> Project: Hive
> Issue Type: Bug
> Affects Versions: 0.8.0
> Reporter: Franklin Hu
> Assignee: Franklin Hu
> Attachments: hive-2296.1.patch, hive-2296.2.patch
>
>
> When INSERT INTO is run on a table with compressed output
> (hive.exec.compress.output=true) and existing files in the table, it may copy
> the new files under bad file names:
> Before INSERT INTO:
> 000000_0.gz
> After INSERT INTO:
> 000000_0.gz
> 000000_0.gz_copy_1
> This causes corrupted output when doing a SELECT * on the table.
> Correct behavior should be to pick a valid filename such as:
> 000000_0_copy_1.gz