> On Sept. 6, 2017, 12:08 a.m., Sergey Shelukhin wrote:
> > ql/src/java/org/apache/hadoop/hive/ql/exec/AbstractFileMergeOperator.java
> > Line 231 (original), 256 (patched)
> > <https://reviews.apache.org/r/62098/diff/1/?file=1815918#file1815918line256>
> >
> >     hmm... is this necessary?
> >     1) how does this interact with Hive duplicate file detection that can 
> > potentially happen later?
> >     2) on the cloud, renaming files is very slow (one of the main reasons 
> > for MM tables). We should not rename unless it's really needed.
> 
> Prasanth_J wrote:
>     Yes. This is necessary. If there are no incompatible files then this will 
> essentially be directory rename. If there are incompatible files and if the 
> filename does not match Hive's convention, then this has to go through 
> file-by-file renaming to staging directory. On the cloud (or on-prem), this 
> will only affect users who already have managed or unmanaged hive tables with 
> files externally loaded (via copy or load data). In which case, renaming has 
> to be done to avoid data loss (more info in the jira). Also this patch 
> disables concatenation on external/unmanaged tables so it won't affect users 
> with this patch. 
>     
>     Hive's duplicate file detection, does not delete files if it has _copy_ 
> suffix (which this patch introduces on conflicts).

MM tables don't have staging directories or any renames.
As for the non-MM case, isn't it possible to just leave these files alone and 
not merge them?


- Sergey


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/62098/#review184607
-----------------------------------------------------------


On Sept. 5, 2017, 9:44 p.m., Prasanth_J wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/62098/
> -----------------------------------------------------------
> 
> (Updated Sept. 5, 2017, 9:44 p.m.)
> 
> 
> Review request for hive and Sergey Shelukhin.
> 
> 
> Bugs: HIVE-17403
>     https://issues.apache.org/jira/browse/HIVE-17403
> 
> 
> Repository: hive-git
> 
> 
> Description
> -------
> 
> HIVE-17403: Fail concatenation for unmanaged and transactional tables
> 
> 
> Diffs
> -----
> 
>   ql/src/java/org/apache/hadoop/hive/ql/ErrorMsg.java 
> b3ef9169c25c36c3d6c845f5000874fa78e51f82 
>   ql/src/java/org/apache/hadoop/hive/ql/exec/AbstractFileMergeOperator.java 
> dfad6c192947c9ac80a1bbd86665f46aab128453 
>   ql/src/java/org/apache/hadoop/hive/ql/exec/Utilities.java 
> aca99f2d833822a44f373ded4257af3589707baa 
>   ql/src/java/org/apache/hadoop/hive/ql/io/AcidUtils.java 
> feacdd8b605eb75166155fa2e7a1692ad4d52bd0 
>   ql/src/java/org/apache/hadoop/hive/ql/parse/DDLSemanticAnalyzer.java 
> 230ca47e4a667b297cf2a6fef90dd18cf3d1a1c3 
>   ql/src/test/queries/clientnegative/merge_negative_4.q PRE-CREATION 
>   ql/src/test/queries/clientnegative/merge_negative_5.q PRE-CREATION 
>   ql/src/test/queries/clientpositive/orc_merge13.q PRE-CREATION 
>   ql/src/test/results/clientnegative/merge_negative_3.q.out 
> 906336d4d3ea77ca3174a58fad05668d569f7492 
>   ql/src/test/results/clientnegative/merge_negative_4.q.out PRE-CREATION 
>   ql/src/test/results/clientnegative/merge_negative_5.q.out PRE-CREATION 
>   ql/src/test/results/clientpositive/orc_merge13.q.out PRE-CREATION 
> 
> 
> Diff: https://reviews.apache.org/r/62098/diff/1/
> 
> 
> Testing
> -------
> 
> 
> Thanks,
> 
> Prasanth_J
> 
>

Reply via email to