> On Sept. 6, 2017, 12:08 a.m., Sergey Shelukhin wrote: > > ql/src/java/org/apache/hadoop/hive/ql/exec/AbstractFileMergeOperator.java > > Line 231 (original), 256 (patched) > > <https://reviews.apache.org/r/62098/diff/1/?file=1815918#file1815918line256> > > > > hmm... is this necessary? > > 1) how does this interact with Hive duplicate file detection that can > > potentially happen later? > > 2) on the cloud, renaming files is very slow (one of the main reasons > > for MM tables). We should not rename unless it's really needed. > > Prasanth_J wrote: > Yes. This is necessary. If there are no incompatible files then this will > essentially be directory rename. If there are incompatible files and if the > filename does not match Hive's convention, then this has to go through > file-by-file renaming to staging directory. On the cloud (or on-prem), this > will only affect users who already have managed or unmanaged hive tables with > files externally loaded (via copy or load data). In which case, renaming has > to be done to avoid data loss (more info in the jira). Also this patch > disables concatenation on external/unmanaged tables so it won't affect users > with this patch. > > Hive's duplicate file detection, does not delete files if it has _copy_ > suffix (which this patch introduces on conflicts).
MM tables don't have staging directories or any renames. As for the non-MM case, isn't it possible to just leave these files alone and not merge them? - Sergey ----------------------------------------------------------- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/62098/#review184607 ----------------------------------------------------------- On Sept. 5, 2017, 9:44 p.m., Prasanth_J wrote: > > ----------------------------------------------------------- > This is an automatically generated e-mail. To reply, visit: > https://reviews.apache.org/r/62098/ > ----------------------------------------------------------- > > (Updated Sept. 5, 2017, 9:44 p.m.) > > > Review request for hive and Sergey Shelukhin. > > > Bugs: HIVE-17403 > https://issues.apache.org/jira/browse/HIVE-17403 > > > Repository: hive-git > > > Description > ------- > > HIVE-17403: Fail concatenation for unmanaged and transactional tables > > > Diffs > ----- > > ql/src/java/org/apache/hadoop/hive/ql/ErrorMsg.java > b3ef9169c25c36c3d6c845f5000874fa78e51f82 > ql/src/java/org/apache/hadoop/hive/ql/exec/AbstractFileMergeOperator.java > dfad6c192947c9ac80a1bbd86665f46aab128453 > ql/src/java/org/apache/hadoop/hive/ql/exec/Utilities.java > aca99f2d833822a44f373ded4257af3589707baa > ql/src/java/org/apache/hadoop/hive/ql/io/AcidUtils.java > feacdd8b605eb75166155fa2e7a1692ad4d52bd0 > ql/src/java/org/apache/hadoop/hive/ql/parse/DDLSemanticAnalyzer.java > 230ca47e4a667b297cf2a6fef90dd18cf3d1a1c3 > ql/src/test/queries/clientnegative/merge_negative_4.q PRE-CREATION > ql/src/test/queries/clientnegative/merge_negative_5.q PRE-CREATION > ql/src/test/queries/clientpositive/orc_merge13.q PRE-CREATION > ql/src/test/results/clientnegative/merge_negative_3.q.out > 906336d4d3ea77ca3174a58fad05668d569f7492 > ql/src/test/results/clientnegative/merge_negative_4.q.out PRE-CREATION > ql/src/test/results/clientnegative/merge_negative_5.q.out PRE-CREATION > ql/src/test/results/clientpositive/orc_merge13.q.out PRE-CREATION > > > Diff: https://reviews.apache.org/r/62098/diff/1/ > > > Testing > ------- > > > Thanks, > > Prasanth_J > >