> On Sept. 6, 2017, 12:08 a.m., Sergey Shelukhin wrote: > > ql/src/java/org/apache/hadoop/hive/ql/exec/AbstractFileMergeOperator.java > > Line 231 (original), 256 (patched) > > <https://reviews.apache.org/r/62098/diff/1/?file=1815918#file1815918line256> > > > > hmm... is this necessary? > > 1) how does this interact with Hive duplicate file detection that can > > potentially happen later? > > 2) on the cloud, renaming files is very slow (one of the main reasons > > for MM tables). We should not rename unless it's really needed. > > Prasanth_J wrote: > Yes. This is necessary. If there are no incompatible files then this will > essentially be directory rename. If there are incompatible files and if the > filename does not match Hive's convention, then this has to go through > file-by-file renaming to staging directory. On the cloud (or on-prem), this > will only affect users who already have managed or unmanaged hive tables with > files externally loaded (via copy or load data). In which case, renaming has > to be done to avoid data loss (more info in the jira). Also this patch > disables concatenation on external/unmanaged tables so it won't affect users > with this patch. > > Hive's duplicate file detection, does not delete files if it has _copy_ > suffix (which this patch introduces on conflicts). > > Sergey Shelukhin wrote: > MM tables don't have staging directories or any renames. > As for the non-MM case, isn't it possible to just leave these files alone > and not merge them?
I don't know how exactly MM tables are implemented. As long the tests pass I think it will work :) How does orc_merge* tests work in MM tables anyway? For non-MM case, the problem is concatenation reads from and writes to the same directory (src and dest are same). This uses the same codepath as with every other code (initially from RC file concatenate) in which we read from src -> write merged file to staging tmp dir -> move task will rename staging tmp dir to dest (which is same as src). So leaving the files alone is a problem as it will be lost during final rename. So incompat files also gets moved to staging tmp dir so that nothing gets dropped by the move task. - Prasanth_J ----------------------------------------------------------- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/62098/#review184607 ----------------------------------------------------------- On Sept. 5, 2017, 9:44 p.m., Prasanth_J wrote: > > ----------------------------------------------------------- > This is an automatically generated e-mail. To reply, visit: > https://reviews.apache.org/r/62098/ > ----------------------------------------------------------- > > (Updated Sept. 5, 2017, 9:44 p.m.) > > > Review request for hive and Sergey Shelukhin. > > > Bugs: HIVE-17403 > https://issues.apache.org/jira/browse/HIVE-17403 > > > Repository: hive-git > > > Description > ------- > > HIVE-17403: Fail concatenation for unmanaged and transactional tables > > > Diffs > ----- > > ql/src/java/org/apache/hadoop/hive/ql/ErrorMsg.java > b3ef9169c25c36c3d6c845f5000874fa78e51f82 > ql/src/java/org/apache/hadoop/hive/ql/exec/AbstractFileMergeOperator.java > dfad6c192947c9ac80a1bbd86665f46aab128453 > ql/src/java/org/apache/hadoop/hive/ql/exec/Utilities.java > aca99f2d833822a44f373ded4257af3589707baa > ql/src/java/org/apache/hadoop/hive/ql/io/AcidUtils.java > feacdd8b605eb75166155fa2e7a1692ad4d52bd0 > ql/src/java/org/apache/hadoop/hive/ql/parse/DDLSemanticAnalyzer.java > 230ca47e4a667b297cf2a6fef90dd18cf3d1a1c3 > ql/src/test/queries/clientnegative/merge_negative_4.q PRE-CREATION > ql/src/test/queries/clientnegative/merge_negative_5.q PRE-CREATION > ql/src/test/queries/clientpositive/orc_merge13.q PRE-CREATION > ql/src/test/results/clientnegative/merge_negative_3.q.out > 906336d4d3ea77ca3174a58fad05668d569f7492 > ql/src/test/results/clientnegative/merge_negative_4.q.out PRE-CREATION > ql/src/test/results/clientnegative/merge_negative_5.q.out PRE-CREATION > ql/src/test/results/clientpositive/orc_merge13.q.out PRE-CREATION > > > Diff: https://reviews.apache.org/r/62098/diff/1/ > > > Testing > ------- > > > Thanks, > > Prasanth_J > >