[ https://issues.apache.org/jira/browse/HIVE-22707?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17011303#comment-17011303 ]
Hive QA commented on HIVE-22707:
--------------------------------

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s{color} | {color:green} The patch does not contain any @author tags. {color} |
|| || || || {color:brown} master Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 10m 16s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 1m 17s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 48s{color} | {color:green} master passed {color} |
| {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue} 4m 50s{color} | {color:blue} ql in master has 1531 extant Findbugs warnings. {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 12s{color} | {color:green} master passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 1m 44s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 1m 16s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green} 1m 16s{color} | {color:green} the patch passed {color} |
| {color:red}-1{color} | {color:red} checkstyle {color} | {color:red} 0m 45s{color} | {color:red} ql: The patch generated 1 new + 39 unchanged - 5 fixed = 40 total (was 44) {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 4m 52s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 6s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 14s{color} | {color:green} The patch does not generate ASF License warnings. {color} |
| {color:black}{color} | {color:black} {color} | {color:black} 28m 56s{color} | {color:black} {color} |
\\
\\
|| Subsystem || Report/Notes ||
| Optional Tests | asflicense javac javadoc findbugs checkstyle compile |
| uname | Linux hiveptest-server-upstream 3.16.0-4-amd64 #1 SMP Debian 3.16.43-2+deb8u5 (2017-09-19) x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /data/hiveptest/working/yetus_PreCommit-HIVE-Build-20122/dev-support/hive-personality.sh |
| git revision | master / 706c1d4 |
| Default Java | 1.8.0_111 |
| findbugs | v3.0.1 |
| checkstyle | http://104.198.109.242/logs//PreCommit-HIVE-Build-20122/yetus/diff-checkstyle-ql.txt |
| modules | C: ql U: ql |
| Console output | http://104.198.109.242/logs//PreCommit-HIVE-Build-20122/yetus.txt |
| Powered by | Apache Yetus http://yetus.apache.org |

This message was automatically generated.

> MergeJoinWork should be considered while collecting DAG credentials
> -------------------------------------------------------------------
>
> Key: HIVE-22707
> URL: https://issues.apache.org/jira/browse/HIVE-22707
> Project: Hive
> Issue Type: Bug
> Reporter: László Bodor
> Assignee: László Bodor
> Priority: Major
> Attachments: HIVE-22707.01.patch, HIVE-22707.02.patch
>
>
> Consider a scenario with 2 different buckets, where the output is written to
> a bucket other than the source. Under specific circumstances,
> FileSinkOperator is only used in Reducer stages, and if a root work in that
> stage is a merge join work, it's not scanned for output URIs/paths, so the
> delegation tokens needed for e.g. the output S3 bucket are not fetched.
> https://github.com/apache/hive/blob/0df4f6c61010b64246d4790f9ce14e966ef34dcb/ql/src/java/org/apache/hadoop/hive/ql/exec/tez/DagUtils.java#L1507-L1514
> {code}
>   public void addCredentials(BaseWork work, DAG dag) throws IOException {
>     dag.getCredentials().mergeAll(UserGroupInformation.getCurrentUser().getCredentials());
>     if (work instanceof MapWork) {
>       addCredentials((MapWork) work, dag);
>     } else if (work instanceof ReduceWork) {
>       addCredentials((ReduceWork) work, dag);
>     }
>   }
> {code}
> Sample plan; note Merge Join Operator [MERGEJOIN_35]:
> {code}
> +----------------------------------------------------+
> | Explain |
> +----------------------------------------------------+
> | Plan optimized by CBO. |
> | |
> | Vertex dependency in root stage |
> | Reducer 2 <- Map 1 (SIMPLE_EDGE), Map 4 (SIMPLE_EDGE) |
> | Reducer 3 <- Reducer 2 (CUSTOM_SIMPLE_EDGE) |
> | |
> | Stage-3 |
> | Stats Work{} |
> | Stage-4 |
> | Create Table{"name:":"tpcds_bin_partitioned_orc_1000.catalog_sales_out"} |
> | Stage-0 |
> | Move Operator |
> | Stage-1 |
> | Reducer 3 |
> | File Output Operator [FS_20] |
> | Group By Operator [GBY_18] (rows=1 width=440) |
> | Output:["_col0"],aggregations:["compute_stats(VALUE._col0)"] |
> | <-Reducer 2 [CUSTOM_SIMPLE_EDGE] |
> | File Output Operator [FS_10] |
> | table:{"name:":"tpcds_bin_partitioned_orc_1000.catalog_sales_out"} |
> | Select Operator [SEL_9] (rows=8400 width=7) |
> | Output:["_col0"] |
> | Merge Join Operator [MERGEJOIN_35] (rows=8400 width=7) |
> | Conds:RS_38._col1=RS_41._col0(Inner),Output:["_col1"] |
> | <-Map 1 [SIMPLE_EDGE] vectorized |
> | SHUFFLE [RS_38] |
> | PartitionCols:_col1 |
> | Select Operator [SEL_37] (rows=16799 width=15) |
> | Output:["_col1"] |
> | Filter Operator [FIL_36] (rows=16799 width=15) |
> | predicate:((cs_sold_time_sk = 74858L) and cs_call_center_sk is not null) |
> | TableScan [TS_0] (rows=1439980416 width=15) |
> | tpcds_bin_partitioned_orc_1000@catalog_sales,cs, ACID table,Tbl:COMPLETE,Col:PARTIAL,Output:["cs_sold_time_sk","cs_call_center_sk"] |
> | <-Map 4 [SIMPLE_EDGE] vectorized |
> | SHUFFLE [RS_41] |
> | PartitionCols:_col0 |
> | Select Operator [SEL_40] (rows=21 width=107) |
> | Output:["_col0"] |
> | Filter Operator [FIL_39] (rows=21 width=107) |
> | predicate:((CAST( cc_county AS STRING) = 'Williamson County') and cc_call_center_sk is not null) |
> | TableScan [TS_3] (rows=42 width=107) |
> | tpcds_bin_partitioned_orc_1000@call_center,cc, ACID table,Tbl:COMPLETE,Col:COMPLETE,Output:["cc_call_center_sk","cc_county"] |
> | PARTITION_ONLY_SHUFFLE [RS_17] |
> | Group By Operator [GBY_16] (rows=1 width=424) |
> | Output:["_col0"],aggregations:["compute_stats(col1, 'hll')"] |
> | Select Operator [SEL_15] (rows=8400 width=7) |
> | Output:["col1"] |
> | Please refer to the previous Select Operator [SEL_9] |
> | Stage-2 |
> | Dependency Collection{} |
> | Please refer to the previous Stage-1 |
> | |
> +----------------------------------------------------+
> {code}

--
This message was sent by Atlassian Jira
(v8.3.4#803005)