[ https://issues.apache.org/jira/browse/HIVE-22735?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17017945#comment-17017945 ]
Hive QA commented on HIVE-22735:
--------------------------------

Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12991140/HIVE-22735.1.patch

{color:red}ERROR:{color} -1 due to no test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 11 failed/errored test(s), 17877 tests executed
*Failed tests:*
{noformat}
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[vector_groupby_grouping_sets_limit] (batchId=175)
org.apache.hadoop.hive.cli.TestTezPerfCliDriver.testCliDriver[query14] (batchId=303)
org.apache.hadoop.hive.cli.TestTezPerfCliDriver.testCliDriver[query27] (batchId=303)
org.apache.hadoop.hive.cli.TestTezPerfCliDriver.testCliDriver[query5] (batchId=303)
org.apache.hadoop.hive.cli.TestTezPerfCliDriver.testCliDriver[query77] (batchId=303)
org.apache.hadoop.hive.cli.TestTezPerfCliDriver.testCliDriver[query80] (batchId=303)
org.apache.hadoop.hive.cli.TestTezPerfConstraintsCliDriver.testCliDriver[query14] (batchId=303)
org.apache.hadoop.hive.cli.TestTezPerfConstraintsCliDriver.testCliDriver[query27] (batchId=303)
org.apache.hadoop.hive.cli.TestTezPerfConstraintsCliDriver.testCliDriver[query5] (batchId=303)
org.apache.hadoop.hive.cli.TestTezPerfConstraintsCliDriver.testCliDriver[query77] (batchId=303)
org.apache.hadoop.hive.cli.TestTezPerfConstraintsCliDriver.testCliDriver[query80] (batchId=303)
{noformat}

Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/20224/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/20224/console
Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-20224/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.YetusPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 11 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12991140 - PreCommit-HIVE-Build

> TopNKey operator deduplication
> ------------------------------
>
> Key: HIVE-22735
> URL: https://issues.apache.org/jira/browse/HIVE-22735
> Project: Hive
> Issue Type: Improvement
> Components: Physical Optimizer
> Reporter: Krisztian Kasa
> Assignee: Krisztian Kasa
> Priority: Major
> Fix For: 4.0.0
>
> Attachments: HIVE-22735.1.patch
>
>
> In some cases, more than one TNK operator in the same operator tree has the
> same expressions, or the operators differ only by a constant column. In most
> of these cases only one TNK operator should remain.
> {code}
> +----------------------------------------------------+
> | Explain |
> +----------------------------------------------------+
> | Plan not optimized by CBO. |
> | |
> | Vertex dependency in root stage |
> | Map 1 <- Reducer 8 (BROADCAST_EDGE) |
> | Reducer 2 <- Map 1 (SIMPLE_EDGE), Map 5 (SIMPLE_EDGE), Map 6 (BROADCAST_EDGE), Map 7 (BROADCAST_EDGE), Map 9 (BROADCAST_EDGE) |
> | Reducer 3 <- Reducer 2 (SIMPLE_EDGE) |
> | Reducer 4 <- Reducer 3 (SIMPLE_EDGE) |
> | Reducer 8 <- Map 7 (CUSTOM_SIMPLE_EDGE) |
> | |
> | Stage-0 |
> | Fetch Operator |
> | limit:50 |
> | Stage-1 |
> | Reducer 4 vectorized |
> | File Output Operator [FS_127] |
> | Limit [LIM_126] (rows=50 width=538) |
> | Number of rows:50 |
> | Select Operator [SEL_125] (rows=190 width=538) |
> | Output:["_col0","_col1","_col2","_col3","_col4","_col5","_col6"] |
> | <-Reducer 3 [SIMPLE_EDGE] |
> | SHUFFLE [RS_30] |
> | Select Operator [SEL_29] (rows=190 width=538) |
> | Output:["_col0","_col1","_col2","_col3","_col4","_col5","_col6"] |
> | Group By Operator [GBY_28] (rows=190 width=538) |
> | Output:["_col0","_col1","_col2","_col3","_col4","_col5","_col6"],aggregations:["avg(VALUE._col0)","avg(VALUE._col1)","avg(VALUE._col2)","avg(VALUE._col3)"],keys:KEY._col0, KEY._col1, KEY._col2 |
> | <-Reducer 2 [SIMPLE_EDGE] |
> | SHUFFLE [RS_27] |
> | PartitionCols:_col0, _col1, _col2 |
> | Group By Operator [GBY_26] (rows=190 width=1134) |
> | Output:["_col0","_col1","_col2","_col3","_col4","_col5","_col6"],aggregations:["avg(_col9)","avg(_col11)","avg(_col18)","avg(_col12)"],keys:_col102, _col93, 0L |
> | Top N Key Operator [TNK_60] (rows=127 width=234) |
> | keys:_col102, _col93, 0L,top n:50 |
> | Select Operator [SEL_25] (rows=127 width=234) |
> | Output:["_col9","_col11","_col12","_col18","_col93","_col102"] |
> | Top N Key Operator [TNK_58] (rows=127 width=234) |
> | keys:_col102, _col93,top n:50 |
> | Filter Operator [FIL_49] (rows=127 width=234) |
> | predicate:((_col22 = _col38) and (_col1 = _col101) and (_col6 = _col69) and (_col3 = _col26)) |
> | Map Join Operator [MAPJOIN_102] (rows=2044 width=232) |
> | Conds:MAPJOIN_101._col1=RS_123.i_item_sk(Inner),Output:["_col1","_col3","_col6","_col9","_col11","_col12","_col18","_col22","_col26","_col38","_col69","_col93","_col101","_col102"] |
> | <-Map 9 [BROADCAST_EDGE] vectorized |
> | BROADCAST [RS_123] |
> | PartitionCols:i_item_sk |
> | Filter Operator [FIL_122] (rows=204000 width=108) |
> | predicate:i_item_sk is not null |
> | TableScan [TS_4] (rows=204000 width=108) |
> | tpcds_bin_partitioned_orc_100@item,item, ACID table,Tbl:COMPLETE,Col:COMPLETE,Output:["i_item_sk","i_item_id"] |
> | <-Map Join Operator [MAPJOIN_101] (rows=2010 width=118) |
> | Conds:MAPJOIN_100._col6=RS_107.s_store_sk(Inner),Output:["_col1","_col3","_col6","_col9","_col11","_col12","_col18","_col22","_col26","_col38","_col69","_col93"] |
> | <-Map 7 [BROADCAST_EDGE] vectorized |
> | PARTITION_ONLY_SHUFFLE [RS_107] |
> | PartitionCols:s_store_sk |
> | Filter Operator [FIL_106] (rows=402 width=94) |
> | predicate:s_store_sk is not null |
> | TableScan [TS_3] (rows=402 width=94) |
> | tpcds_bin_partitioned_orc_100@store,store, ACID table,Tbl:COMPLETE,Col:COMPLETE,Output:["s_store_sk","s_state"] |
> | <-Map Join Operator [MAPJOIN_100] (rows=9604000 width=24) |
> | Conds:MERGEJOIN_99._col22=RS_118.d_date_sk(Inner),Output:["_col1","_col3","_col6","_col9","_col11","_col12","_col18","_col22","_col26","_col38"] |
> | <-Map 6 [BROADCAST_EDGE] vectorized |
> | BROADCAST [RS_118] |
> | PartitionCols:d_date_sk |
> | Filter Operator [FIL_117] (rows=73049 width=8) |
> | predicate:d_date_sk is not null |
> | TableScan [TS_2] (rows=73049 width=8) |
> | tpcds_bin_partitioned_orc_100@date_dim,date_dim, ACID table,Tbl:COMPLETE,Col:COMPLETE,Output:["d_date_sk"] |
> | Dynamic Partitioning Event Operator [EVENT_121] (rows=1 width=8) |
> | Group By Operator [GBY_120] (rows=1 width=8) |
> | Output:["_col0"],keys:_col0 |
> | Select Operator [SEL_119] (rows=73049 width=8) |
> | Output:["_col0"] |
> | Please refer to the previous Filter Operator [FIL_117] |
> | <-Merge Join Operator [MERGEJOIN_99] (rows=9604000 width=16) |
> | Conds:RS_114.ss_cdemo_sk=RS_116.cd_demo_sk(Inner),Output:["_col1","_col3","_col6","_col9","_col11","_col12","_col18","_col22","_col26"] |
> | <-Map 1 [SIMPLE_EDGE] vectorized |
> | SHUFFLE [RS_114] |
> | PartitionCols:ss_cdemo_sk |
> | Filter Operator [FIL_113] (rows=235814137 width=353) |
> | predicate:(ss_cdemo_sk is not null and ss_store_sk is not null and ss_item_sk is not null and ss_store_sk BETWEEN DynamicValue(RS_17_store_s_store_sk_min) AND DynamicValue(RS_17_store_s_store_sk_max) and in_bloom_filter(ss_store_sk, DynamicValue(RS_17_store_s_store_sk_bloom_filter))) |
> | TableScan [TS_0] (rows=275041999 width=723) |
> | tpcds_bin_partitioned_orc_100@store_sales,store_sales, ACID table,Tbl:COMPLETE,Col:PARTIAL,Output:["ss_item_sk","ss_cdemo_sk","ss_store_sk","ss_quantity","ss_list_price","ss_sales_price","ss_coupon_amt"] |
> | <-Reducer 8 [BROADCAST_EDGE] vectorized |
> | BROADCAST [RS_112] |
> | Group By Operator [GBY_111] (rows=1 width=24) |
> | Output:["_col0","_col1","_col2"],aggregations:["min(VALUE._col0)","max(VALUE._col1)","bloom_filter(VALUE._col2, expectedEntries=1000000)"] |
> | <-Map 5 [SIMPLE_EDGE] vectorized |
> | SHUFFLE [RS_116] |
> | PartitionCols:cd_demo_sk |
> | Filter Operator [FIL_115] (rows=1920800 width=8) |
> | predicate:cd_demo_sk is not null |
> | TableScan [TS_1] (rows=1920800 width=8) |
> | tpcds_bin_partitioned_orc_100@customer_demographics,customer_demographics, ACID table,Tbl:COMPLETE,Col:COMPLETE,Output:["cd_demo_sk"] |
> | |
> +----------------------------------------------------+
> {code}

--
This message was sent by Atlassian Jira
(v8.3.4#803005)