-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/71995/#review219355
-----------------------------------------------------------

Ship it!

Ship It!

- Krisztian Kasa


On Jan. 22, 2020, 12:09 p.m., Attila Magyar wrote:
>
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/71995/
> -----------------------------------------------------------
>
> (Updated Jan. 22, 2020, 12:09 p.m.)
>
>
> Review request for hive, Gopal V, Jesús Camacho Rodríguez, and Krisztian Kasa.
>
>
> Bugs: HIVE-22726
>     https://issues.apache.org/jira/browse/HIVE-22726
>
>
> Repository: hive-git
>
>
> Description
> -------
>
> The TopN key optimizer currently uses a priority queue to keep track of
> the largest/smallest rows. Its maximum size is the same as the
> user-specified limit. It should be replaced with a more cache-line-friendly
> array with a small (128) maximum size, to see how much performance is gained.
>
>
> Diffs
> -----
>
>   common/src/java/org/apache/hadoop/hive/conf/HiveConf.java b79515fcf07
>   ql/src/java/org/apache/hadoop/hive/ql/exec/TopNKeyFilter.java 4998766f064
>   ql/src/java/org/apache/hadoop/hive/ql/exec/TopNKeyOperator.java b7c12502204
>   ql/src/java/org/apache/hadoop/hive/ql/exec/vector/VectorTopNKeyOperator.java 5faa038c18d
>   ql/src/java/org/apache/hadoop/hive/ql/exec/vector/wrapper/VectorHashKeyWrapperBatch.java 0786c82b7be
>   ql/src/java/org/apache/hadoop/hive/ql/exec/vector/wrapper/VectorHashKeyWrapperGeneralComparator.java 8cb48473785
>   ql/src/java/org/apache/hadoop/hive/ql/optimizer/topnkey/TopNKeyProcessor.java ce6efa49192
>   ql/src/java/org/apache/hadoop/hive/ql/parse/TezCompiler.java ff815434f0c
>
>
> Diff: https://reviews.apache.org/r/71995/diff/2/
>
>
> Testing
> -------
>
> Tested with the following query:
>
> use tpcds_bin_partitioned_orc_100;
> set hive.optimize.topnkey=true;
> set hive.optimize.topnkey.max=5;
>
> select i_item_id,
>        s_state, grouping(s_state) g_state,
>        avg(ss_quantity) agg1,
>        avg(ss_list_price) agg2,
>        avg(ss_coupon_amt) agg3,
>        avg(ss_sales_price) agg4
> from store_sales, customer_demographics, date_dim, store, item
> where ss_sold_date_sk = d_date_sk and
>       ss_item_sk = i_item_sk and
>       ss_store_sk = s_store_sk and
>       ss_cdemo_sk = cd_demo_sk
> group by rollup (i_item_id, s_state)
> order by i_item_id
>          ,s_state
> limit 5;
>
>
> Results:
>
> enabled: 5 rows selected (715.26 seconds)
> enabled: 5 rows selected (605.888 seconds)
> disabled: 5 rows selected (1208.168 seconds)
> disabled: 5 rows selected (1219.482 seconds)
>
>
> Thanks,
>
> Attila Magyar
>
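
For readers following the thread, the idea in the description (tracking the best keys in a small sorted array instead of a priority queue) could look roughly like the sketch below. This is not the code from the patch: the class name `ArrayTopNFilter`, its constructor, and the assumption that the filter tracks distinct keys in comparator order are all hypothetical and serve only to illustrate the technique. The method name `canForward` is borrowed for readability and is not claimed to match the signature in `TopNKeyFilter.java`.

```java
import java.util.Arrays;
import java.util.Comparator;

/**
 * Hypothetical sketch of an array-backed top-N key filter.
 * Keeps at most topN distinct keys, sorted by the comparator, in one
 * contiguous array, so lookups and inserts touch a small memory region.
 */
final class ArrayTopNFilter<K> {
    private final K[] keys;                       // only [0, size) holds valid, sorted keys
    private final Comparator<? super K> comparator;
    private int size;

    @SuppressWarnings("unchecked")
    ArrayTopNFilter(int topN, Comparator<? super K> comparator) {
        if (topN <= 0) {
            throw new IllegalArgumentException("topN must be positive");
        }
        this.keys = (K[]) new Object[topN];       // never exposed outside this class
        this.comparator = comparator;
    }

    /** Returns true if a row with this key may still belong to the top N. */
    boolean canForward(K key) {
        int pos = Arrays.binarySearch(keys, 0, size, key, comparator);
        if (pos >= 0) {
            return true;                          // key is already tracked
        }
        int insertAt = -pos - 1;                  // where the key would be inserted
        if (insertAt >= keys.length) {
            return false;                         // array is full and key sorts after every tracked key
        }
        // Shift the tail right by one slot; if the array is full, the last key is dropped.
        int last = Math.min(size, keys.length - 1);
        System.arraycopy(keys, insertAt, keys, insertAt + 1, last - insertAt);
        keys[insertAt] = key;
        if (size < keys.length) {
            size++;
        }
        return true;
    }
}
```

An insert costs O(N) because of the shift, compared with O(log N) for a heap, but with a small cap on N (the description mentions 128) the shift stays within a few cache lines, which is presumably where the expected gain comes from. Whether this matches the measured speedup above is not something the sketch can show.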