[ https://issues.apache.org/jira/browse/HIVE-17012?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16203527#comment-16203527 ]
Rajesh Balamohan commented on HIVE-17012:
-----------------------------------------

{{SemanticAnalyzer.genFileSinkPlan --> genBucketingSortingDest --> genReduceSinkPlan}} sets the number of reducers to 2.
https://github.com/apache/hive/blob/master/ql/src/java/org/apache/hadoop/hive/ql/parse/SemanticAnalyzer.java#L6704
https://github.com/apache/hive/blob/master/ql/src/java/org/apache/hadoop/hive/ql/parse/SemanticAnalyzer.java#L6714

Looking at this code path, it does not look like this is specific to ACID.

> ACID Table: Number of reduce tasks should be computed correctly when sort.dynamic.partition is enabled
> ------------------------------------------------------------------------------------------------------
>
>                 Key: HIVE-17012
>                 URL: https://issues.apache.org/jira/browse/HIVE-17012
>             Project: Hive
>          Issue Type: Bug
>          Components: Transactions
>    Affects Versions: 3.0.0
>            Reporter: Rajesh Balamohan
>              Labels: performance
>         Attachments: plan.txt
>
>
> {code}
> Map 1: 446/446  Reducer 2: 2/2  Reducer 3: 2/2
> ----------------------------------------------------------------------------------------------
> Compile Query           0.24s
> Prepare Plan            0.35s
> Submit Plan             0.18s
> Start DAG               0.21s
> Run DAG             32332.27s
> ----------------------------------------------------------------------------------------------
> Task Execution Summary
> ----------------------------------------------------------------------------------------------
>   VERTICES   DURATION(ms)  CPU_TIME(ms)  GC_TIME(ms)  INPUT_RECORDS  OUTPUT_RECORDS
> ----------------------------------------------------------------------------------------------
>      Map 1     1390343.00             0            0  2,879,987,999   2,879,987,999
>  Reducer 2    31281225.00             0            0  2,750,387,156               0
>  Reducer 3      751498.00             0            0    129,600,843               0
> ----------------------------------------------------------------------------------------------
> {code}
> Time taken: 32438.42 seconds to insert <3B rows with
> {code}
> create table store_sales
> (
>     ss_sold_time_sk           bigint,
>     ss_item_sk                bigint,
>     ss_customer_sk            bigint,
>     ss_cdemo_sk               bigint,
>     ss_hdemo_sk               bigint,
>     ss_addr_sk                bigint,
>     ss_store_sk               bigint,
>     ss_promo_sk               bigint,
>     ss_ticket_number          bigint,
>     ss_quantity               int,
>     ss_wholesale_cost         double,
>     ss_list_price             double,
>     ss_sales_price            double,
>     ss_ext_discount_amt       double,
>     ss_ext_sales_price        double,
>     ss_ext_wholesale_cost     double,
>     ss_ext_list_price         double,
>     ss_ext_tax                double,
>     ss_coupon_amt             double,
>     ss_net_paid               double,
>     ss_net_paid_inc_tax       double,
>     ss_net_profit             double
> )
> partitioned by (ss_sold_date_sk bigint)
> CLUSTERED BY (ss_ticket_number) INTO 2 BUCKETS
> STORED AS ORC
> TBLPROPERTIES ('transactional'='true', 'transactional_properties'='default')
> ;
> from tpcds_text_1000.store_sales ss
> insert into table store_sales partition (ss_sold_date_sk)
> select
>     ss.ss_sold_time_sk,
>     ss.ss_item_sk,
>     ss.ss_customer_sk,
>     ss.ss_cdemo_sk,
>     ss.ss_hdemo_sk,
>     ss.ss_addr_sk,
>     ss.ss_store_sk,
>     ss.ss_promo_sk,
>     ss.ss_ticket_number,
>     ss.ss_quantity,
>     ss.ss_wholesale_cost,
>     ss.ss_list_price,
>     ss.ss_sales_price,
>     ss.ss_ext_discount_amt,
>     ss.ss_ext_sales_price,
>     ss.ss_ext_wholesale_cost,
>     ss.ss_ext_list_price,
>     ss.ss_ext_tax,
>     ss.ss_coupon_amt,
>     ss.ss_net_paid,
>     ss.ss_net_paid_inc_tax,
>     ss.ss_net_profit,
>     ss.ss_sold_date_sk
> where ss.ss_sold_date_sk is not null
> insert into table store_sales partition (ss_sold_date_sk)
> select
>     ss.ss_sold_time_sk,
>     ss.ss_item_sk,
>     ss.ss_customer_sk,
>     ss.ss_cdemo_sk,
>     ss.ss_hdemo_sk,
>     ss.ss_addr_sk,
>     ss.ss_store_sk,
>     ss.ss_promo_sk,
>     ss.ss_ticket_number,
>     ss.ss_quantity,
>     ss.ss_wholesale_cost,
>     ss.ss_list_price,
>     ss.ss_sales_price,
>     ss.ss_ext_discount_amt,
>     ss.ss_ext_sales_price,
>     ss.ss_ext_wholesale_cost,
>     ss.ss_ext_list_price,
>     ss.ss_ext_tax,
>     ss.ss_coupon_amt,
>     ss.ss_net_paid,
>     ss.ss_net_paid_inc_tax,
>     ss.ss_net_profit,
>     ss.ss_sold_date_sk
> where ss.ss_sold_date_sk is null
> ;
> {code}
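
For reference, a rough sketch of the arithmetic behind the Task Execution Summary quoted in the issue description. It only restates the reported counters (records, durations, and the 2/2 task counts per reducer vertex); the variable and function names are made up for illustration and are not part of Hive or Tez.

```python
# Counters copied from the Task Execution Summary in the issue description.
map_output_records = 2_879_987_999
reducer_input_records = {"Reducer 2": 2_750_387_156, "Reducer 3": 129_600_843}
reducer_duration_ms = {"Reducer 2": 31_281_225.0, "Reducer 3": 751_498.0}
# Task counts from the progress line "Reducer 2: 2/2 Reducer 3: 2/2".
reducer_tasks = {"Reducer 2": 2, "Reducer 3": 2}


def input_share(vertex):
    """Fraction of the map output that was routed to this reducer vertex."""
    return reducer_input_records[vertex] / map_output_records


def avg_records_per_task(vertex):
    """Plain average; the real per-task split depends on the bucketing key."""
    return reducer_input_records[vertex] / reducer_tasks[vertex]


for vertex in reducer_input_records:
    print(
        f"{vertex}: {reducer_tasks[vertex]} tasks, "
        f"{input_share(vertex):.1%} of map output, "
        f"{avg_records_per_task(vertex):,.0f} records per task on average, "
        f"{reducer_duration_ms[vertex] / 1000:,.0f} s reported duration"
    )
```

With only two tasks per reducer vertex, nearly all of the map output funnels through Reducer 2, which lines up with that vertex accounting for most of the reported run time.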