[ https://issues.apache.org/jira/browse/HIVE-24409?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17237154#comment-17237154 ]
Rajesh Balamohan commented on HIVE-24409:
-----------------------------------------

{noformat}
e.g. query at 10 TB scale:

insert overwrite table store_Sales_delete_1 partition(ss_sold_date_sk) select * from tpcds_bin_partitioned_orc_10000.store_sales where ss_sold_date_sk>2452400;

With LazyBinarySerDe2 in PlanUtils::getReduceValueTableDesc:

INFO : Task Execution Summary
INFO : ----------------------------------------------------------------------------------------------
INFO :   VERTICES      DURATION(ms)   CPU_TIME(ms)    GC_TIME(ms)   INPUT_RECORDS   OUTPUT_RECORDS
INFO : ----------------------------------------------------------------------------------------------
INFO :      Map 1         539486.00              0              0   4,428,107,724    4,428,109,001
INFO :  Reducer 2         202397.00              0              0   4,428,109,001                0
INFO :  Reducer 3         737788.00              0              0   4,428,109,001                0
INFO : ----------------------------------------------------------------------------------------------

Without patch:

INFO : Task Execution Summary
INFO : ----------------------------------------------------------------------------------------------
INFO :   VERTICES      DURATION(ms)   CPU_TIME(ms)    GC_TIME(ms)   INPUT_RECORDS   OUTPUT_RECORDS
INFO : ----------------------------------------------------------------------------------------------
INFO :      Map 1        1080431.00              0              0   4,428,107,724    4,428,109,001
INFO :  Reducer 2         205654.00              0              0   4,428,109,001                0
INFO :  Reducer 3         806846.00              0              0   4,428,109,001                0
INFO : ----------------------------------------------------------------------------------------------

Shows a good improvement in Map 1: its duration roughly halves (1,080,431 ms without the patch vs. 539,486 ms with it).
{noformat}

> Use LazyBinarySerDe2 in PlanUtils::getReduceValueTableDesc
> ----------------------------------------------------------
>
>                 Key: HIVE-24409
>                 URL: https://issues.apache.org/jira/browse/HIVE-24409
>             Project: Hive
>          Issue Type: Improvement
>            Reporter: Rajesh Balamohan
>            Priority: Major
>         Attachments: Screenshot 2020-11-23 at 10.52.49 AM.png
>
>
> !Screenshot 2020-11-23 at 10.52.49 AM.png|width=858,height=493!
> Lines of interest:
> [https://github.com/apache/hive/blob/master/ql/src/java/org/apache/hadoop/hive/ql/exec/ReduceSinkOperator.java#L535]
> (non-vectorized path due to stats)
>
> [https://github.com/apache/hive/blob/master/ql/src/java/org/apache/hadoop/hive/ql/plan/PlanUtils.java#L581]
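
For anyone who wants to quantify the per-vertex change in the two Task Execution Summary tables in the comment, here is a minimal sketch in Java. The class name VertexTimingDelta and the helper percentReduction are made up for illustration and are not part of Hive; the DURATION(ms) values are copied from the tables.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper, not part of Hive: compares per-vertex DURATION(ms)
// values copied from the two Task Execution Summary tables in the comment.
public class VertexTimingDelta {

    /** Returns how much lower afterMs is than beforeMs, as a percentage of beforeMs. */
    static double percentReduction(double beforeMs, double afterMs) {
        return (beforeMs - afterMs) / beforeMs * 100.0;
    }

    public static void main(String[] args) {
        // vertex -> {duration without patch, duration with LazyBinarySerDe2}, in ms
        Map<String, double[]> durations = new LinkedHashMap<>();
        durations.put("Map 1", new double[] {1080431.00, 539486.00});
        durations.put("Reducer 2", new double[] {205654.00, 202397.00});
        durations.put("Reducer 3", new double[] {806846.00, 737788.00});

        for (Map.Entry<String, double[]> entry : durations.entrySet()) {
            double before = entry.getValue()[0];
            double after = entry.getValue()[1];
            System.out.printf("%-10s %12.2f ms -> %12.2f ms (%.1f%% lower)%n",
                    entry.getKey(), before, after, percentReduction(before, after));
        }
    }
}
```

On these numbers Map 1 comes out roughly 50% lower, Reducer 3 roughly 9% lower, and Reducer 2 under 2% lower. These are single-run durations, so the small Reducer 2 difference may well be noise.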