[ https://issues.apache.org/jira/browse/HIVE-9368?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Mostafa Mokhtar resolved HIVE-9368. ----------------------------------- Resolution: Done Assignee: Mostafa Mokhtar (was: Vikram Dixit K) > Physical optimizer : Join order in Explain is different from join order > provided by Calcite > ------------------------------------------------------------------------------------------- > > Key: HIVE-9368 > URL: https://issues.apache.org/jira/browse/HIVE-9368 > Project: Hive > Issue Type: Bug > Components: Physical Optimizer > Affects Versions: 0.14.0 > Reporter: Mostafa Mokhtar > Assignee: Mostafa Mokhtar > Fix For: 1.2.0 > > Attachments: explain_fetch_column_stats_off.txt, > explain_fetch_column_stats_on.txt > > > Join order in explain is different from that provided by Calcite, this was > observed during the POC. > Logical plan from Calcite : > {code} > 2015-01-13 18:54:42,892 DEBUG [main]: parse.CalcitePlanner > (CalcitePlanner.java:apply(743)) - Plan After Join Reordering: > HiveProject(scale=[$0], time_key_num=[$1], dataset_code=[$2], > cost_center_lvl1_id=[$3], cost_pool_lvl6_id=[$4], lvl5_id=[$5], > view_lvl1_id=[$6], from_lvl1_id=[$7], plan_id=[$8], client_id=[$9], > lob_id=[$10], product_id=[$11], fprs_lvl5_id=[$12], ssn_id=[$13], > account_id=[$14], mtd_balance=[$15]): rowcount = 2.53152774E8, cumulative > cost = {3.057177094767754E9 rows, 0.0 cpu, 0.0 io}, id = 636 > HiveAggregate(group=[{0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14}], > agg#0=[SUM($15)]): rowcount = 2.53152774E8, cumulative cost = > {3.057177094767754E9 rows, 0.0 cpu, 0.0 io}, id = 634 > HiveProject($f0=[$0], $f1=[$1], $f2=[$2], $f3=[$3], $f4=[$4], $f5=[$24], > $f6=[$6], $f7=[$7], $f8=[$8], $f9=[$9], $f10=[$10], $f11=[$11], $f12=[$21], > $f13=[$18], $f14=[$19], $f15=[*($13, $20)]): rowcount = 3.401053197411791E11, > cumulative cost = {3.057177094767754E9 rows, 0.0 cpu, 0.0 io}, id = 632 > HiveProject(scale=[$7], time_key_num=[$8], dataset_code=[$9], > cost_center_lvl1_id=[$10], cost_pool_lvl6_id=[$11], activity_id=[$12], > view_lvl1_id=[$13], from_lvl1_id=[$14], plan_id=[$15], client_id=[$16], > lob_id=[$17], product_id=[$18], fprs_id=[$19], mtd_balance=[$20], > time_key_num0=[$0], activity_id0=[$1], plan_id0=[$2], fprs_id0=[$3], > ssn_id=[$4], account_id=[$5], driver_pct=[$6], lvl5_id=[$25], > current_ind=[$26], fprs_id1=[$27], lvl5_id0=[$21], rollup_key=[$22], > current_ind0=[$23], activity_id1=[$24]): rowcount = 3.401053197411791E11, > cumulative cost = {3.057177094767754E9 rows, 0.0 cpu, 0.0 io}, id = 692 > HiveJoin(condition=[AND(AND(AND(=($8, $0), =($15, $2)), =($19, $3)), > =($12, $1))], joinType=[inner]): rowcount = 3.401053197411791E11, cumulative > cost = {3.057177094767754E9 rows, 0.0 cpu, 0.0 io}, id = 690 > HiveProject(time_key_num=[$0], activity_id=[$1], plan_id=[$2], > fprs_id=[$3], ssn_id=[$4], account_id=[$5], driver_pct=[$6]): rowcount = > 2.926396239E9, cumulative cost = {0.0 rows, 0.0 cpu, 0.0 io}, id = 590 > > HiveTableScan(table=[[fidelity.fcap_drivers_part_exp_inter_bucket_256]]): > rowcount = 2.926396239E9, cumulative cost = {0}, id = 465 > HiveJoin(condition=[=($12, $20)], joinType=[inner]): rowcount = > 1.0871372980143067E8, cumulative cost = {2.2067125966323376E7 rows, 0.0 cpu, > 0.0 io}, id = 688 > HiveJoin(condition=[=($5, $17)], joinType=[inner]): rowcount = > 1.4392118216323378E7, cumulative cost = {6880237.75 rows, 0.0 cpu, 0.0 io}, > id = 653 > HiveProject(scale=[$0], time_key_num=[$1], dataset_code=[$2], > cost_center_lvl1_id=[$3], cost_pool_lvl6_id=[$4], activity_id=[$5], > view_lvl1_id=[$6], from_lvl1_id=[$7], plan_id=[$8], client_id=[$9], > lob_id=[$10], product_id=[$11], fprs_id=[$12], mtd_balance=[$14]): rowcount = > 6870067.0, cumulative cost = {0.0 rows, 0.0 cpu, 0.0 io}, id = 587 > > HiveTableScan(table=[[fidelity.fcap_agg_prod_exp_nofund_decimal]]): rowcount > = 6870067.0, cumulative cost = {0}, id = 464 > HiveProject(lvl5_id=[$36], rollup_key=[$48], current_ind=[$51], > activity_id=[$60]): rowcount = 10170.75, cumulative cost = {0.0 rows, 0.0 > cpu, 0.0 io}, id = 628 > HiveFilter(condition=[AND(=($51, 'Y'), =($48, 'TOTACT'))]): > rowcount = 10170.75, cumulative cost = {0.0 rows, 0.0 cpu, 0.0 io}, id = 626 > HiveTableScan(table=[[fidelity.fobi_activity_dim_mv]]): > rowcount = 40683.0, cumulative cost = {0}, id = 467 > HiveProject(lvl5_id=[$36], current_ind=[$51], fprs_id=[$58]): > rowcount = 794770.0, cumulative cost = {0.0 rows, 0.0 cpu, 0.0 io}, id = 622 > HiveFilter(condition=[=($51, 'Y')]): rowcount = 794770.0, > cumulative cost = {0.0 rows, 0.0 cpu, 0.0 io}, id = 620 > HiveTableScan(table=[[fidelity.fobi_fprs_dim_mv_orc]]): > rowcount = 1589540.0, cumulative cost = {0}, id = 466 > {code} > Plan #1 with Fetch column stats on > {code} > OK > STAGE DEPENDENCIES: > Stage-1 is a root stage > Stage-2 depends on stages: Stage-1 > Stage-0 depends on stages: Stage-2 > Stage-3 depends on stages: Stage-0 > STAGE PLANS: > Stage: Stage-1 > Tez > Edges: > Map 4 <- Map 5 (BROADCAST_EDGE), Map 6 (BROADCAST_EDGE) > Reducer 2 <- Map 1 (SIMPLE_EDGE), Map 4 (SIMPLE_EDGE) > Reducer 3 <- Reducer 2 (SIMPLE_EDGE) > DagName: mmokhtar_20150113185454_d7ce6ecf-2d50-45ed-8a88-6283bb091b0e:3 > Vertices: > Map 1 > Map Operator Tree: > TableScan > alias: driver > filterExpr: (((time_key_num is not null and plan_id is not > null) and fprs_id is not null) and activity_id is not null) (type: boolean) > Statistics: Num rows: 2926396239 Data size: 468223398240 > Basic stats: COMPLETE Column stats: COMPLETE > Filter Operator > predicate: (((time_key_num is not null and plan_id is not > null) and fprs_id is not null) and activity_id is not null) (type: boolean) > Statistics: Num rows: 2926396239 Data size: 468223398240 > Basic stats: COMPLETE Column stats: COMPLETE > Select Operator > expressions: time_key_num (type: bigint), activity_id > (type: bigint), plan_id (type: bigint), fprs_id (type: bigint), ssn_id (type: > bigint), account_id (type: bigint), driver_pct (type: decimal(28,12)) > outputColumnNames: _col0, _col1, _col2, _col3, _col4, > _col5, _col6 > Statistics: Num rows: 2926396239 Data size: > 468223398240 Basic stats: COMPLETE Column stats: COMPLETE > Reduce Output Operator > key expressions: _col0 (type: bigint), _col2 (type: > bigint), _col3 (type: bigint), _col1 (type: bigint) > sort order: ++++ > Map-reduce partition columns: _col0 (type: bigint), > _col2 (type: bigint), _col3 (type: bigint), _col1 (type: bigint) > Statistics: Num rows: 2926396239 Data size: > 468223398240 Basic stats: COMPLETE Column stats: COMPLETE > value expressions: _col4 (type: bigint), _col5 (type: > bigint), _col6 (type: decimal(28,12)) > Execution mode: vectorized > Map 4 > Map Operator Tree: > TableScan > alias: balance > filterExpr: (((activity_id is not null and fprs_id is not > null) and time_key_num is not null) and plan_id is not null) (type: boolean) > Statistics: Num rows: 6870067 Data size: 2102240502 Basic > stats: COMPLETE Column stats: COMPLETE > Filter Operator > predicate: (((activity_id is not null and fprs_id is not > null) and time_key_num is not null) and plan_id is not null) (type: boolean) > Statistics: Num rows: 6870067 Data size: 1483934472 Basic > stats: COMPLETE Column stats: COMPLETE > Select Operator > expressions: scale (type: bigint), time_key_num (type: > bigint), dataset_code (type: bigint), cost_center_lvl1_id (type: bigint), > cost_pool_lvl6_id (type: bigint), activity_id (type: bigint), view_lvl1_id > (type: bigint), from_lvl1_id (type: bigint), plan_id (type: bigint), > client_id (type: bigint), lob_id (type: bigint), product_id (type: bigint), > fprs_id (type: bigint), mtd_balance (type: decimal(28,12)) > outputColumnNames: _col0, _col1, _col2, _col3, _col4, > _col5, _col6, _col7, _col8, _col9, _col10, _col11, _col12, _col13 > Statistics: Num rows: 6870067 Data size: 1483934472 > Basic stats: COMPLETE Column stats: COMPLETE > Map Join Operator > condition map: > Inner Join 0 to 1 > keys: > 0 _col5 (type: bigint) > 1 _col3 (type: bigint) > outputColumnNames: _col0, _col1, _col2, _col3, _col4, > _col5, _col6, _col7, _col8, _col9, _col10, _col11, _col12, _col13, _col14 > input vertices: > 1 Map 5 > Statistics: Num rows: 57555202 Data size: 17151450196 > Basic stats: COMPLETE Column stats: COMPLETE > Map Join Operator > condition map: > Inner Join 0 to 1 > keys: > 0 _col12 (type: bigint) > 1 _col2 (type: bigint) > outputColumnNames: _col0, _col1, _col2, _col3, > _col4, _col5, _col6, _col7, _col8, _col9, _col10, _col11, _col12, _col13, > _col14, _col18 > input vertices: > 1 Map 6 > Statistics: Num rows: 869509350 Data size: > 266069861100 Basic stats: COMPLETE Column stats: COMPLETE > Select Operator > expressions: _col0 (type: bigint), _col1 (type: > bigint), _col10 (type: bigint), _col11 (type: bigint), _col12 (type: bigint), > _col13 (type: decimal(28,12)), _col14 (type: string), _col18 (type: bigint), > _col2 (type: bigint), _col3 (type: bigint), _col4 (type: bigint), _col5 > (type: bigint), _col6 (type: bigint), _col7 (type: bigint), _col8 (type: > bigint), _col9 (type: bigint) > outputColumnNames: _col0, _col1, _col10, _col11, > _col12, _col13, _col14, _col18, _col2, _col3, _col4, _col5, _col6, _col7, > _col8, _col9 > Statistics: Num rows: 869509350 Data size: > 266069861100 Basic stats: COMPLETE Column stats: COMPLETE > Reduce Output Operator > key expressions: _col1 (type: bigint), _col8 > (type: bigint), _col12 (type: bigint), _col5 (type: bigint) > sort order: ++++ > Map-reduce partition columns: _col1 (type: > bigint), _col8 (type: bigint), _col12 (type: bigint), _col5 (type: bigint) > Statistics: Num rows: 869509350 Data size: > 266069861100 Basic stats: COMPLETE Column stats: COMPLETE > value expressions: _col0 (type: bigint), _col2 > (type: bigint), _col3 (type: bigint), _col4 (type: bigint), _col6 (type: > bigint), _col7 (type: bigint), _col9 (type: bigint), _col10 (type: bigint), > _col11 (type: bigint), _col13 (type: decimal(28,12)), _col14 (type: string), > _col18 (type: bigint) > Execution mode: vectorized > Map 5 > Map Operator Tree: > TableScan > alias: act > filterExpr: (((current_ind = 'Y') and (rollup_key = > 'TOTACT')) and activity_id is not null) (type: boolean) > Statistics: Num rows: 40683 Data size: 271025472 Basic > stats: COMPLETE Column stats: COMPLETE > Filter Operator > predicate: (((current_ind = 'Y') and (rollup_key = > 'TOTACT')) and activity_id is not null) (type: boolean) > Statistics: Num rows: 20341 Data size: 5553093 Basic > stats: COMPLETE Column stats: COMPLETE > Select Operator > expressions: lvl5_id (type: string), activity_id (type: > bigint) > outputColumnNames: _col0, _col3 > Statistics: Num rows: 20341 Data size: 1993418 Basic > stats: COMPLETE Column stats: COMPLETE > Reduce Output Operator > key expressions: _col3 (type: bigint) > sort order: + > Map-reduce partition columns: _col3 (type: bigint) > Statistics: Num rows: 20341 Data size: 1993418 Basic > stats: COMPLETE Column stats: COMPLETE > value expressions: _col0 (type: string) > Execution mode: vectorized > Map 6 > Map Operator Tree: > TableScan > alias: fprs > filterExpr: ((current_ind = 'Y') and fprs_id is not null) > (type: boolean) > Statistics: Num rows: 1589540 Data size: 6862044180 Basic > stats: COMPLETE Column stats: COMPLETE > Filter Operator > predicate: ((current_ind = 'Y') and fprs_id is not null) > (type: boolean) > Statistics: Num rows: 794770 Data size: 80271770 Basic > stats: COMPLETE Column stats: COMPLETE > Select Operator > expressions: lvl5_id (type: bigint), fprs_id (type: > bigint) > outputColumnNames: _col0, _col2 > Statistics: Num rows: 794770 Data size: 12716320 Basic > stats: COMPLETE Column stats: COMPLETE > Reduce Output Operator > key expressions: _col2 (type: bigint) > sort order: + > Map-reduce partition columns: _col2 (type: bigint) > Statistics: Num rows: 794770 Data size: 12716320 > Basic stats: COMPLETE Column stats: COMPLETE > value expressions: _col0 (type: bigint) > Execution mode: vectorized > Reducer 2 > Reduce Operator Tree: > Merge Join Operator > condition map: > Inner Join 0 to 1 > keys: > 0 _col0 (type: bigint), _col2 (type: bigint), _col3 (type: > bigint), _col1 (type: bigint) > 1 _col1 (type: bigint), _col8 (type: bigint), _col12 (type: > bigint), _col5 (type: bigint) > outputColumnNames: _col4, _col5, _col6, _col7, _col8, _col9, > _col10, _col11, _col13, _col14, _col15, _col16, _col17, _col18, _col20, > _col21, _col25 > Statistics: Num rows: 28151027141 Data size: 8389006088018 > Basic stats: COMPLETE Column stats: COMPLETE > Select Operator > expressions: _col7 (type: bigint), _col8 (type: bigint), > _col9 (type: bigint), _col10 (type: bigint), _col11 (type: bigint), _col21 > (type: string), _col13 (type: bigint), _col14 (type: bigint), _col15 (type: > bigint), _col16 (type: bigint), _col17 (type: bigint), _col18 (type: bigint), > _col25 (type: bigint), _col4 (type: bigint), _col5 (type: bigint), (_col20 * > _col6) (type: decimal(38,24)) > outputColumnNames: _col0, _col1, _col2, _col3, _col4, > _col5, _col6, _col7, _col8, _col9, _col10, _col11, _col12, _col13, _col14, > _col15 > Statistics: Num rows: 28151027141 Data size: 8389006088018 > Basic stats: COMPLETE Column stats: COMPLETE > Group By Operator > aggregations: sum(_col15) > keys: _col0 (type: bigint), _col1 (type: bigint), _col2 > (type: bigint), _col3 (type: bigint), _col4 (type: bigint), _col5 (type: > string), _col6 (type: bigint), _col7 (type: bigint), _col8 (type: bigint), > _col9 (type: bigint), _col10 (type: bigint), _col11 (type: bigint), _col12 > (type: bigint), _col13 (type: bigint), _col14 (type: bigint) > mode: hash > outputColumnNames: _col0, _col1, _col2, _col3, _col4, > _col5, _col6, _col7, _col8, _col9, _col10, _col11, _col12, _col13, _col14, > _col15 > Statistics: Num rows: 28151027141 Data size: > 8389006088018 Basic stats: COMPLETE Column stats: COMPLETE > Reduce Output Operator > key expressions: _col0 (type: bigint), _col1 (type: > bigint), _col2 (type: bigint), _col3 (type: bigint), _col4 (type: bigint), > _col5 (type: string), _col6 (type: bigint), _col7 (type: bigint), _col8 > (type: bigint), _col9 (type: bigint), _col10 (type: bigint), _col11 (type: > bigint), _col12 (type: bigint), _col13 (type: bigint), _col14 (type: bigint) > sort order: +++++++++++++++ > Map-reduce partition columns: _col0 (type: bigint), > _col1 (type: bigint), _col2 (type: bigint), _col3 (type: bigint), _col4 > (type: bigint), _col5 (type: string), _col6 (type: bigint), _col7 (type: > bigint), _col8 (type: bigint), _col9 (type: bigint), _col10 (type: bigint), > _col11 (type: bigint), _col12 (type: bigint), _col13 (type: bigint), _col14 > (type: bigint) > Statistics: Num rows: 28151027141 Data size: > 8389006088018 Basic stats: COMPLETE Column stats: COMPLETE > value expressions: _col15 (type: decimal(38,24)) > Reducer 3 > Reduce Operator Tree: > Group By Operator > aggregations: sum(VALUE._col0) > keys: KEY._col0 (type: bigint), KEY._col1 (type: bigint), > KEY._col2 (type: bigint), KEY._col3 (type: bigint), KEY._col4 (type: bigint), > KEY._col5 (type: string), KEY._col6 (type: bigint), KEY._col7 (type: bigint), > KEY._col8 (type: bigint), KEY._col9 (type: bigint), KEY._col10 (type: > bigint), KEY._col11 (type: bigint), KEY._col12 (type: bigint), KEY._col13 > (type: bigint), KEY._col14 (type: bigint) > mode: mergepartial > outputColumnNames: _col0, _col1, _col2, _col3, _col4, _col5, > _col6, _col7, _col8, _col9, _col10, _col11, _col12, _col13, _col14, _col15 > Statistics: Num rows: 28151027141 Data size: 8389006088018 > Basic stats: COMPLETE Column stats: COMPLETE > Select Operator > expressions: _col0 (type: bigint), _col1 (type: bigint), > _col2 (type: bigint), _col3 (type: bigint), _col4 (type: bigint), > UDFToLong(_col5) (type: bigint), _col6 (type: bigint), _col7 (type: bigint), > _col8 (type: bigint), _col9 (type: bigint), _col10 (type: bigint), _col11 > (type: bigint), _col12 (type: bigint), _col13 (type: bigint), > UDFToString(_col14) (type: string), CAST( _col15 AS decimal(28,12)) (type: > decimal(28,12)) > outputColumnNames: _col0, _col1, _col2, _col3, _col4, > _col5, _col6, _col7, _col8, _col9, _col10, _col11, _col12, _col13, _col14, > _col15 > Statistics: Num rows: 28151027141 Data size: 11260410856400 > Basic stats: COMPLETE Column stats: COMPLETE > File Output Operator > compressed: false > Statistics: Num rows: 28151027141 Data size: > 11260410856400 Basic stats: COMPLETE Column stats: COMPLETE > table: > input format: > org.apache.hadoop.hive.ql.io.orc.OrcInputFormat > output format: > org.apache.hadoop.hive.ql.io.orc.OrcOutputFormat > serde: org.apache.hadoop.hive.ql.io.orc.OrcSerde > name: fidelity.fcap_agg_part_exp_hive_decimal_decimal > Stage: Stage-2 > Dependency Collection > Stage: Stage-0 > Move Operator > tables: > replace: false > table: > input format: org.apache.hadoop.hive.ql.io.orc.OrcInputFormat > output format: org.apache.hadoop.hive.ql.io.orc.OrcOutputFormat > serde: org.apache.hadoop.hive.ql.io.orc.OrcSerde > name: fidelity.fcap_agg_part_exp_hive_decimal_decimal > Stage: Stage-3 > Stats-Aggr Operator > {code} > Plan #2 with fetch column stats off > {code}STAGE DEPENDENCIES: > Stage-1 is a root stage > Stage-2 depends on stages: Stage-1 > Stage-0 depends on stages: Stage-2 > Stage-3 depends on stages: Stage-0 > STAGE PLANS: > Stage: Stage-1 > Tez > Edges: > Map 1 <- Map 5 (BROADCAST_EDGE) > Map 3 <- Map 4 (BROADCAST_EDGE) > Map 5 <- Map 3 (BROADCAST_EDGE) > Reducer 2 <- Map 1 (SIMPLE_EDGE) > DagName: mmokhtar_20150113185454_7c350b7b-53fa-4bbb-a747-24ec0c2a6ab7:4 > Vertices: > Map 1 > Map Operator Tree: > TableScan > alias: driver > filterExpr: (((time_key_num is not null and plan_id is not > null) and fprs_id is not null) and activity_id is not null) (type: boolean) > Statistics: Num rows: 2926396239 Data size: 468223398240 > Basic stats: COMPLETE Column stats: NONE > Filter Operator > predicate: (((time_key_num is not null and plan_id is not > null) and fprs_id is not null) and activity_id is not null) (type: boolean) > Statistics: Num rows: 182899765 Data size: 29263962400 > Basic stats: COMPLETE Column stats: NONE > Select Operator > expressions: time_key_num (type: bigint), activity_id > (type: bigint), plan_id (type: bigint), fprs_id (type: bigint), ssn_id (type: > bigint), account_id (type: bigint), driver_pct (type: decimal(28,12)) > outputColumnNames: _col0, _col1, _col2, _col3, _col4, > _col5, _col6 > Statistics: Num rows: 182899765 Data size: 29263962400 > Basic stats: COMPLETE Column stats: NONE > Map Join Operator > condition map: > Inner Join 0 to 1 > keys: > 0 _col0 (type: bigint), _col2 (type: bigint), _col3 > (type: bigint), _col1 (type: bigint) > 1 _col1 (type: bigint), _col8 (type: bigint), > _col12 (type: bigint), _col5 (type: bigint) > outputColumnNames: _col4, _col5, _col6, _col7, _col8, > _col9, _col10, _col11, _col13, _col14, _col15, _col16, _col17, _col18, > _col20, _col21, _col25 > input vertices: > 1 Map 5 > Statistics: Num rows: 201189745 Data size: > 32190359337 Basic stats: COMPLETE Column stats: NONE > Select Operator > expressions: _col7 (type: bigint), _col8 (type: > bigint), _col9 (type: bigint), _col10 (type: bigint), _col11 (type: bigint), > _col21 (type: string), _col13 (type: bigint), _col14 (type: bigint), _col15 > (type: bigint), _col16 (type: bigint), _col17 (type: bigint), _col18 (type: > bigint), _col25 (type: bigint), _col4 (type: bigint), _col5 (type: bigint), > (_col20 * _col6) (type: decimal(38,24)) > outputColumnNames: _col0, _col1, _col2, _col3, > _col4, _col5, _col6, _col7, _col8, _col9, _col10, _col11, _col12, _col13, > _col14, _col15 > Statistics: Num rows: 201189745 Data size: > 32190359337 Basic stats: COMPLETE Column stats: NONE > Group By Operator > aggregations: sum(_col15) > keys: _col0 (type: bigint), _col1 (type: bigint), > _col2 (type: bigint), _col3 (type: bigint), _col4 (type: bigint), _col5 > (type: string), _col6 (type: bigint), _col7 (type: bigint), _col8 (type: > bigint), _col9 (type: bigint), _col10 (type: bigint), _col11 (type: bigint), > _col12 (type: bigint), _col13 (type: bigint), _col14 (type: bigint) > mode: hash > outputColumnNames: _col0, _col1, _col2, _col3, > _col4, _col5, _col6, _col7, _col8, _col9, _col10, _col11, _col12, _col13, > _col14, _col15 > Statistics: Num rows: 201189745 Data size: > 32190359337 Basic stats: COMPLETE Column stats: NONE > Reduce Output Operator > key expressions: _col0 (type: bigint), _col1 > (type: bigint), _col2 (type: bigint), _col3 (type: bigint), _col4 (type: > bigint), _col5 (type: string), _col6 (type: bigint), _col7 (type: bigint), > _col8 (type: bigint), _col9 (type: bigint), _col10 (type: bigint), _col11 > (type: bigint), _col12 (type: bigint), _col13 (type: bigint), _col14 (type: > bigint) > sort order: +++++++++++++++ > Map-reduce partition columns: _col0 (type: > bigint), _col1 (type: bigint), _col2 (type: bigint), _col3 (type: bigint), > _col4 (type: bigint), _col5 (type: string), _col6 (type: bigint), _col7 > (type: bigint), _col8 (type: bigint), _col9 (type: bigint), _col10 (type: > bigint), _col11 (type: bigint), _col12 (type: bigint), _col13 (type: bigint), > _col14 (type: bigint) > Statistics: Num rows: 201189745 Data size: > 32190359337 Basic stats: COMPLETE Column stats: NONE > value expressions: _col15 (type: decimal(38,24)) > Execution mode: vectorized > Map 3 > Map Operator Tree: > TableScan > alias: balance > filterExpr: (((activity_id is not null and fprs_id is not > null) and time_key_num is not null) and plan_id is not null) (type: boolean) > Statistics: Num rows: 6870067 Data size: 2102240502 Basic > stats: COMPLETE Column stats: NONE > Filter Operator > predicate: (((activity_id is not null and fprs_id is not > null) and time_key_num is not null) and plan_id is not null) (type: boolean) > Statistics: Num rows: 429380 Data size: 131390279 Basic > stats: COMPLETE Column stats: NONE > Select Operator > expressions: scale (type: bigint), time_key_num (type: > bigint), dataset_code (type: bigint), cost_center_lvl1_id (type: bigint), > cost_pool_lvl6_id (type: bigint), activity_id (type: bigint), view_lvl1_id > (type: bigint), from_lvl1_id (type: bigint), plan_id (type: bigint), > client_id (type: bigint), lob_id (type: bigint), product_id (type: bigint), > fprs_id (type: bigint), mtd_balance (type: decimal(28,12)) > outputColumnNames: _col0, _col1, _col2, _col3, _col4, > _col5, _col6, _col7, _col8, _col9, _col10, _col11, _col12, _col13 > Statistics: Num rows: 429380 Data size: 131390279 Basic > stats: COMPLETE Column stats: NONE > Map Join Operator > condition map: > Inner Join 0 to 1 > keys: > 0 _col5 (type: bigint) > 1 _col3 (type: bigint) > outputColumnNames: _col0, _col1, _col2, _col3, _col4, > _col5, _col6, _col7, _col8, _col9, _col10, _col11, _col12, _col13, _col14 > input vertices: > 1 Map 4 > Statistics: Num rows: 472318 Data size: 144529310 > Basic stats: COMPLETE Column stats: NONE > Reduce Output Operator > key expressions: _col12 (type: bigint) > sort order: + > Map-reduce partition columns: _col12 (type: bigint) > Statistics: Num rows: 472318 Data size: 144529310 > Basic stats: COMPLETE Column stats: NONE > value expressions: _col0 (type: bigint), _col1 > (type: bigint), _col2 (type: bigint), _col3 (type: bigint), _col4 (type: > bigint), _col5 (type: bigint), _col6 (type: bigint), _col7 (type: bigint), > _col8 (type: bigint), _col9 (type: bigint), _col10 (type: bigint), _col11 > (type: bigint), _col13 (type: decimal(28,12)), _col14 (type: string) > Execution mode: vectorized > Map 4 > Map Operator Tree: > TableScan > alias: act > filterExpr: (((current_ind = 'Y') and (rollup_key = > 'TOTACT')) and activity_id is not null) (type: boolean) > Statistics: Num rows: 40683 Data size: 271025472 Basic > stats: COMPLETE Column stats: NONE > Filter Operator > predicate: (((current_ind = 'Y') and (rollup_key = > 'TOTACT')) and activity_id is not null) (type: boolean) > Statistics: Num rows: 5085 Data size: 33875685 Basic > stats: COMPLETE Column stats: NONE > Select Operator > expressions: lvl5_id (type: string), activity_id (type: > bigint) > outputColumnNames: _col0, _col3 > Statistics: Num rows: 5085 Data size: 33875685 Basic > stats: COMPLETE Column stats: NONE > Reduce Output Operator > key expressions: _col3 (type: bigint) > sort order: + > Map-reduce partition columns: _col3 (type: bigint) > Statistics: Num rows: 5085 Data size: 33875685 Basic > stats: COMPLETE Column stats: NONE > value expressions: _col0 (type: string) > Execution mode: vectorized > Map 5 > Map Operator Tree: > TableScan > alias: fprs > filterExpr: ((current_ind = 'Y') and fprs_id is not null) > (type: boolean) > Statistics: Num rows: 1589540 Data size: 6862044180 Basic > stats: COMPLETE Column stats: NONE > Filter Operator > predicate: ((current_ind = 'Y') and fprs_id is not null) > (type: boolean) > Statistics: Num rows: 397385 Data size: 1715511045 Basic > stats: COMPLETE Column stats: NONE > Select Operator > expressions: lvl5_id (type: bigint), fprs_id (type: > bigint) > outputColumnNames: _col0, _col2 > Statistics: Num rows: 397385 Data size: 1715511045 > Basic stats: COMPLETE Column stats: NONE > Map Join Operator > condition map: > Inner Join 0 to 1 > keys: > 0 _col12 (type: bigint) > 1 _col2 (type: bigint) > outputColumnNames: _col0, _col1, _col2, _col3, _col4, > _col5, _col6, _col7, _col8, _col9, _col10, _col11, _col12, _col13, _col14, > _col18 > input vertices: > 0 Map 3 > Statistics: Num rows: 519549 Data size: 158982244 > Basic stats: COMPLETE Column stats: NONE > Select Operator > expressions: _col0 (type: bigint), _col1 (type: > bigint), _col10 (type: bigint), _col11 (type: bigint), _col12 (type: bigint), > _col13 (type: decimal(28,12)), _col14 (type: string), _col18 (type: bigint), > _col2 (type: bigint), _col3 (type: bigint), _col4 (type: bigint), _col5 > (type: bigint), _col6 (type: bigint), _col7 (type: bigint), _col8 (type: > bigint), _col9 (type: bigint) > outputColumnNames: _col0, _col1, _col10, _col11, > _col12, _col13, _col14, _col18, _col2, _col3, _col4, _col5, _col6, _col7, > _col8, _col9 > Statistics: Num rows: 519549 Data size: 158982244 > Basic stats: COMPLETE Column stats: NONE > Reduce Output Operator > key expressions: _col1 (type: bigint), _col8 > (type: bigint), _col12 (type: bigint), _col5 (type: bigint) > sort order: ++++ > Map-reduce partition columns: _col1 (type: > bigint), _col8 (type: bigint), _col12 (type: bigint), _col5 (type: bigint) > Statistics: Num rows: 519549 Data size: 158982244 > Basic stats: COMPLETE Column stats: NONE > value expressions: _col0 (type: bigint), _col2 > (type: bigint), _col3 (type: bigint), _col4 (type: bigint), _col6 (type: > bigint), _col7 (type: bigint), _col9 (type: bigint), _col10 (type: bigint), > _col11 (type: bigint), _col13 (type: decimal(28,12)), _col14 (type: string), > _col18 (type: bigint) > Execution mode: vectorized > Reducer 2 > Reduce Operator Tree: > Group By Operator > aggregations: sum(VALUE._col0) > keys: KEY._col0 (type: bigint), KEY._col1 (type: bigint), > KEY._col2 (type: bigint), KEY._col3 (type: bigint), KEY._col4 (type: bigint), > KEY._col5 (type: string), KEY._col6 (type: bigint), KEY._col7 (type: bigint), > KEY._col8 (type: bigint), KEY._col9 (type: bigint), KEY._col10 (type: > bigint), KEY._col11 (type: bigint), KEY._col12 (type: bigint), KEY._col13 > (type: bigint), KEY._col14 (type: bigint) > mode: mergepartial > outputColumnNames: _col0, _col1, _col2, _col3, _col4, _col5, > _col6, _col7, _col8, _col9, _col10, _col11, _col12, _col13, _col14, _col15 > Statistics: Num rows: 100594872 Data size: 16095179588 Basic > stats: COMPLETE Column stats: NONE > Select Operator > expressions: _col0 (type: bigint), _col1 (type: bigint), > _col2 (type: bigint), _col3 (type: bigint), _col4 (type: bigint), > UDFToLong(_col5) (type: bigint), _col6 (type: bigint), _col7 (type: bigint), > _col8 (type: bigint), _col9 (type: bigint), _col10 (type: bigint), _col11 > (type: bigint), _col12 (type: bigint), _col13 (type: bigint), > UDFToString(_col14) (type: string), CAST( _col15 AS decimal(28,12)) (type: > decimal(28,12)) > outputColumnNames: _col0, _col1, _col2, _col3, _col4, > _col5, _col6, _col7, _col8, _col9, _col10, _col11, _col12, _col13, _col14, > _col15 > Statistics: Num rows: 100594872 Data size: 16095179588 > Basic stats: COMPLETE Column stats: NONE > File Output Operator > compressed: false > Statistics: Num rows: 100594872 Data size: 16095179588 > Basic stats: COMPLETE Column stats: NONE > table: > input format: > org.apache.hadoop.hive.ql.io.orc.OrcInputFormat > output format: > org.apache.hadoop.hive.ql.io.orc.OrcOutputFormat > serde: org.apache.hadoop.hive.ql.io.orc.OrcSerde > name: fidelity.fcap_agg_part_exp_hive_decimal_decimal > Stage: Stage-2 > Dependency Collection > Stage: Stage-0 > Move Operator > tables: > replace: false > table: > input format: org.apache.hadoop.hive.ql.io.orc.OrcInputFormat > output format: org.apache.hadoop.hive.ql.io.orc.OrcOutputFormat > serde: org.apache.hadoop.hive.ql.io.orc.OrcSerde > name: fidelity.fcap_agg_part_exp_hive_decimal_decimal > Stage: Stage-3 > Stats-Aggr Operator > {code} > Query used > {code} > explain insert INTO table fidelity.FCAP_AGG_PART_EXP_hive_decimal_decimal > SELECT BALANCE.SCALE, BALANCE.TIME_KEY_NUM, BALANCE.DATASET_CODE, > BALANCE.Cost_Center_LVL1_ID, BALANCE.Cost_Pool_LVL6_ID, act.lvl5_ID, > BALANCE.VIEW_LVL1_ID, BALANCE.FROM_LVL1_ID, BALANCE.PLAN_ID, > BALANCE.CLIENT_ID, BALANCE.LOB_ID, BALANCE.PRODUCT_ID, fprs.lvl5_ID > fprs_lvl5_id, DRIVER.SSN_ID, DRIVER.ACCOUNT_ID, SUM ( BALANCE.MTD_BALANCE * > DRIVER.DRIVER_PCT) AS MTD_BALANCE FROM > fidelity.FCAP_AGG_PROD_EXP_NOFUND_decimal BALANCE JOIN > fidelity.fcap_drivers_part_exp_inter_bucket_256 DRIVER ON > BALANCE.TIME_KEY_NUM = DRIVER.TIME_KEY_NUM AND BALANCE.PLAN_ID = > DRIVER.PLAN_ID AND BALANCE.FPRS_ID = DRIVER.FPRS_ID AND BALANCE.ACTIVITY_ID = > DRIVER.ACTIVITY_ID INNER JOIN fidelity.fobi_fprs_dim_mv_orc fprs ON > balance.FPRS_ID = fprs.fprs_id AND fprs.current_ind = 'Y' inner join > fidelity.fobi_activity_dim_mv act ON BALANCE.activity_id = act.activity_id > AND act.current_ind = 'Y' AND act.ROLLUP_KEY = 'TOTACT' GROUP BY > BALANCE.SCALE, BALANCE.TIME_KEY_NUM, BALANCE.DATASET_CODE, > BALANCE.Cost_center_LVL1_ID, BALANCE.Cost_Pool_LVL6_ID, act.lvl5_ID, > BALANCE.VIEW_LVL1_ID, BALANCE.FROM_LVL1_ID, BALANCE.PLAN_ID, > BALANCE.CLIENT_ID, BALANCE.LOB_ID, BALANCE.PRODUCT_ID, fprs.lvl5_ID, > DRIVER.SSN_ID, DRIVER.ACCOUNT_ID; > {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)