Mostafa Mokhtar created HIVE-9368: ------------------------------------- Summary: Physical optimizer : Join order in Explain is different from join order provided by Calcite Key: HIVE-9368 URL: https://issues.apache.org/jira/browse/HIVE-9368 Project: Hive Issue Type: Bug Components: Physical Optimizer Affects Versions: 0.14.0 Reporter: Mostafa Mokhtar Assignee: Vikram Dixit K Fix For: 0.15.0
Join order in explain is different from that provided by Calcite, this was observed during the Fidelity POC. Logical plan from Calcite : {code} 2015-01-13 18:54:42,892 DEBUG [main]: parse.CalcitePlanner (CalcitePlanner.java:apply(743)) - Plan After Join Reordering: HiveProject(scale=[$0], time_key_num=[$1], dataset_code=[$2], cost_center_lvl1_id=[$3], cost_pool_lvl6_id=[$4], lvl5_id=[$5], view_lvl1_id=[$6], from_lvl1_id=[$7], plan_id=[$8], client_id=[$9], lob_id=[$10], product_id=[$11], fprs_lvl5_id=[$12], ssn_id=[$13], account_id=[$14], mtd_balance=[$15]): rowcount = 2.53152774E8, cumulative cost = {3.057177094767754E9 rows, 0.0 cpu, 0.0 io}, id = 636 HiveAggregate(group=[{0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14}], agg#0=[SUM($15)]): rowcount = 2.53152774E8, cumulative cost = {3.057177094767754E9 rows, 0.0 cpu, 0.0 io}, id = 634 HiveProject($f0=[$0], $f1=[$1], $f2=[$2], $f3=[$3], $f4=[$4], $f5=[$24], $f6=[$6], $f7=[$7], $f8=[$8], $f9=[$9], $f10=[$10], $f11=[$11], $f12=[$21], $f13=[$18], $f14=[$19], $f15=[*($13, $20)]): rowcount = 3.401053197411791E11, cumulative cost = {3.057177094767754E9 rows, 0.0 cpu, 0.0 io}, id = 632 HiveProject(scale=[$7], time_key_num=[$8], dataset_code=[$9], cost_center_lvl1_id=[$10], cost_pool_lvl6_id=[$11], activity_id=[$12], view_lvl1_id=[$13], from_lvl1_id=[$14], plan_id=[$15], client_id=[$16], lob_id=[$17], product_id=[$18], fprs_id=[$19], mtd_balance=[$20], time_key_num0=[$0], activity_id0=[$1], plan_id0=[$2], fprs_id0=[$3], ssn_id=[$4], account_id=[$5], driver_pct=[$6], lvl5_id=[$25], current_ind=[$26], fprs_id1=[$27], lvl5_id0=[$21], rollup_key=[$22], current_ind0=[$23], activity_id1=[$24]): rowcount = 3.401053197411791E11, cumulative cost = {3.057177094767754E9 rows, 0.0 cpu, 0.0 io}, id = 692 HiveJoin(condition=[AND(AND(AND(=($8, $0), =($15, $2)), =($19, $3)), =($12, $1))], joinType=[inner]): rowcount = 3.401053197411791E11, cumulative cost = {3.057177094767754E9 rows, 0.0 cpu, 0.0 io}, id = 690 HiveProject(time_key_num=[$0], activity_id=[$1], plan_id=[$2], fprs_id=[$3], ssn_id=[$4], account_id=[$5], driver_pct=[$6]): rowcount = 2.926396239E9, cumulative cost = {0.0 rows, 0.0 cpu, 0.0 io}, id = 590 HiveTableScan(table=[[fidelity.fcap_drivers_part_exp_inter_bucket_256]]): rowcount = 2.926396239E9, cumulative cost = {0}, id = 465 HiveJoin(condition=[=($12, $20)], joinType=[inner]): rowcount = 1.0871372980143067E8, cumulative cost = {2.2067125966323376E7 rows, 0.0 cpu, 0.0 io}, id = 688 HiveJoin(condition=[=($5, $17)], joinType=[inner]): rowcount = 1.4392118216323378E7, cumulative cost = {6880237.75 rows, 0.0 cpu, 0.0 io}, id = 653 HiveProject(scale=[$0], time_key_num=[$1], dataset_code=[$2], cost_center_lvl1_id=[$3], cost_pool_lvl6_id=[$4], activity_id=[$5], view_lvl1_id=[$6], from_lvl1_id=[$7], plan_id=[$8], client_id=[$9], lob_id=[$10], product_id=[$11], fprs_id=[$12], mtd_balance=[$14]): rowcount = 6870067.0, cumulative cost = {0.0 rows, 0.0 cpu, 0.0 io}, id = 587 HiveTableScan(table=[[fidelity.fcap_agg_prod_exp_nofund_decimal]]): rowcount = 6870067.0, cumulative cost = {0}, id = 464 HiveProject(lvl5_id=[$36], rollup_key=[$48], current_ind=[$51], activity_id=[$60]): rowcount = 10170.75, cumulative cost = {0.0 rows, 0.0 cpu, 0.0 io}, id = 628 HiveFilter(condition=[AND(=($51, 'Y'), =($48, 'TOTACT'))]): rowcount = 10170.75, cumulative cost = {0.0 rows, 0.0 cpu, 0.0 io}, id = 626 HiveTableScan(table=[[fidelity.fobi_activity_dim_mv]]): rowcount = 40683.0, cumulative cost = {0}, id = 467 HiveProject(lvl5_id=[$36], current_ind=[$51], fprs_id=[$58]): rowcount = 794770.0, cumulative cost = {0.0 rows, 0.0 cpu, 0.0 io}, id = 622 HiveFilter(condition=[=($51, 'Y')]): rowcount = 794770.0, cumulative cost = {0.0 rows, 0.0 cpu, 0.0 io}, id = 620 HiveTableScan(table=[[fidelity.fobi_fprs_dim_mv_orc]]): rowcount = 1589540.0, cumulative cost = {0}, id = 466 {code} Plan #1 with Fetch column stats on {code} OK STAGE DEPENDENCIES: Stage-1 is a root stage Stage-2 depends on stages: Stage-1 Stage-0 depends on stages: Stage-2 Stage-3 depends on stages: Stage-0 STAGE PLANS: Stage: Stage-1 Tez Edges: Map 4 <- Map 5 (BROADCAST_EDGE), Map 6 (BROADCAST_EDGE) Reducer 2 <- Map 1 (SIMPLE_EDGE), Map 4 (SIMPLE_EDGE) Reducer 3 <- Reducer 2 (SIMPLE_EDGE) DagName: mmokhtar_20150113185454_d7ce6ecf-2d50-45ed-8a88-6283bb091b0e:3 Vertices: Map 1 Map Operator Tree: TableScan alias: driver filterExpr: (((time_key_num is not null and plan_id is not null) and fprs_id is not null) and activity_id is not null) (type: boolean) Statistics: Num rows: 2926396239 Data size: 468223398240 Basic stats: COMPLETE Column stats: COMPLETE Filter Operator predicate: (((time_key_num is not null and plan_id is not null) and fprs_id is not null) and activity_id is not null) (type: boolean) Statistics: Num rows: 2926396239 Data size: 468223398240 Basic stats: COMPLETE Column stats: COMPLETE Select Operator expressions: time_key_num (type: bigint), activity_id (type: bigint), plan_id (type: bigint), fprs_id (type: bigint), ssn_id (type: bigint), account_id (type: bigint), driver_pct (type: decimal(28,12)) outputColumnNames: _col0, _col1, _col2, _col3, _col4, _col5, _col6 Statistics: Num rows: 2926396239 Data size: 468223398240 Basic stats: COMPLETE Column stats: COMPLETE Reduce Output Operator key expressions: _col0 (type: bigint), _col2 (type: bigint), _col3 (type: bigint), _col1 (type: bigint) sort order: ++++ Map-reduce partition columns: _col0 (type: bigint), _col2 (type: bigint), _col3 (type: bigint), _col1 (type: bigint) Statistics: Num rows: 2926396239 Data size: 468223398240 Basic stats: COMPLETE Column stats: COMPLETE value expressions: _col4 (type: bigint), _col5 (type: bigint), _col6 (type: decimal(28,12)) Execution mode: vectorized Map 4 Map Operator Tree: TableScan alias: balance filterExpr: (((activity_id is not null and fprs_id is not null) and time_key_num is not null) and plan_id is not null) (type: boolean) Statistics: Num rows: 6870067 Data size: 2102240502 Basic stats: COMPLETE Column stats: COMPLETE Filter Operator predicate: (((activity_id is not null and fprs_id is not null) and time_key_num is not null) and plan_id is not null) (type: boolean) Statistics: Num rows: 6870067 Data size: 1483934472 Basic stats: COMPLETE Column stats: COMPLETE Select Operator expressions: scale (type: bigint), time_key_num (type: bigint), dataset_code (type: bigint), cost_center_lvl1_id (type: bigint), cost_pool_lvl6_id (type: bigint), activity_id (type: bigint), view_lvl1_id (type: bigint), from_lvl1_id (type: bigint), plan_id (type: bigint), client_id (type: bigint), lob_id (type: bigint), product_id (type: bigint), fprs_id (type: bigint), mtd_balance (type: decimal(28,12)) outputColumnNames: _col0, _col1, _col2, _col3, _col4, _col5, _col6, _col7, _col8, _col9, _col10, _col11, _col12, _col13 Statistics: Num rows: 6870067 Data size: 1483934472 Basic stats: COMPLETE Column stats: COMPLETE Map Join Operator condition map: Inner Join 0 to 1 keys: 0 _col5 (type: bigint) 1 _col3 (type: bigint) outputColumnNames: _col0, _col1, _col2, _col3, _col4, _col5, _col6, _col7, _col8, _col9, _col10, _col11, _col12, _col13, _col14 input vertices: 1 Map 5 Statistics: Num rows: 57555202 Data size: 17151450196 Basic stats: COMPLETE Column stats: COMPLETE Map Join Operator condition map: Inner Join 0 to 1 keys: 0 _col12 (type: bigint) 1 _col2 (type: bigint) outputColumnNames: _col0, _col1, _col2, _col3, _col4, _col5, _col6, _col7, _col8, _col9, _col10, _col11, _col12, _col13, _col14, _col18 input vertices: 1 Map 6 Statistics: Num rows: 869509350 Data size: 266069861100 Basic stats: COMPLETE Column stats: COMPLETE Select Operator expressions: _col0 (type: bigint), _col1 (type: bigint), _col10 (type: bigint), _col11 (type: bigint), _col12 (type: bigint), _col13 (type: decimal(28,12)), _col14 (type: string), _col18 (type: bigint), _col2 (type: bigint), _col3 (type: bigint), _col4 (type: bigint), _col5 (type: bigint), _col6 (type: bigint), _col7 (type: bigint), _col8 (type: bigint), _col9 (type: bigint) outputColumnNames: _col0, _col1, _col10, _col11, _col12, _col13, _col14, _col18, _col2, _col3, _col4, _col5, _col6, _col7, _col8, _col9 Statistics: Num rows: 869509350 Data size: 266069861100 Basic stats: COMPLETE Column stats: COMPLETE Reduce Output Operator key expressions: _col1 (type: bigint), _col8 (type: bigint), _col12 (type: bigint), _col5 (type: bigint) sort order: ++++ Map-reduce partition columns: _col1 (type: bigint), _col8 (type: bigint), _col12 (type: bigint), _col5 (type: bigint) Statistics: Num rows: 869509350 Data size: 266069861100 Basic stats: COMPLETE Column stats: COMPLETE value expressions: _col0 (type: bigint), _col2 (type: bigint), _col3 (type: bigint), _col4 (type: bigint), _col6 (type: bigint), _col7 (type: bigint), _col9 (type: bigint), _col10 (type: bigint), _col11 (type: bigint), _col13 (type: decimal(28,12)), _col14 (type: string), _col18 (type: bigint) Execution mode: vectorized Map 5 Map Operator Tree: TableScan alias: act filterExpr: (((current_ind = 'Y') and (rollup_key = 'TOTACT')) and activity_id is not null) (type: boolean) Statistics: Num rows: 40683 Data size: 271025472 Basic stats: COMPLETE Column stats: COMPLETE Filter Operator predicate: (((current_ind = 'Y') and (rollup_key = 'TOTACT')) and activity_id is not null) (type: boolean) Statistics: Num rows: 20341 Data size: 5553093 Basic stats: COMPLETE Column stats: COMPLETE Select Operator expressions: lvl5_id (type: string), activity_id (type: bigint) outputColumnNames: _col0, _col3 Statistics: Num rows: 20341 Data size: 1993418 Basic stats: COMPLETE Column stats: COMPLETE Reduce Output Operator key expressions: _col3 (type: bigint) sort order: + Map-reduce partition columns: _col3 (type: bigint) Statistics: Num rows: 20341 Data size: 1993418 Basic stats: COMPLETE Column stats: COMPLETE value expressions: _col0 (type: string) Execution mode: vectorized Map 6 Map Operator Tree: TableScan alias: fprs filterExpr: ((current_ind = 'Y') and fprs_id is not null) (type: boolean) Statistics: Num rows: 1589540 Data size: 6862044180 Basic stats: COMPLETE Column stats: COMPLETE Filter Operator predicate: ((current_ind = 'Y') and fprs_id is not null) (type: boolean) Statistics: Num rows: 794770 Data size: 80271770 Basic stats: COMPLETE Column stats: COMPLETE Select Operator expressions: lvl5_id (type: bigint), fprs_id (type: bigint) outputColumnNames: _col0, _col2 Statistics: Num rows: 794770 Data size: 12716320 Basic stats: COMPLETE Column stats: COMPLETE Reduce Output Operator key expressions: _col2 (type: bigint) sort order: + Map-reduce partition columns: _col2 (type: bigint) Statistics: Num rows: 794770 Data size: 12716320 Basic stats: COMPLETE Column stats: COMPLETE value expressions: _col0 (type: bigint) Execution mode: vectorized Reducer 2 Reduce Operator Tree: Merge Join Operator condition map: Inner Join 0 to 1 keys: 0 _col0 (type: bigint), _col2 (type: bigint), _col3 (type: bigint), _col1 (type: bigint) 1 _col1 (type: bigint), _col8 (type: bigint), _col12 (type: bigint), _col5 (type: bigint) outputColumnNames: _col4, _col5, _col6, _col7, _col8, _col9, _col10, _col11, _col13, _col14, _col15, _col16, _col17, _col18, _col20, _col21, _col25 Statistics: Num rows: 28151027141 Data size: 8389006088018 Basic stats: COMPLETE Column stats: COMPLETE Select Operator expressions: _col7 (type: bigint), _col8 (type: bigint), _col9 (type: bigint), _col10 (type: bigint), _col11 (type: bigint), _col21 (type: string), _col13 (type: bigint), _col14 (type: bigint), _col15 (type: bigint), _col16 (type: bigint), _col17 (type: bigint), _col18 (type: bigint), _col25 (type: bigint), _col4 (type: bigint), _col5 (type: bigint), (_col20 * _col6) (type: decimal(38,24)) outputColumnNames: _col0, _col1, _col2, _col3, _col4, _col5, _col6, _col7, _col8, _col9, _col10, _col11, _col12, _col13, _col14, _col15 Statistics: Num rows: 28151027141 Data size: 8389006088018 Basic stats: COMPLETE Column stats: COMPLETE Group By Operator aggregations: sum(_col15) keys: _col0 (type: bigint), _col1 (type: bigint), _col2 (type: bigint), _col3 (type: bigint), _col4 (type: bigint), _col5 (type: string), _col6 (type: bigint), _col7 (type: bigint), _col8 (type: bigint), _col9 (type: bigint), _col10 (type: bigint), _col11 (type: bigint), _col12 (type: bigint), _col13 (type: bigint), _col14 (type: bigint) mode: hash outputColumnNames: _col0, _col1, _col2, _col3, _col4, _col5, _col6, _col7, _col8, _col9, _col10, _col11, _col12, _col13, _col14, _col15 Statistics: Num rows: 28151027141 Data size: 8389006088018 Basic stats: COMPLETE Column stats: COMPLETE Reduce Output Operator key expressions: _col0 (type: bigint), _col1 (type: bigint), _col2 (type: bigint), _col3 (type: bigint), _col4 (type: bigint), _col5 (type: string), _col6 (type: bigint), _col7 (type: bigint), _col8 (type: bigint), _col9 (type: bigint), _col10 (type: bigint), _col11 (type: bigint), _col12 (type: bigint), _col13 (type: bigint), _col14 (type: bigint) sort order: +++++++++++++++ Map-reduce partition columns: _col0 (type: bigint), _col1 (type: bigint), _col2 (type: bigint), _col3 (type: bigint), _col4 (type: bigint), _col5 (type: string), _col6 (type: bigint), _col7 (type: bigint), _col8 (type: bigint), _col9 (type: bigint), _col10 (type: bigint), _col11 (type: bigint), _col12 (type: bigint), _col13 (type: bigint), _col14 (type: bigint) Statistics: Num rows: 28151027141 Data size: 8389006088018 Basic stats: COMPLETE Column stats: COMPLETE value expressions: _col15 (type: decimal(38,24)) Reducer 3 Reduce Operator Tree: Group By Operator aggregations: sum(VALUE._col0) keys: KEY._col0 (type: bigint), KEY._col1 (type: bigint), KEY._col2 (type: bigint), KEY._col3 (type: bigint), KEY._col4 (type: bigint), KEY._col5 (type: string), KEY._col6 (type: bigint), KEY._col7 (type: bigint), KEY._col8 (type: bigint), KEY._col9 (type: bigint), KEY._col10 (type: bigint), KEY._col11 (type: bigint), KEY._col12 (type: bigint), KEY._col13 (type: bigint), KEY._col14 (type: bigint) mode: mergepartial outputColumnNames: _col0, _col1, _col2, _col3, _col4, _col5, _col6, _col7, _col8, _col9, _col10, _col11, _col12, _col13, _col14, _col15 Statistics: Num rows: 28151027141 Data size: 8389006088018 Basic stats: COMPLETE Column stats: COMPLETE Select Operator expressions: _col0 (type: bigint), _col1 (type: bigint), _col2 (type: bigint), _col3 (type: bigint), _col4 (type: bigint), UDFToLong(_col5) (type: bigint), _col6 (type: bigint), _col7 (type: bigint), _col8 (type: bigint), _col9 (type: bigint), _col10 (type: bigint), _col11 (type: bigint), _col12 (type: bigint), _col13 (type: bigint), UDFToString(_col14) (type: string), CAST( _col15 AS decimal(28,12)) (type: decimal(28,12)) outputColumnNames: _col0, _col1, _col2, _col3, _col4, _col5, _col6, _col7, _col8, _col9, _col10, _col11, _col12, _col13, _col14, _col15 Statistics: Num rows: 28151027141 Data size: 11260410856400 Basic stats: COMPLETE Column stats: COMPLETE File Output Operator compressed: false Statistics: Num rows: 28151027141 Data size: 11260410856400 Basic stats: COMPLETE Column stats: COMPLETE table: input format: org.apache.hadoop.hive.ql.io.orc.OrcInputFormat output format: org.apache.hadoop.hive.ql.io.orc.OrcOutputFormat serde: org.apache.hadoop.hive.ql.io.orc.OrcSerde name: fidelity.fcap_agg_part_exp_hive_decimal_decimal Stage: Stage-2 Dependency Collection Stage: Stage-0 Move Operator tables: replace: false table: input format: org.apache.hadoop.hive.ql.io.orc.OrcInputFormat output format: org.apache.hadoop.hive.ql.io.orc.OrcOutputFormat serde: org.apache.hadoop.hive.ql.io.orc.OrcSerde name: fidelity.fcap_agg_part_exp_hive_decimal_decimal Stage: Stage-3 Stats-Aggr Operator {code} Plan #2 with fetch column stats off {code}STAGE DEPENDENCIES: Stage-1 is a root stage Stage-2 depends on stages: Stage-1 Stage-0 depends on stages: Stage-2 Stage-3 depends on stages: Stage-0 STAGE PLANS: Stage: Stage-1 Tez Edges: Map 1 <- Map 5 (BROADCAST_EDGE) Map 3 <- Map 4 (BROADCAST_EDGE) Map 5 <- Map 3 (BROADCAST_EDGE) Reducer 2 <- Map 1 (SIMPLE_EDGE) DagName: mmokhtar_20150113185454_7c350b7b-53fa-4bbb-a747-24ec0c2a6ab7:4 Vertices: Map 1 Map Operator Tree: TableScan alias: driver filterExpr: (((time_key_num is not null and plan_id is not null) and fprs_id is not null) and activity_id is not null) (type: boolean) Statistics: Num rows: 2926396239 Data size: 468223398240 Basic stats: COMPLETE Column stats: NONE Filter Operator predicate: (((time_key_num is not null and plan_id is not null) and fprs_id is not null) and activity_id is not null) (type: boolean) Statistics: Num rows: 182899765 Data size: 29263962400 Basic stats: COMPLETE Column stats: NONE Select Operator expressions: time_key_num (type: bigint), activity_id (type: bigint), plan_id (type: bigint), fprs_id (type: bigint), ssn_id (type: bigint), account_id (type: bigint), driver_pct (type: decimal(28,12)) outputColumnNames: _col0, _col1, _col2, _col3, _col4, _col5, _col6 Statistics: Num rows: 182899765 Data size: 29263962400 Basic stats: COMPLETE Column stats: NONE Map Join Operator condition map: Inner Join 0 to 1 keys: 0 _col0 (type: bigint), _col2 (type: bigint), _col3 (type: bigint), _col1 (type: bigint) 1 _col1 (type: bigint), _col8 (type: bigint), _col12 (type: bigint), _col5 (type: bigint) outputColumnNames: _col4, _col5, _col6, _col7, _col8, _col9, _col10, _col11, _col13, _col14, _col15, _col16, _col17, _col18, _col20, _col21, _col25 input vertices: 1 Map 5 Statistics: Num rows: 201189745 Data size: 32190359337 Basic stats: COMPLETE Column stats: NONE Select Operator expressions: _col7 (type: bigint), _col8 (type: bigint), _col9 (type: bigint), _col10 (type: bigint), _col11 (type: bigint), _col21 (type: string), _col13 (type: bigint), _col14 (type: bigint), _col15 (type: bigint), _col16 (type: bigint), _col17 (type: bigint), _col18 (type: bigint), _col25 (type: bigint), _col4 (type: bigint), _col5 (type: bigint), (_col20 * _col6) (type: decimal(38,24)) outputColumnNames: _col0, _col1, _col2, _col3, _col4, _col5, _col6, _col7, _col8, _col9, _col10, _col11, _col12, _col13, _col14, _col15 Statistics: Num rows: 201189745 Data size: 32190359337 Basic stats: COMPLETE Column stats: NONE Group By Operator aggregations: sum(_col15) keys: _col0 (type: bigint), _col1 (type: bigint), _col2 (type: bigint), _col3 (type: bigint), _col4 (type: bigint), _col5 (type: string), _col6 (type: bigint), _col7 (type: bigint), _col8 (type: bigint), _col9 (type: bigint), _col10 (type: bigint), _col11 (type: bigint), _col12 (type: bigint), _col13 (type: bigint), _col14 (type: bigint) mode: hash outputColumnNames: _col0, _col1, _col2, _col3, _col4, _col5, _col6, _col7, _col8, _col9, _col10, _col11, _col12, _col13, _col14, _col15 Statistics: Num rows: 201189745 Data size: 32190359337 Basic stats: COMPLETE Column stats: NONE Reduce Output Operator key expressions: _col0 (type: bigint), _col1 (type: bigint), _col2 (type: bigint), _col3 (type: bigint), _col4 (type: bigint), _col5 (type: string), _col6 (type: bigint), _col7 (type: bigint), _col8 (type: bigint), _col9 (type: bigint), _col10 (type: bigint), _col11 (type: bigint), _col12 (type: bigint), _col13 (type: bigint), _col14 (type: bigint) sort order: +++++++++++++++ Map-reduce partition columns: _col0 (type: bigint), _col1 (type: bigint), _col2 (type: bigint), _col3 (type: bigint), _col4 (type: bigint), _col5 (type: string), _col6 (type: bigint), _col7 (type: bigint), _col8 (type: bigint), _col9 (type: bigint), _col10 (type: bigint), _col11 (type: bigint), _col12 (type: bigint), _col13 (type: bigint), _col14 (type: bigint) Statistics: Num rows: 201189745 Data size: 32190359337 Basic stats: COMPLETE Column stats: NONE value expressions: _col15 (type: decimal(38,24)) Execution mode: vectorized Map 3 Map Operator Tree: TableScan alias: balance filterExpr: (((activity_id is not null and fprs_id is not null) and time_key_num is not null) and plan_id is not null) (type: boolean) Statistics: Num rows: 6870067 Data size: 2102240502 Basic stats: COMPLETE Column stats: NONE Filter Operator predicate: (((activity_id is not null and fprs_id is not null) and time_key_num is not null) and plan_id is not null) (type: boolean) Statistics: Num rows: 429380 Data size: 131390279 Basic stats: COMPLETE Column stats: NONE Select Operator expressions: scale (type: bigint), time_key_num (type: bigint), dataset_code (type: bigint), cost_center_lvl1_id (type: bigint), cost_pool_lvl6_id (type: bigint), activity_id (type: bigint), view_lvl1_id (type: bigint), from_lvl1_id (type: bigint), plan_id (type: bigint), client_id (type: bigint), lob_id (type: bigint), product_id (type: bigint), fprs_id (type: bigint), mtd_balance (type: decimal(28,12)) outputColumnNames: _col0, _col1, _col2, _col3, _col4, _col5, _col6, _col7, _col8, _col9, _col10, _col11, _col12, _col13 Statistics: Num rows: 429380 Data size: 131390279 Basic stats: COMPLETE Column stats: NONE Map Join Operator condition map: Inner Join 0 to 1 keys: 0 _col5 (type: bigint) 1 _col3 (type: bigint) outputColumnNames: _col0, _col1, _col2, _col3, _col4, _col5, _col6, _col7, _col8, _col9, _col10, _col11, _col12, _col13, _col14 input vertices: 1 Map 4 Statistics: Num rows: 472318 Data size: 144529310 Basic stats: COMPLETE Column stats: NONE Reduce Output Operator key expressions: _col12 (type: bigint) sort order: + Map-reduce partition columns: _col12 (type: bigint) Statistics: Num rows: 472318 Data size: 144529310 Basic stats: COMPLETE Column stats: NONE value expressions: _col0 (type: bigint), _col1 (type: bigint), _col2 (type: bigint), _col3 (type: bigint), _col4 (type: bigint), _col5 (type: bigint), _col6 (type: bigint), _col7 (type: bigint), _col8 (type: bigint), _col9 (type: bigint), _col10 (type: bigint), _col11 (type: bigint), _col13 (type: decimal(28,12)), _col14 (type: string) Execution mode: vectorized Map 4 Map Operator Tree: TableScan alias: act filterExpr: (((current_ind = 'Y') and (rollup_key = 'TOTACT')) and activity_id is not null) (type: boolean) Statistics: Num rows: 40683 Data size: 271025472 Basic stats: COMPLETE Column stats: NONE Filter Operator predicate: (((current_ind = 'Y') and (rollup_key = 'TOTACT')) and activity_id is not null) (type: boolean) Statistics: Num rows: 5085 Data size: 33875685 Basic stats: COMPLETE Column stats: NONE Select Operator expressions: lvl5_id (type: string), activity_id (type: bigint) outputColumnNames: _col0, _col3 Statistics: Num rows: 5085 Data size: 33875685 Basic stats: COMPLETE Column stats: NONE Reduce Output Operator key expressions: _col3 (type: bigint) sort order: + Map-reduce partition columns: _col3 (type: bigint) Statistics: Num rows: 5085 Data size: 33875685 Basic stats: COMPLETE Column stats: NONE value expressions: _col0 (type: string) Execution mode: vectorized Map 5 Map Operator Tree: TableScan alias: fprs filterExpr: ((current_ind = 'Y') and fprs_id is not null) (type: boolean) Statistics: Num rows: 1589540 Data size: 6862044180 Basic stats: COMPLETE Column stats: NONE Filter Operator predicate: ((current_ind = 'Y') and fprs_id is not null) (type: boolean) Statistics: Num rows: 397385 Data size: 1715511045 Basic stats: COMPLETE Column stats: NONE Select Operator expressions: lvl5_id (type: bigint), fprs_id (type: bigint) outputColumnNames: _col0, _col2 Statistics: Num rows: 397385 Data size: 1715511045 Basic stats: COMPLETE Column stats: NONE Map Join Operator condition map: Inner Join 0 to 1 keys: 0 _col12 (type: bigint) 1 _col2 (type: bigint) outputColumnNames: _col0, _col1, _col2, _col3, _col4, _col5, _col6, _col7, _col8, _col9, _col10, _col11, _col12, _col13, _col14, _col18 input vertices: 0 Map 3 Statistics: Num rows: 519549 Data size: 158982244 Basic stats: COMPLETE Column stats: NONE Select Operator expressions: _col0 (type: bigint), _col1 (type: bigint), _col10 (type: bigint), _col11 (type: bigint), _col12 (type: bigint), _col13 (type: decimal(28,12)), _col14 (type: string), _col18 (type: bigint), _col2 (type: bigint), _col3 (type: bigint), _col4 (type: bigint), _col5 (type: bigint), _col6 (type: bigint), _col7 (type: bigint), _col8 (type: bigint), _col9 (type: bigint) outputColumnNames: _col0, _col1, _col10, _col11, _col12, _col13, _col14, _col18, _col2, _col3, _col4, _col5, _col6, _col7, _col8, _col9 Statistics: Num rows: 519549 Data size: 158982244 Basic stats: COMPLETE Column stats: NONE Reduce Output Operator key expressions: _col1 (type: bigint), _col8 (type: bigint), _col12 (type: bigint), _col5 (type: bigint) sort order: ++++ Map-reduce partition columns: _col1 (type: bigint), _col8 (type: bigint), _col12 (type: bigint), _col5 (type: bigint) Statistics: Num rows: 519549 Data size: 158982244 Basic stats: COMPLETE Column stats: NONE value expressions: _col0 (type: bigint), _col2 (type: bigint), _col3 (type: bigint), _col4 (type: bigint), _col6 (type: bigint), _col7 (type: bigint), _col9 (type: bigint), _col10 (type: bigint), _col11 (type: bigint), _col13 (type: decimal(28,12)), _col14 (type: string), _col18 (type: bigint) Execution mode: vectorized Reducer 2 Reduce Operator Tree: Group By Operator aggregations: sum(VALUE._col0) keys: KEY._col0 (type: bigint), KEY._col1 (type: bigint), KEY._col2 (type: bigint), KEY._col3 (type: bigint), KEY._col4 (type: bigint), KEY._col5 (type: string), KEY._col6 (type: bigint), KEY._col7 (type: bigint), KEY._col8 (type: bigint), KEY._col9 (type: bigint), KEY._col10 (type: bigint), KEY._col11 (type: bigint), KEY._col12 (type: bigint), KEY._col13 (type: bigint), KEY._col14 (type: bigint) mode: mergepartial outputColumnNames: _col0, _col1, _col2, _col3, _col4, _col5, _col6, _col7, _col8, _col9, _col10, _col11, _col12, _col13, _col14, _col15 Statistics: Num rows: 100594872 Data size: 16095179588 Basic stats: COMPLETE Column stats: NONE Select Operator expressions: _col0 (type: bigint), _col1 (type: bigint), _col2 (type: bigint), _col3 (type: bigint), _col4 (type: bigint), UDFToLong(_col5) (type: bigint), _col6 (type: bigint), _col7 (type: bigint), _col8 (type: bigint), _col9 (type: bigint), _col10 (type: bigint), _col11 (type: bigint), _col12 (type: bigint), _col13 (type: bigint), UDFToString(_col14) (type: string), CAST( _col15 AS decimal(28,12)) (type: decimal(28,12)) outputColumnNames: _col0, _col1, _col2, _col3, _col4, _col5, _col6, _col7, _col8, _col9, _col10, _col11, _col12, _col13, _col14, _col15 Statistics: Num rows: 100594872 Data size: 16095179588 Basic stats: COMPLETE Column stats: NONE File Output Operator compressed: false Statistics: Num rows: 100594872 Data size: 16095179588 Basic stats: COMPLETE Column stats: NONE table: input format: org.apache.hadoop.hive.ql.io.orc.OrcInputFormat output format: org.apache.hadoop.hive.ql.io.orc.OrcOutputFormat serde: org.apache.hadoop.hive.ql.io.orc.OrcSerde name: fidelity.fcap_agg_part_exp_hive_decimal_decimal Stage: Stage-2 Dependency Collection Stage: Stage-0 Move Operator tables: replace: false table: input format: org.apache.hadoop.hive.ql.io.orc.OrcInputFormat output format: org.apache.hadoop.hive.ql.io.orc.OrcOutputFormat serde: org.apache.hadoop.hive.ql.io.orc.OrcSerde name: fidelity.fcap_agg_part_exp_hive_decimal_decimal Stage: Stage-3 Stats-Aggr Operator {code} Query used {code} explain insert INTO table fidelity.FCAP_AGG_PART_EXP_hive_decimal_decimal SELECT BALANCE.SCALE, BALANCE.TIME_KEY_NUM, BALANCE.DATASET_CODE, BALANCE.Cost_Center_LVL1_ID, BALANCE.Cost_Pool_LVL6_ID, act.lvl5_ID, BALANCE.VIEW_LVL1_ID, BALANCE.FROM_LVL1_ID, BALANCE.PLAN_ID, BALANCE.CLIENT_ID, BALANCE.LOB_ID, BALANCE.PRODUCT_ID, fprs.lvl5_ID fprs_lvl5_id, DRIVER.SSN_ID, DRIVER.ACCOUNT_ID, SUM ( BALANCE.MTD_BALANCE * DRIVER.DRIVER_PCT) AS MTD_BALANCE FROM fidelity.FCAP_AGG_PROD_EXP_NOFUND_decimal BALANCE JOIN fidelity.fcap_drivers_part_exp_inter_bucket_256 DRIVER ON BALANCE.TIME_KEY_NUM = DRIVER.TIME_KEY_NUM AND BALANCE.PLAN_ID = DRIVER.PLAN_ID AND BALANCE.FPRS_ID = DRIVER.FPRS_ID AND BALANCE.ACTIVITY_ID = DRIVER.ACTIVITY_ID INNER JOIN fidelity.fobi_fprs_dim_mv_orc fprs ON balance.FPRS_ID = fprs.fprs_id AND fprs.current_ind = 'Y' inner join fidelity.fobi_activity_dim_mv act ON BALANCE.activity_id = act.activity_id AND act.current_ind = 'Y' AND act.ROLLUP_KEY = 'TOTACT' GROUP BY BALANCE.SCALE, BALANCE.TIME_KEY_NUM, BALANCE.DATASET_CODE, BALANCE.Cost_center_LVL1_ID, BALANCE.Cost_Pool_LVL6_ID, act.lvl5_ID, BALANCE.VIEW_LVL1_ID, BALANCE.FROM_LVL1_ID, BALANCE.PLAN_ID, BALANCE.CLIENT_ID, BALANCE.LOB_ID, BALANCE.PRODUCT_ID, fprs.lvl5_ID, DRIVER.SSN_ID, DRIVER.ACCOUNT_ID; {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)