Mostafa Mokhtar created HIVE-9368:
-------------------------------------

             Summary: Physical optimizer : Join order in Explain is different 
from join order provided by Calcite
                 Key: HIVE-9368
                 URL: https://issues.apache.org/jira/browse/HIVE-9368
             Project: Hive
          Issue Type: Bug
          Components: Physical Optimizer
    Affects Versions: 0.14.0
            Reporter: Mostafa Mokhtar
            Assignee: Vikram Dixit K
             Fix For: 0.15.0


Join order in explain is different from that provided by Calcite, this was 
observed during the Fidelity POC. 

Logical plan from Calcite :
{code}
2015-01-13 18:54:42,892 DEBUG [main]: parse.CalcitePlanner 
(CalcitePlanner.java:apply(743)) - Plan After Join Reordering:
HiveProject(scale=[$0], time_key_num=[$1], dataset_code=[$2], 
cost_center_lvl1_id=[$3], cost_pool_lvl6_id=[$4], lvl5_id=[$5], 
view_lvl1_id=[$6], from_lvl1_id=[$7], plan_id=[$8], client_id=[$9], 
lob_id=[$10], product_id=[$11], fprs_lvl5_id=[$12], ssn_id=[$13], 
account_id=[$14], mtd_balance=[$15]): rowcount = 2.53152774E8, cumulative cost 
= {3.057177094767754E9 rows, 0.0 cpu, 0.0 io}, id = 636
  HiveAggregate(group=[{0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14}], 
agg#0=[SUM($15)]): rowcount = 2.53152774E8, cumulative cost = 
{3.057177094767754E9 rows, 0.0 cpu, 0.0 io}, id = 634
    HiveProject($f0=[$0], $f1=[$1], $f2=[$2], $f3=[$3], $f4=[$4], $f5=[$24], 
$f6=[$6], $f7=[$7], $f8=[$8], $f9=[$9], $f10=[$10], $f11=[$11], $f12=[$21], 
$f13=[$18], $f14=[$19], $f15=[*($13, $20)]): rowcount = 3.401053197411791E11, 
cumulative cost = {3.057177094767754E9 rows, 0.0 cpu, 0.0 io}, id = 632
      HiveProject(scale=[$7], time_key_num=[$8], dataset_code=[$9], 
cost_center_lvl1_id=[$10], cost_pool_lvl6_id=[$11], activity_id=[$12], 
view_lvl1_id=[$13], from_lvl1_id=[$14], plan_id=[$15], client_id=[$16], 
lob_id=[$17], product_id=[$18], fprs_id=[$19], mtd_balance=[$20], 
time_key_num0=[$0], activity_id0=[$1], plan_id0=[$2], fprs_id0=[$3], 
ssn_id=[$4], account_id=[$5], driver_pct=[$6], lvl5_id=[$25], 
current_ind=[$26], fprs_id1=[$27], lvl5_id0=[$21], rollup_key=[$22], 
current_ind0=[$23], activity_id1=[$24]): rowcount = 3.401053197411791E11, 
cumulative cost = {3.057177094767754E9 rows, 0.0 cpu, 0.0 io}, id = 692
        HiveJoin(condition=[AND(AND(AND(=($8, $0), =($15, $2)), =($19, $3)), 
=($12, $1))], joinType=[inner]): rowcount = 3.401053197411791E11, cumulative 
cost = {3.057177094767754E9 rows, 0.0 cpu, 0.0 io}, id = 690
          HiveProject(time_key_num=[$0], activity_id=[$1], plan_id=[$2], 
fprs_id=[$3], ssn_id=[$4], account_id=[$5], driver_pct=[$6]): rowcount = 
2.926396239E9, cumulative cost = {0.0 rows, 0.0 cpu, 0.0 io}, id = 590
            
HiveTableScan(table=[[fidelity.fcap_drivers_part_exp_inter_bucket_256]]): 
rowcount = 2.926396239E9, cumulative cost = {0}, id = 465
          HiveJoin(condition=[=($12, $20)], joinType=[inner]): rowcount = 
1.0871372980143067E8, cumulative cost = {2.2067125966323376E7 rows, 0.0 cpu, 
0.0 io}, id = 688
            HiveJoin(condition=[=($5, $17)], joinType=[inner]): rowcount = 
1.4392118216323378E7, cumulative cost = {6880237.75 rows, 0.0 cpu, 0.0 io}, id 
= 653
              HiveProject(scale=[$0], time_key_num=[$1], dataset_code=[$2], 
cost_center_lvl1_id=[$3], cost_pool_lvl6_id=[$4], activity_id=[$5], 
view_lvl1_id=[$6], from_lvl1_id=[$7], plan_id=[$8], client_id=[$9], 
lob_id=[$10], product_id=[$11], fprs_id=[$12], mtd_balance=[$14]): rowcount = 
6870067.0, cumulative cost = {0.0 rows, 0.0 cpu, 0.0 io}, id = 587
                
HiveTableScan(table=[[fidelity.fcap_agg_prod_exp_nofund_decimal]]): rowcount = 
6870067.0, cumulative cost = {0}, id = 464
              HiveProject(lvl5_id=[$36], rollup_key=[$48], current_ind=[$51], 
activity_id=[$60]): rowcount = 10170.75, cumulative cost = {0.0 rows, 0.0 cpu, 
0.0 io}, id = 628
                HiveFilter(condition=[AND(=($51, 'Y'), =($48, 'TOTACT'))]): 
rowcount = 10170.75, cumulative cost = {0.0 rows, 0.0 cpu, 0.0 io}, id = 626
                  HiveTableScan(table=[[fidelity.fobi_activity_dim_mv]]): 
rowcount = 40683.0, cumulative cost = {0}, id = 467
            HiveProject(lvl5_id=[$36], current_ind=[$51], fprs_id=[$58]): 
rowcount = 794770.0, cumulative cost = {0.0 rows, 0.0 cpu, 0.0 io}, id = 622
              HiveFilter(condition=[=($51, 'Y')]): rowcount = 794770.0, 
cumulative cost = {0.0 rows, 0.0 cpu, 0.0 io}, id = 620
                HiveTableScan(table=[[fidelity.fobi_fprs_dim_mv_orc]]): 
rowcount = 1589540.0, cumulative cost = {0}, id = 466
{code}

Plan #1 with Fetch column stats on 
{code}
OK
STAGE DEPENDENCIES:
  Stage-1 is a root stage
  Stage-2 depends on stages: Stage-1
  Stage-0 depends on stages: Stage-2
  Stage-3 depends on stages: Stage-0

STAGE PLANS:
  Stage: Stage-1
    Tez
      Edges:
        Map 4 <- Map 5 (BROADCAST_EDGE), Map 6 (BROADCAST_EDGE)
        Reducer 2 <- Map 1 (SIMPLE_EDGE), Map 4 (SIMPLE_EDGE)
        Reducer 3 <- Reducer 2 (SIMPLE_EDGE)
      DagName: mmokhtar_20150113185454_d7ce6ecf-2d50-45ed-8a88-6283bb091b0e:3
      Vertices:
        Map 1
            Map Operator Tree:
                TableScan
                  alias: driver
                  filterExpr: (((time_key_num is not null and plan_id is not 
null) and fprs_id is not null) and activity_id is not null) (type: boolean)
                  Statistics: Num rows: 2926396239 Data size: 468223398240 
Basic stats: COMPLETE Column stats: COMPLETE
                  Filter Operator
                    predicate: (((time_key_num is not null and plan_id is not 
null) and fprs_id is not null) and activity_id is not null) (type: boolean)
                    Statistics: Num rows: 2926396239 Data size: 468223398240 
Basic stats: COMPLETE Column stats: COMPLETE
                    Select Operator
                      expressions: time_key_num (type: bigint), activity_id 
(type: bigint), plan_id (type: bigint), fprs_id (type: bigint), ssn_id (type: 
bigint), account_id (type: bigint), driver_pct (type: decimal(28,12))
                      outputColumnNames: _col0, _col1, _col2, _col3, _col4, 
_col5, _col6
                      Statistics: Num rows: 2926396239 Data size: 468223398240 
Basic stats: COMPLETE Column stats: COMPLETE
                      Reduce Output Operator
                        key expressions: _col0 (type: bigint), _col2 (type: 
bigint), _col3 (type: bigint), _col1 (type: bigint)
                        sort order: ++++
                        Map-reduce partition columns: _col0 (type: bigint), 
_col2 (type: bigint), _col3 (type: bigint), _col1 (type: bigint)
                        Statistics: Num rows: 2926396239 Data size: 
468223398240 Basic stats: COMPLETE Column stats: COMPLETE
                        value expressions: _col4 (type: bigint), _col5 (type: 
bigint), _col6 (type: decimal(28,12))
            Execution mode: vectorized
        Map 4
            Map Operator Tree:
                TableScan
                  alias: balance
                  filterExpr: (((activity_id is not null and fprs_id is not 
null) and time_key_num is not null) and plan_id is not null) (type: boolean)
                  Statistics: Num rows: 6870067 Data size: 2102240502 Basic 
stats: COMPLETE Column stats: COMPLETE
                  Filter Operator
                    predicate: (((activity_id is not null and fprs_id is not 
null) and time_key_num is not null) and plan_id is not null) (type: boolean)
                    Statistics: Num rows: 6870067 Data size: 1483934472 Basic 
stats: COMPLETE Column stats: COMPLETE
                    Select Operator
                      expressions: scale (type: bigint), time_key_num (type: 
bigint), dataset_code (type: bigint), cost_center_lvl1_id (type: bigint), 
cost_pool_lvl6_id (type: bigint), activity_id (type: bigint), view_lvl1_id 
(type: bigint), from_lvl1_id (type: bigint), plan_id (type: bigint), client_id 
(type: bigint), lob_id (type: bigint), product_id (type: bigint), fprs_id 
(type: bigint), mtd_balance (type: decimal(28,12))
                      outputColumnNames: _col0, _col1, _col2, _col3, _col4, 
_col5, _col6, _col7, _col8, _col9, _col10, _col11, _col12, _col13
                      Statistics: Num rows: 6870067 Data size: 1483934472 Basic 
stats: COMPLETE Column stats: COMPLETE
                      Map Join Operator
                        condition map:
                             Inner Join 0 to 1
                        keys:
                          0 _col5 (type: bigint)
                          1 _col3 (type: bigint)
                        outputColumnNames: _col0, _col1, _col2, _col3, _col4, 
_col5, _col6, _col7, _col8, _col9, _col10, _col11, _col12, _col13, _col14
                        input vertices:
                          1 Map 5
                        Statistics: Num rows: 57555202 Data size: 17151450196 
Basic stats: COMPLETE Column stats: COMPLETE
                        Map Join Operator
                          condition map:
                               Inner Join 0 to 1
                          keys:
                            0 _col12 (type: bigint)
                            1 _col2 (type: bigint)
                          outputColumnNames: _col0, _col1, _col2, _col3, _col4, 
_col5, _col6, _col7, _col8, _col9, _col10, _col11, _col12, _col13, _col14, 
_col18
                          input vertices:
                            1 Map 6
                          Statistics: Num rows: 869509350 Data size: 
266069861100 Basic stats: COMPLETE Column stats: COMPLETE
                          Select Operator
                            expressions: _col0 (type: bigint), _col1 (type: 
bigint), _col10 (type: bigint), _col11 (type: bigint), _col12 (type: bigint), 
_col13 (type: decimal(28,12)), _col14 (type: string), _col18 (type: bigint), 
_col2 (type: bigint), _col3 (type: bigint), _col4 (type: bigint), _col5 (type: 
bigint), _col6 (type: bigint), _col7 (type: bigint), _col8 (type: bigint), 
_col9 (type: bigint)
                            outputColumnNames: _col0, _col1, _col10, _col11, 
_col12, _col13, _col14, _col18, _col2, _col3, _col4, _col5, _col6, _col7, 
_col8, _col9
                            Statistics: Num rows: 869509350 Data size: 
266069861100 Basic stats: COMPLETE Column stats: COMPLETE
                            Reduce Output Operator
                              key expressions: _col1 (type: bigint), _col8 
(type: bigint), _col12 (type: bigint), _col5 (type: bigint)
                              sort order: ++++
                              Map-reduce partition columns: _col1 (type: 
bigint), _col8 (type: bigint), _col12 (type: bigint), _col5 (type: bigint)
                              Statistics: Num rows: 869509350 Data size: 
266069861100 Basic stats: COMPLETE Column stats: COMPLETE
                              value expressions: _col0 (type: bigint), _col2 
(type: bigint), _col3 (type: bigint), _col4 (type: bigint), _col6 (type: 
bigint), _col7 (type: bigint), _col9 (type: bigint), _col10 (type: bigint), 
_col11 (type: bigint), _col13 (type: decimal(28,12)), _col14 (type: string), 
_col18 (type: bigint)
            Execution mode: vectorized
        Map 5
            Map Operator Tree:
                TableScan
                  alias: act
                  filterExpr: (((current_ind = 'Y') and (rollup_key = 
'TOTACT')) and activity_id is not null) (type: boolean)
                  Statistics: Num rows: 40683 Data size: 271025472 Basic stats: 
COMPLETE Column stats: COMPLETE
                  Filter Operator
                    predicate: (((current_ind = 'Y') and (rollup_key = 
'TOTACT')) and activity_id is not null) (type: boolean)
                    Statistics: Num rows: 20341 Data size: 5553093 Basic stats: 
COMPLETE Column stats: COMPLETE
                    Select Operator
                      expressions: lvl5_id (type: string), activity_id (type: 
bigint)
                      outputColumnNames: _col0, _col3
                      Statistics: Num rows: 20341 Data size: 1993418 Basic 
stats: COMPLETE Column stats: COMPLETE
                      Reduce Output Operator
                        key expressions: _col3 (type: bigint)
                        sort order: +
                        Map-reduce partition columns: _col3 (type: bigint)
                        Statistics: Num rows: 20341 Data size: 1993418 Basic 
stats: COMPLETE Column stats: COMPLETE
                        value expressions: _col0 (type: string)
            Execution mode: vectorized
        Map 6
            Map Operator Tree:
                TableScan
                  alias: fprs
                  filterExpr: ((current_ind = 'Y') and fprs_id is not null) 
(type: boolean)
                  Statistics: Num rows: 1589540 Data size: 6862044180 Basic 
stats: COMPLETE Column stats: COMPLETE
                  Filter Operator
                    predicate: ((current_ind = 'Y') and fprs_id is not null) 
(type: boolean)
                    Statistics: Num rows: 794770 Data size: 80271770 Basic 
stats: COMPLETE Column stats: COMPLETE
                    Select Operator
                      expressions: lvl5_id (type: bigint), fprs_id (type: 
bigint)
                      outputColumnNames: _col0, _col2
                      Statistics: Num rows: 794770 Data size: 12716320 Basic 
stats: COMPLETE Column stats: COMPLETE
                      Reduce Output Operator
                        key expressions: _col2 (type: bigint)
                        sort order: +
                        Map-reduce partition columns: _col2 (type: bigint)
                        Statistics: Num rows: 794770 Data size: 12716320 Basic 
stats: COMPLETE Column stats: COMPLETE
                        value expressions: _col0 (type: bigint)
            Execution mode: vectorized
        Reducer 2
            Reduce Operator Tree:
              Merge Join Operator
                condition map:
                     Inner Join 0 to 1
                keys:
                  0 _col0 (type: bigint), _col2 (type: bigint), _col3 (type: 
bigint), _col1 (type: bigint)
                  1 _col1 (type: bigint), _col8 (type: bigint), _col12 (type: 
bigint), _col5 (type: bigint)
                outputColumnNames: _col4, _col5, _col6, _col7, _col8, _col9, 
_col10, _col11, _col13, _col14, _col15, _col16, _col17, _col18, _col20, _col21, 
_col25
                Statistics: Num rows: 28151027141 Data size: 8389006088018 
Basic stats: COMPLETE Column stats: COMPLETE
                Select Operator
                  expressions: _col7 (type: bigint), _col8 (type: bigint), 
_col9 (type: bigint), _col10 (type: bigint), _col11 (type: bigint), _col21 
(type: string), _col13 (type: bigint), _col14 (type: bigint), _col15 (type: 
bigint), _col16 (type: bigint), _col17 (type: bigint), _col18 (type: bigint), 
_col25 (type: bigint), _col4 (type: bigint), _col5 (type: bigint), (_col20 * 
_col6) (type: decimal(38,24))
                  outputColumnNames: _col0, _col1, _col2, _col3, _col4, _col5, 
_col6, _col7, _col8, _col9, _col10, _col11, _col12, _col13, _col14, _col15
                  Statistics: Num rows: 28151027141 Data size: 8389006088018 
Basic stats: COMPLETE Column stats: COMPLETE
                  Group By Operator
                    aggregations: sum(_col15)
                    keys: _col0 (type: bigint), _col1 (type: bigint), _col2 
(type: bigint), _col3 (type: bigint), _col4 (type: bigint), _col5 (type: 
string), _col6 (type: bigint), _col7 (type: bigint), _col8 (type: bigint), 
_col9 (type: bigint), _col10 (type: bigint), _col11 (type: bigint), _col12 
(type: bigint), _col13 (type: bigint), _col14 (type: bigint)
                    mode: hash
                    outputColumnNames: _col0, _col1, _col2, _col3, _col4, 
_col5, _col6, _col7, _col8, _col9, _col10, _col11, _col12, _col13, _col14, 
_col15
                    Statistics: Num rows: 28151027141 Data size: 8389006088018 
Basic stats: COMPLETE Column stats: COMPLETE
                    Reduce Output Operator
                      key expressions: _col0 (type: bigint), _col1 (type: 
bigint), _col2 (type: bigint), _col3 (type: bigint), _col4 (type: bigint), 
_col5 (type: string), _col6 (type: bigint), _col7 (type: bigint), _col8 (type: 
bigint), _col9 (type: bigint), _col10 (type: bigint), _col11 (type: bigint), 
_col12 (type: bigint), _col13 (type: bigint), _col14 (type: bigint)
                      sort order: +++++++++++++++
                      Map-reduce partition columns: _col0 (type: bigint), _col1 
(type: bigint), _col2 (type: bigint), _col3 (type: bigint), _col4 (type: 
bigint), _col5 (type: string), _col6 (type: bigint), _col7 (type: bigint), 
_col8 (type: bigint), _col9 (type: bigint), _col10 (type: bigint), _col11 
(type: bigint), _col12 (type: bigint), _col13 (type: bigint), _col14 (type: 
bigint)
                      Statistics: Num rows: 28151027141 Data size: 
8389006088018 Basic stats: COMPLETE Column stats: COMPLETE
                      value expressions: _col15 (type: decimal(38,24))
        Reducer 3
            Reduce Operator Tree:
              Group By Operator
                aggregations: sum(VALUE._col0)
                keys: KEY._col0 (type: bigint), KEY._col1 (type: bigint), 
KEY._col2 (type: bigint), KEY._col3 (type: bigint), KEY._col4 (type: bigint), 
KEY._col5 (type: string), KEY._col6 (type: bigint), KEY._col7 (type: bigint), 
KEY._col8 (type: bigint), KEY._col9 (type: bigint), KEY._col10 (type: bigint), 
KEY._col11 (type: bigint), KEY._col12 (type: bigint), KEY._col13 (type: 
bigint), KEY._col14 (type: bigint)
                mode: mergepartial
                outputColumnNames: _col0, _col1, _col2, _col3, _col4, _col5, 
_col6, _col7, _col8, _col9, _col10, _col11, _col12, _col13, _col14, _col15
                Statistics: Num rows: 28151027141 Data size: 8389006088018 
Basic stats: COMPLETE Column stats: COMPLETE
                Select Operator
                  expressions: _col0 (type: bigint), _col1 (type: bigint), 
_col2 (type: bigint), _col3 (type: bigint), _col4 (type: bigint), 
UDFToLong(_col5) (type: bigint), _col6 (type: bigint), _col7 (type: bigint), 
_col8 (type: bigint), _col9 (type: bigint), _col10 (type: bigint), _col11 
(type: bigint), _col12 (type: bigint), _col13 (type: bigint), 
UDFToString(_col14) (type: string), CAST( _col15 AS decimal(28,12)) (type: 
decimal(28,12))
                  outputColumnNames: _col0, _col1, _col2, _col3, _col4, _col5, 
_col6, _col7, _col8, _col9, _col10, _col11, _col12, _col13, _col14, _col15
                  Statistics: Num rows: 28151027141 Data size: 11260410856400 
Basic stats: COMPLETE Column stats: COMPLETE
                  File Output Operator
                    compressed: false
                    Statistics: Num rows: 28151027141 Data size: 11260410856400 
Basic stats: COMPLETE Column stats: COMPLETE
                    table:
                        input format: 
org.apache.hadoop.hive.ql.io.orc.OrcInputFormat
                        output format: 
org.apache.hadoop.hive.ql.io.orc.OrcOutputFormat
                        serde: org.apache.hadoop.hive.ql.io.orc.OrcSerde
                        name: fidelity.fcap_agg_part_exp_hive_decimal_decimal

  Stage: Stage-2
    Dependency Collection

  Stage: Stage-0
    Move Operator
      tables:
          replace: false
          table:
              input format: org.apache.hadoop.hive.ql.io.orc.OrcInputFormat
              output format: org.apache.hadoop.hive.ql.io.orc.OrcOutputFormat
              serde: org.apache.hadoop.hive.ql.io.orc.OrcSerde
              name: fidelity.fcap_agg_part_exp_hive_decimal_decimal

  Stage: Stage-3
    Stats-Aggr Operator
{code}

Plan #2 with fetch column stats off 
{code}STAGE DEPENDENCIES:
  Stage-1 is a root stage
  Stage-2 depends on stages: Stage-1
  Stage-0 depends on stages: Stage-2
  Stage-3 depends on stages: Stage-0

STAGE PLANS:
  Stage: Stage-1
    Tez
      Edges:
        Map 1 <- Map 5 (BROADCAST_EDGE)
        Map 3 <- Map 4 (BROADCAST_EDGE)
        Map 5 <- Map 3 (BROADCAST_EDGE)
        Reducer 2 <- Map 1 (SIMPLE_EDGE)
      DagName: mmokhtar_20150113185454_7c350b7b-53fa-4bbb-a747-24ec0c2a6ab7:4
      Vertices:
        Map 1
            Map Operator Tree:
                TableScan
                  alias: driver
                  filterExpr: (((time_key_num is not null and plan_id is not 
null) and fprs_id is not null) and activity_id is not null) (type: boolean)
                  Statistics: Num rows: 2926396239 Data size: 468223398240 
Basic stats: COMPLETE Column stats: NONE
                  Filter Operator
                    predicate: (((time_key_num is not null and plan_id is not 
null) and fprs_id is not null) and activity_id is not null) (type: boolean)
                    Statistics: Num rows: 182899765 Data size: 29263962400 
Basic stats: COMPLETE Column stats: NONE
                    Select Operator
                      expressions: time_key_num (type: bigint), activity_id 
(type: bigint), plan_id (type: bigint), fprs_id (type: bigint), ssn_id (type: 
bigint), account_id (type: bigint), driver_pct (type: decimal(28,12))
                      outputColumnNames: _col0, _col1, _col2, _col3, _col4, 
_col5, _col6
                      Statistics: Num rows: 182899765 Data size: 29263962400 
Basic stats: COMPLETE Column stats: NONE
                      Map Join Operator
                        condition map:
                             Inner Join 0 to 1
                        keys:
                          0 _col0 (type: bigint), _col2 (type: bigint), _col3 
(type: bigint), _col1 (type: bigint)
                          1 _col1 (type: bigint), _col8 (type: bigint), _col12 
(type: bigint), _col5 (type: bigint)
                        outputColumnNames: _col4, _col5, _col6, _col7, _col8, 
_col9, _col10, _col11, _col13, _col14, _col15, _col16, _col17, _col18, _col20, 
_col21, _col25
                        input vertices:
                          1 Map 5
                        Statistics: Num rows: 201189745 Data size: 32190359337 
Basic stats: COMPLETE Column stats: NONE
                        Select Operator
                          expressions: _col7 (type: bigint), _col8 (type: 
bigint), _col9 (type: bigint), _col10 (type: bigint), _col11 (type: bigint), 
_col21 (type: string), _col13 (type: bigint), _col14 (type: bigint), _col15 
(type: bigint), _col16 (type: bigint), _col17 (type: bigint), _col18 (type: 
bigint), _col25 (type: bigint), _col4 (type: bigint), _col5 (type: bigint), 
(_col20 * _col6) (type: decimal(38,24))
                          outputColumnNames: _col0, _col1, _col2, _col3, _col4, 
_col5, _col6, _col7, _col8, _col9, _col10, _col11, _col12, _col13, _col14, 
_col15
                          Statistics: Num rows: 201189745 Data size: 
32190359337 Basic stats: COMPLETE Column stats: NONE
                          Group By Operator
                            aggregations: sum(_col15)
                            keys: _col0 (type: bigint), _col1 (type: bigint), 
_col2 (type: bigint), _col3 (type: bigint), _col4 (type: bigint), _col5 (type: 
string), _col6 (type: bigint), _col7 (type: bigint), _col8 (type: bigint), 
_col9 (type: bigint), _col10 (type: bigint), _col11 (type: bigint), _col12 
(type: bigint), _col13 (type: bigint), _col14 (type: bigint)
                            mode: hash
                            outputColumnNames: _col0, _col1, _col2, _col3, 
_col4, _col5, _col6, _col7, _col8, _col9, _col10, _col11, _col12, _col13, 
_col14, _col15
                            Statistics: Num rows: 201189745 Data size: 
32190359337 Basic stats: COMPLETE Column stats: NONE
                            Reduce Output Operator
                              key expressions: _col0 (type: bigint), _col1 
(type: bigint), _col2 (type: bigint), _col3 (type: bigint), _col4 (type: 
bigint), _col5 (type: string), _col6 (type: bigint), _col7 (type: bigint), 
_col8 (type: bigint), _col9 (type: bigint), _col10 (type: bigint), _col11 
(type: bigint), _col12 (type: bigint), _col13 (type: bigint), _col14 (type: 
bigint)
                              sort order: +++++++++++++++
                              Map-reduce partition columns: _col0 (type: 
bigint), _col1 (type: bigint), _col2 (type: bigint), _col3 (type: bigint), 
_col4 (type: bigint), _col5 (type: string), _col6 (type: bigint), _col7 (type: 
bigint), _col8 (type: bigint), _col9 (type: bigint), _col10 (type: bigint), 
_col11 (type: bigint), _col12 (type: bigint), _col13 (type: bigint), _col14 
(type: bigint)
                              Statistics: Num rows: 201189745 Data size: 
32190359337 Basic stats: COMPLETE Column stats: NONE
                              value expressions: _col15 (type: decimal(38,24))
            Execution mode: vectorized
        Map 3
            Map Operator Tree:
                TableScan
                  alias: balance
                  filterExpr: (((activity_id is not null and fprs_id is not 
null) and time_key_num is not null) and plan_id is not null) (type: boolean)
                  Statistics: Num rows: 6870067 Data size: 2102240502 Basic 
stats: COMPLETE Column stats: NONE
                  Filter Operator
                    predicate: (((activity_id is not null and fprs_id is not 
null) and time_key_num is not null) and plan_id is not null) (type: boolean)
                    Statistics: Num rows: 429380 Data size: 131390279 Basic 
stats: COMPLETE Column stats: NONE
                    Select Operator
                      expressions: scale (type: bigint), time_key_num (type: 
bigint), dataset_code (type: bigint), cost_center_lvl1_id (type: bigint), 
cost_pool_lvl6_id (type: bigint), activity_id (type: bigint), view_lvl1_id 
(type: bigint), from_lvl1_id (type: bigint), plan_id (type: bigint), client_id 
(type: bigint), lob_id (type: bigint), product_id (type: bigint), fprs_id 
(type: bigint), mtd_balance (type: decimal(28,12))
                      outputColumnNames: _col0, _col1, _col2, _col3, _col4, 
_col5, _col6, _col7, _col8, _col9, _col10, _col11, _col12, _col13
                      Statistics: Num rows: 429380 Data size: 131390279 Basic 
stats: COMPLETE Column stats: NONE
                      Map Join Operator
                        condition map:
                             Inner Join 0 to 1
                        keys:
                          0 _col5 (type: bigint)
                          1 _col3 (type: bigint)
                        outputColumnNames: _col0, _col1, _col2, _col3, _col4, 
_col5, _col6, _col7, _col8, _col9, _col10, _col11, _col12, _col13, _col14
                        input vertices:
                          1 Map 4
                        Statistics: Num rows: 472318 Data size: 144529310 Basic 
stats: COMPLETE Column stats: NONE
                        Reduce Output Operator
                          key expressions: _col12 (type: bigint)
                          sort order: +
                          Map-reduce partition columns: _col12 (type: bigint)
                          Statistics: Num rows: 472318 Data size: 144529310 
Basic stats: COMPLETE Column stats: NONE
                          value expressions: _col0 (type: bigint), _col1 (type: 
bigint), _col2 (type: bigint), _col3 (type: bigint), _col4 (type: bigint), 
_col5 (type: bigint), _col6 (type: bigint), _col7 (type: bigint), _col8 (type: 
bigint), _col9 (type: bigint), _col10 (type: bigint), _col11 (type: bigint), 
_col13 (type: decimal(28,12)), _col14 (type: string)
            Execution mode: vectorized
        Map 4
            Map Operator Tree:
                TableScan
                  alias: act
                  filterExpr: (((current_ind = 'Y') and (rollup_key = 
'TOTACT')) and activity_id is not null) (type: boolean)
                  Statistics: Num rows: 40683 Data size: 271025472 Basic stats: 
COMPLETE Column stats: NONE
                  Filter Operator
                    predicate: (((current_ind = 'Y') and (rollup_key = 
'TOTACT')) and activity_id is not null) (type: boolean)
                    Statistics: Num rows: 5085 Data size: 33875685 Basic stats: 
COMPLETE Column stats: NONE
                    Select Operator
                      expressions: lvl5_id (type: string), activity_id (type: 
bigint)
                      outputColumnNames: _col0, _col3
                      Statistics: Num rows: 5085 Data size: 33875685 Basic 
stats: COMPLETE Column stats: NONE
                      Reduce Output Operator
                        key expressions: _col3 (type: bigint)
                        sort order: +
                        Map-reduce partition columns: _col3 (type: bigint)
                        Statistics: Num rows: 5085 Data size: 33875685 Basic 
stats: COMPLETE Column stats: NONE
                        value expressions: _col0 (type: string)
            Execution mode: vectorized
        Map 5
            Map Operator Tree:
                TableScan
                  alias: fprs
                  filterExpr: ((current_ind = 'Y') and fprs_id is not null) 
(type: boolean)
                  Statistics: Num rows: 1589540 Data size: 6862044180 Basic 
stats: COMPLETE Column stats: NONE
                  Filter Operator
                    predicate: ((current_ind = 'Y') and fprs_id is not null) 
(type: boolean)
                    Statistics: Num rows: 397385 Data size: 1715511045 Basic 
stats: COMPLETE Column stats: NONE
                    Select Operator
                      expressions: lvl5_id (type: bigint), fprs_id (type: 
bigint)
                      outputColumnNames: _col0, _col2
                      Statistics: Num rows: 397385 Data size: 1715511045 Basic 
stats: COMPLETE Column stats: NONE
                      Map Join Operator
                        condition map:
                             Inner Join 0 to 1
                        keys:
                          0 _col12 (type: bigint)
                          1 _col2 (type: bigint)
                        outputColumnNames: _col0, _col1, _col2, _col3, _col4, 
_col5, _col6, _col7, _col8, _col9, _col10, _col11, _col12, _col13, _col14, 
_col18
                        input vertices:
                          0 Map 3
                        Statistics: Num rows: 519549 Data size: 158982244 Basic 
stats: COMPLETE Column stats: NONE
                        Select Operator
                          expressions: _col0 (type: bigint), _col1 (type: 
bigint), _col10 (type: bigint), _col11 (type: bigint), _col12 (type: bigint), 
_col13 (type: decimal(28,12)), _col14 (type: string), _col18 (type: bigint), 
_col2 (type: bigint), _col3 (type: bigint), _col4 (type: bigint), _col5 (type: 
bigint), _col6 (type: bigint), _col7 (type: bigint), _col8 (type: bigint), 
_col9 (type: bigint)
                          outputColumnNames: _col0, _col1, _col10, _col11, 
_col12, _col13, _col14, _col18, _col2, _col3, _col4, _col5, _col6, _col7, 
_col8, _col9
                          Statistics: Num rows: 519549 Data size: 158982244 
Basic stats: COMPLETE Column stats: NONE
                          Reduce Output Operator
                            key expressions: _col1 (type: bigint), _col8 (type: 
bigint), _col12 (type: bigint), _col5 (type: bigint)
                            sort order: ++++
                            Map-reduce partition columns: _col1 (type: bigint), 
_col8 (type: bigint), _col12 (type: bigint), _col5 (type: bigint)
                            Statistics: Num rows: 519549 Data size: 158982244 
Basic stats: COMPLETE Column stats: NONE
                            value expressions: _col0 (type: bigint), _col2 
(type: bigint), _col3 (type: bigint), _col4 (type: bigint), _col6 (type: 
bigint), _col7 (type: bigint), _col9 (type: bigint), _col10 (type: bigint), 
_col11 (type: bigint), _col13 (type: decimal(28,12)), _col14 (type: string), 
_col18 (type: bigint)
            Execution mode: vectorized
        Reducer 2
            Reduce Operator Tree:
              Group By Operator
                aggregations: sum(VALUE._col0)
                keys: KEY._col0 (type: bigint), KEY._col1 (type: bigint), 
KEY._col2 (type: bigint), KEY._col3 (type: bigint), KEY._col4 (type: bigint), 
KEY._col5 (type: string), KEY._col6 (type: bigint), KEY._col7 (type: bigint), 
KEY._col8 (type: bigint), KEY._col9 (type: bigint), KEY._col10 (type: bigint), 
KEY._col11 (type: bigint), KEY._col12 (type: bigint), KEY._col13 (type: 
bigint), KEY._col14 (type: bigint)
                mode: mergepartial
                outputColumnNames: _col0, _col1, _col2, _col3, _col4, _col5, 
_col6, _col7, _col8, _col9, _col10, _col11, _col12, _col13, _col14, _col15
                Statistics: Num rows: 100594872 Data size: 16095179588 Basic 
stats: COMPLETE Column stats: NONE
                Select Operator
                  expressions: _col0 (type: bigint), _col1 (type: bigint), 
_col2 (type: bigint), _col3 (type: bigint), _col4 (type: bigint), 
UDFToLong(_col5) (type: bigint), _col6 (type: bigint), _col7 (type: bigint), 
_col8 (type: bigint), _col9 (type: bigint), _col10 (type: bigint), _col11 
(type: bigint), _col12 (type: bigint), _col13 (type: bigint), 
UDFToString(_col14) (type: string), CAST( _col15 AS decimal(28,12)) (type: 
decimal(28,12))
                  outputColumnNames: _col0, _col1, _col2, _col3, _col4, _col5, 
_col6, _col7, _col8, _col9, _col10, _col11, _col12, _col13, _col14, _col15
                  Statistics: Num rows: 100594872 Data size: 16095179588 Basic 
stats: COMPLETE Column stats: NONE
                  File Output Operator
                    compressed: false
                    Statistics: Num rows: 100594872 Data size: 16095179588 
Basic stats: COMPLETE Column stats: NONE
                    table:
                        input format: 
org.apache.hadoop.hive.ql.io.orc.OrcInputFormat
                        output format: 
org.apache.hadoop.hive.ql.io.orc.OrcOutputFormat
                        serde: org.apache.hadoop.hive.ql.io.orc.OrcSerde
                        name: fidelity.fcap_agg_part_exp_hive_decimal_decimal

  Stage: Stage-2
    Dependency Collection

  Stage: Stage-0
    Move Operator
      tables:
          replace: false
          table:
              input format: org.apache.hadoop.hive.ql.io.orc.OrcInputFormat
              output format: org.apache.hadoop.hive.ql.io.orc.OrcOutputFormat
              serde: org.apache.hadoop.hive.ql.io.orc.OrcSerde
              name: fidelity.fcap_agg_part_exp_hive_decimal_decimal

  Stage: Stage-3
    Stats-Aggr Operator
{code}

Query used 
{code}
explain insert INTO table fidelity.FCAP_AGG_PART_EXP_hive_decimal_decimal 
SELECT BALANCE.SCALE, BALANCE.TIME_KEY_NUM, BALANCE.DATASET_CODE, 
BALANCE.Cost_Center_LVL1_ID, BALANCE.Cost_Pool_LVL6_ID, act.lvl5_ID, 
BALANCE.VIEW_LVL1_ID, BALANCE.FROM_LVL1_ID, BALANCE.PLAN_ID, BALANCE.CLIENT_ID, 
BALANCE.LOB_ID, BALANCE.PRODUCT_ID, fprs.lvl5_ID fprs_lvl5_id, DRIVER.SSN_ID, 
DRIVER.ACCOUNT_ID, SUM ( BALANCE.MTD_BALANCE * DRIVER.DRIVER_PCT) AS 
MTD_BALANCE FROM fidelity.FCAP_AGG_PROD_EXP_NOFUND_decimal BALANCE JOIN 
fidelity.fcap_drivers_part_exp_inter_bucket_256 DRIVER ON BALANCE.TIME_KEY_NUM 
= DRIVER.TIME_KEY_NUM AND BALANCE.PLAN_ID = DRIVER.PLAN_ID AND BALANCE.FPRS_ID 
= DRIVER.FPRS_ID AND BALANCE.ACTIVITY_ID = DRIVER.ACTIVITY_ID INNER JOIN 
fidelity.fobi_fprs_dim_mv_orc fprs ON balance.FPRS_ID = fprs.fprs_id AND 
fprs.current_ind = 'Y' inner join fidelity.fobi_activity_dim_mv act ON 
BALANCE.activity_id = act.activity_id AND act.current_ind = 'Y' AND 
act.ROLLUP_KEY = 'TOTACT' GROUP BY BALANCE.SCALE, BALANCE.TIME_KEY_NUM, 
BALANCE.DATASET_CODE, BALANCE.Cost_center_LVL1_ID, BALANCE.Cost_Pool_LVL6_ID, 
act.lvl5_ID, BALANCE.VIEW_LVL1_ID, BALANCE.FROM_LVL1_ID, BALANCE.PLAN_ID, 
BALANCE.CLIENT_ID, BALANCE.LOB_ID, BALANCE.PRODUCT_ID, fprs.lvl5_ID, 
DRIVER.SSN_ID, DRIVER.ACCOUNT_ID;
{code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

Reply via email to