[ https://issues.apache.org/jira/browse/HIVE-9713?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Mostafa Mokhtar updated HIVE-9713: ---------------------------------- Description: For the query below which is a subset of TPC-DS Query 80, CBO joins catalog_sales with catalog_returns first although the CE of the join is relatively high. catalog_sales should be joined with the selective dimension tables first. {code} select count(*) from catalog_sales ,warehouse ,date_dim ,time_dim ,ship_mode where catalog_sales.cs_warehouse_sk = warehouse.w_warehouse_sk and catalog_sales.cs_sold_date_sk = date_dim.d_date_sk and catalog_sales.cs_sold_time_sk = time_dim.t_time_sk and catalog_sales.cs_ship_mode_sk = ship_mode.sm_ship_mode_sk and d_year = 2002 and t_time between 49530 AND 49530+28800 and sm_carrier in ('DIAMOND','AIRBORNE') group by w_warehouse_name ,w_warehouse_sq_ft ,w_city ,w_county ,w_state ,w_country ,d_year {code} Logical plan from CBO debug logs {code} 2015-02-17 22:34:04,577 DEBUG [main]: parse.CalcitePlanner (CalcitePlanner.java:apply(743)) - Plan After Join Reordering: HiveProject(catalog_page_id=[$0], sales=[$1], returns=[$2], profit=[$3]): rowcount = 10590.0, cumulative cost = {8.25242586823495E15 rows, 0.0 cpu, 0.0 io}, id = 1395 HiveAggregate(group=[{0}], agg#0=[sum($1)], agg#1=[sum($2)], agg#2=[sum($3)]): rowcount = 10590.0, cumulative cost = {8.25242586823495E15 rows, 0.0 cpu, 0.0 io}, id = 1393 HiveProject($f0=[$14], $f1=[$5], $f2=[coalesce($9, 0)], $f3=[-($6, coalesce($10, 0))]): rowcount = 1.368586152225262E8, cumulative cost = {8.25242586823495E15 rows, 0.0 cpu, 0.0 io}, id = 1391 HiveJoin(condition=[=($3, $17)], joinType=[inner]): rowcount = 1.368586152225262E8, cumulative cost = {8.25242586823495E15 rows, 0.0 cpu, 0.0 io}, id = 1508 HiveJoin(condition=[=($2, $15)], joinType=[inner]): rowcount = 2.737172304450524E8, cumulative cost = {8.252425594517495E15 rows, 0.0 cpu, 0.0 io}, id = 1506 HiveJoin(condition=[=($1, $13)], joinType=[inner]): rowcount = 8.211516913351573E8, cumulative cost = {8.252424773349804E15 rows, 0.0 cpu, 0.0 io}, id = 1504 HiveJoin(condition=[=($0, $11)], joinType=[inner]): rowcount = 1.1296953399027347E11, cumulative cost = {8.252311803804096E15 rows, 0.0 cpu, 0.0 io}, id = 1418 HiveJoin(condition=[AND(=($2, $7), =($4, $8))], joinType=[left]): rowcount = 8.252311488455487E15, cumulative cost = {3.15348608E8 rows, 0.0 cpu, 0.0 io}, id = 1413 HiveProject(cs_sold_date_sk=[$0], cs_catalog_page_sk=[$12], cs_item_sk=[$15], cs_promo_sk=[$16], cs_order_number=[$17], cs_ext_sales_price=[$23], cs_net_profit=[$33]): rowcount = 2.86549727E8, cumulative cost = {0.0 rows, 0.0 cpu, 0.0 io}, id = 1324 HiveTableScan(table=[[tpcds_bin_orc_200.catalog_sales]]): rowcount = 2.86549727E8, cumulative cost = {0}, id = 1136 HiveProject(cr_item_sk=[$2], cr_order_number=[$16], cr_return_amount=[$18], cr_net_loss=[$26]): rowcount = 2.8798881E7, cumulative cost = {0.0 rows, 0.0 cpu, 0.0 io}, id = 1327 HiveTableScan(table=[[tpcds_bin_orc_200.catalog_returns]]): rowcount = 2.8798881E7, cumulative cost = {0}, id = 1137 HiveProject(d_date_sk=[$0], d_date=[$2]): rowcount = 1.0, cumulative cost = {0.0 rows, 0.0 cpu, 0.0 io}, id = 1371 HiveFilter(condition=[between(false, $2, CAST('1998-08-04'):DATE, CAST('1998-09-04'):DATE)]): rowcount = 1.0, cumulative cost = {0.0 rows, 0.0 cpu, 0.0 io}, id = 1369 HiveTableScan(table=[[tpcds_bin_orc_200.date_dim]]): rowcount = 73049.0, cumulative cost = {0}, id = 1138 HiveProject(cp_catalog_page_sk=[$0], cp_catalog_page_id=[$1]): rowcount = 11718.0, cumulative cost = {0.0 rows, 0.0 cpu, 0.0 io}, id = 1375 HiveTableScan(table=[[tpcds_bin_orc_200.catalog_page]]): rowcount = 11718.0, cumulative cost = {0}, id = 1139 HiveProject(i_item_sk=[$0], i_current_price=[$5]): rowcount = 16000.0, cumulative cost = {0.0 rows, 0.0 cpu, 0.0 io}, id = 1381 HiveFilter(condition=[>($5, 5E1)]): rowcount = 16000.0, cumulative cost = {0.0 rows, 0.0 cpu, 0.0 io}, id = 1379 HiveTableScan(table=[[tpcds_bin_orc_200.item]]): rowcount = 48000.0, cumulative cost = {0}, id = 1140 HiveProject(p_promo_sk=[$0], p_channel_tv=[$11]): rowcount = 225.0, cumulative cost = {0.0 rows, 0.0 cpu, 0.0 io}, id = 1387 HiveFilter(condition=[=($11, 'N')]): rowcount = 225.0, cumulative cost = {0.0 rows, 0.0 cpu, 0.0 io}, id = 1385 HiveTableScan(table=[[tpcds_bin_orc_200.promotion]]): rowcount = 450.0, cumulative cost = {0}, id = 1141 {code} Explain plan {code} STAGE DEPENDENCIES: Stage-1 is a root stage Stage-0 depends on stages: Stage-1 STAGE PLANS: Stage: Stage-1 Tez Edges: Map 1 <- Map 2 (BROADCAST_EDGE) Map 3 <- Map 1 (BROADCAST_EDGE) Map 4 <- Map 3 (BROADCAST_EDGE), Map 6 (BROADCAST_EDGE), Map 7 (BROADCAST_EDGE) Reducer 5 <- Map 4 (SIMPLE_EDGE) DagName: mmokhtar_20150217223434_d0ab6fa9-a1a3-47a5-8138-ba7435d9aea5:4 Vertices: Map 1 Map Operator Tree: TableScan alias: catalog_sales filterExpr: (((cs_sold_date_sk is not null and cs_catalog_page_sk is not null) and cs_item_sk is not null) and cs_promo_sk is not null) (type: boolean) Statistics: Num rows: 286549727 Data size: 65825832570 Basic stats: COMPLETE Column stats: COMPLETE Filter Operator predicate: (((cs_sold_date_sk is not null and cs_catalog_page_sk is not null) and cs_item_sk is not null) and cs_promo_sk is not null) (type: boolean) Statistics: Num rows: 285112475 Data size: 7974560516 Basic stats: COMPLETE Column stats: COMPLETE Select Operator expressions: cs_sold_date_sk (type: int), cs_catalog_page_sk (type: int), cs_item_sk (type: int), cs_promo_sk (type: int), cs_order_number (type: int), cs_ext_sales_price (type: float), cs_net_profit (type: float) outputColumnNames: _col0, _col1, _col2, _col3, _col4, _col5, _col6 Statistics: Num rows: 285112475 Data size: 7974560516 Basic stats: COMPLETE Column stats: COMPLETE Map Join Operator condition map: Left Outer Join0 to 1 keys: 0 _col2 (type: int), _col4 (type: int) 1 _col0 (type: int), _col1 (type: int) outputColumnNames: _col0, _col1, _col2, _col3, _col5, _col6, _col9, _col10 input vertices: 1 Map 2 Statistics: Num rows: 2911 Data size: 93152 Basic stats: COMPLETE Column stats: COMPLETE Reduce Output Operator key expressions: _col0 (type: int) sort order: + Map-reduce partition columns: _col0 (type: int) Statistics: Num rows: 2911 Data size: 93152 Basic stats: COMPLETE Column stats: COMPLETE value expressions: _col1 (type: int), _col2 (type: int), _col3 (type: int), _col5 (type: float), _col6 (type: float), _col9 (type: float), _col10 (type: float) Execution mode: vectorized Map 2 Map Operator Tree: TableScan alias: catalog_returns filterExpr: cr_item_sk is not null (type: boolean) Statistics: Num rows: 28798881 Data size: 5764329494 Basic stats: COMPLETE Column stats: COMPLETE Filter Operator predicate: cr_item_sk is not null (type: boolean) Statistics: Num rows: 28798881 Data size: 456171072 Basic stats: COMPLETE Column stats: COMPLETE Select Operator expressions: cr_item_sk (type: int), cr_order_number (type: int), cr_return_amount (type: float), cr_net_loss (type: float) outputColumnNames: _col0, _col1, _col2, _col3 Statistics: Num rows: 28798881 Data size: 456171072 Basic stats: COMPLETE Column stats: COMPLETE Reduce Output Operator key expressions: _col0 (type: int), _col1 (type: int) sort order: ++ Map-reduce partition columns: _col0 (type: int), _col1 (type: int) Statistics: Num rows: 28798881 Data size: 456171072 Basic stats: COMPLETE Column stats: COMPLETE value expressions: _col2 (type: float), _col3 (type: float) Execution mode: vectorized Map 3 Map Operator Tree: TableScan alias: date_dim filterExpr: (d_date BETWEEN 1998-08-04 AND 1998-09-04 and d_date_sk is not null) (type: boolean) Statistics: Num rows: 73049 Data size: 81741831 Basic stats: COMPLETE Column stats: COMPLETE Filter Operator predicate: (d_date BETWEEN 1998-08-04 AND 1998-09-04 and d_date_sk is not null) (type: boolean) Statistics: Num rows: 36524 Data size: 3579352 Basic stats: COMPLETE Column stats: COMPLETE Select Operator expressions: d_date_sk (type: int) outputColumnNames: _col0 Statistics: Num rows: 36524 Data size: 146096 Basic stats: COMPLETE Column stats: COMPLETE Map Join Operator condition map: Inner Join 0 to 1 keys: 0 _col0 (type: int) 1 _col0 (type: int) outputColumnNames: _col1, _col2, _col3, _col5, _col6, _col9, _col10 input vertices: 0 Map 1 Statistics: Num rows: 1456 Data size: 40768 Basic stats: COMPLETE Column stats: COMPLETE Reduce Output Operator key expressions: _col1 (type: int) sort order: + Map-reduce partition columns: _col1 (type: int) Statistics: Num rows: 1456 Data size: 40768 Basic stats: COMPLETE Column stats: COMPLETE value expressions: _col2 (type: int), _col3 (type: int), _col5 (type: float), _col6 (type: float), _col9 (type: float), _col10 (type: float) Execution mode: vectorized Map 4 Map Operator Tree: TableScan alias: catalog_page filterExpr: cp_catalog_page_sk is not null (type: boolean) Statistics: Num rows: 11718 Data size: 5400282 Basic stats: COMPLETE Column stats: COMPLETE Filter Operator predicate: cp_catalog_page_sk is not null (type: boolean) Statistics: Num rows: 11718 Data size: 1218672 Basic stats: COMPLETE Column stats: COMPLETE Select Operator expressions: cp_catalog_page_sk (type: int), cp_catalog_page_id (type: string) outputColumnNames: _col0, _col1 Statistics: Num rows: 11718 Data size: 1218672 Basic stats: COMPLETE Column stats: COMPLETE Map Join Operator condition map: Inner Join 0 to 1 keys: 0 _col1 (type: int) 1 _col0 (type: int) outputColumnNames: _col2, _col3, _col5, _col6, _col9, _col10, _col14 input vertices: 0 Map 3 Statistics: Num rows: 1456 Data size: 180544 Basic stats: COMPLETE Column stats: COMPLETE Map Join Operator condition map: Inner Join 0 to 1 keys: 0 _col2 (type: int) 1 _col0 (type: int) outputColumnNames: _col3, _col5, _col6, _col9, _col10, _col14 input vertices: 1 Map 6 Statistics: Num rows: 486 Data size: 58320 Basic stats: COMPLETE Column stats: COMPLETE Map Join Operator condition map: Inner Join 0 to 1 keys: 0 _col3 (type: int) 1 _col0 (type: int) outputColumnNames: _col5, _col6, _col9, _col10, _col14 input vertices: 1 Map 7 Statistics: Num rows: 243 Data size: 28188 Basic stats: COMPLETE Column stats: COMPLETE Select Operator expressions: _col14 (type: string), _col5 (type: float), COALESCE(_col9,0) (type: float), (_col6 - COALESCE(_col10,0)) (type: float) outputColumnNames: _col0, _col1, _col2, _col3 Statistics: Num rows: 243 Data size: 28188 Basic stats: COMPLETE Column stats: COMPLETE Group By Operator aggregations: sum(_col1), sum(_col2), sum(_col3) keys: _col0 (type: string) mode: hash outputColumnNames: _col0, _col1, _col2, _col3 Statistics: Num rows: 121 Data size: 15004 Basic stats: COMPLETE Column stats: COMPLETE Reduce Output Operator key expressions: _col0 (type: string) sort order: + Map-reduce partition columns: _col0 (type: string) Statistics: Num rows: 121 Data size: 15004 Basic stats: COMPLETE Column stats: COMPLETE value expressions: _col1 (type: double), _col2 (type: double), _col3 (type: double) Execution mode: vectorized Map 6 Map Operator Tree: TableScan alias: item filterExpr: ((i_current_price > 50.0) and i_item_sk is not null) (type: boolean) Statistics: Num rows: 48000 Data size: 68732712 Basic stats: COMPLETE Column stats: COMPLETE Filter Operator predicate: ((i_current_price > 50.0) and i_item_sk is not null) (type: boolean) Statistics: Num rows: 16000 Data size: 127832 Basic stats: COMPLETE Column stats: COMPLETE Select Operator expressions: i_item_sk (type: int) outputColumnNames: _col0 Statistics: Num rows: 16000 Data size: 64000 Basic stats: COMPLETE Column stats: COMPLETE Reduce Output Operator key expressions: _col0 (type: int) sort order: + Map-reduce partition columns: _col0 (type: int) Statistics: Num rows: 16000 Data size: 64000 Basic stats: COMPLETE Column stats: COMPLETE Execution mode: vectorized Map 7 Map Operator Tree: TableScan alias: promotion filterExpr: ((p_channel_tv = 'N') and p_promo_sk is not null) (type: boolean) Statistics: Num rows: 450 Data size: 530848 Basic stats: COMPLETE Column stats: COMPLETE Filter Operator predicate: ((p_channel_tv = 'N') and p_promo_sk is not null) (type: boolean) Statistics: Num rows: 225 Data size: 20025 Basic stats: COMPLETE Column stats: COMPLETE Select Operator expressions: p_promo_sk (type: int) outputColumnNames: _col0 Statistics: Num rows: 225 Data size: 900 Basic stats: COMPLETE Column stats: COMPLETE Reduce Output Operator key expressions: _col0 (type: int) sort order: + Map-reduce partition columns: _col0 (type: int) Statistics: Num rows: 225 Data size: 900 Basic stats: COMPLETE Column stats: COMPLETE Execution mode: vectorized Reducer 5 Reduce Operator Tree: Group By Operator aggregations: sum(VALUE._col0), sum(VALUE._col1), sum(VALUE._col2) keys: KEY._col0 (type: string) mode: mergepartial outputColumnNames: _col0, _col1, _col2, _col3 Statistics: Num rows: 121 Data size: 15004 Basic stats: COMPLETE Column stats: COMPLETE File Output Operator compressed: false Statistics: Num rows: 121 Data size: 15004 Basic stats: COMPLETE Column stats: COMPLETE table: input format: org.apache.hadoop.mapred.TextInputFormat output format: org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat serde: org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe Stage: Stage-0 Fetch Operator limit: -1 Processor Tree: ListSink {code} was: For the query below which is a subset of TPC-DS Query 66, CBO joins catalog_sales with catalog_returns first although the CE of the join is relatively high. catalog_sales should be joined with the selective dimension tables first. {code} select count(*) from catalog_sales ,warehouse ,date_dim ,time_dim ,ship_mode where catalog_sales.cs_warehouse_sk = warehouse.w_warehouse_sk and catalog_sales.cs_sold_date_sk = date_dim.d_date_sk and catalog_sales.cs_sold_time_sk = time_dim.t_time_sk and catalog_sales.cs_ship_mode_sk = ship_mode.sm_ship_mode_sk and d_year = 2002 and t_time between 49530 AND 49530+28800 and sm_carrier in ('DIAMOND','AIRBORNE') group by w_warehouse_name ,w_warehouse_sq_ft ,w_city ,w_county ,w_state ,w_country ,d_year {code} Logical plan from CBO debug logs {code} 2015-02-17 22:34:04,577 DEBUG [main]: parse.CalcitePlanner (CalcitePlanner.java:apply(743)) - Plan After Join Reordering: HiveProject(catalog_page_id=[$0], sales=[$1], returns=[$2], profit=[$3]): rowcount = 10590.0, cumulative cost = {8.25242586823495E15 rows, 0.0 cpu, 0.0 io}, id = 1395 HiveAggregate(group=[{0}], agg#0=[sum($1)], agg#1=[sum($2)], agg#2=[sum($3)]): rowcount = 10590.0, cumulative cost = {8.25242586823495E15 rows, 0.0 cpu, 0.0 io}, id = 1393 HiveProject($f0=[$14], $f1=[$5], $f2=[coalesce($9, 0)], $f3=[-($6, coalesce($10, 0))]): rowcount = 1.368586152225262E8, cumulative cost = {8.25242586823495E15 rows, 0.0 cpu, 0.0 io}, id = 1391 HiveJoin(condition=[=($3, $17)], joinType=[inner]): rowcount = 1.368586152225262E8, cumulative cost = {8.25242586823495E15 rows, 0.0 cpu, 0.0 io}, id = 1508 HiveJoin(condition=[=($2, $15)], joinType=[inner]): rowcount = 2.737172304450524E8, cumulative cost = {8.252425594517495E15 rows, 0.0 cpu, 0.0 io}, id = 1506 HiveJoin(condition=[=($1, $13)], joinType=[inner]): rowcount = 8.211516913351573E8, cumulative cost = {8.252424773349804E15 rows, 0.0 cpu, 0.0 io}, id = 1504 HiveJoin(condition=[=($0, $11)], joinType=[inner]): rowcount = 1.1296953399027347E11, cumulative cost = {8.252311803804096E15 rows, 0.0 cpu, 0.0 io}, id = 1418 HiveJoin(condition=[AND(=($2, $7), =($4, $8))], joinType=[left]): rowcount = 8.252311488455487E15, cumulative cost = {3.15348608E8 rows, 0.0 cpu, 0.0 io}, id = 1413 HiveProject(cs_sold_date_sk=[$0], cs_catalog_page_sk=[$12], cs_item_sk=[$15], cs_promo_sk=[$16], cs_order_number=[$17], cs_ext_sales_price=[$23], cs_net_profit=[$33]): rowcount = 2.86549727E8, cumulative cost = {0.0 rows, 0.0 cpu, 0.0 io}, id = 1324 HiveTableScan(table=[[tpcds_bin_orc_200.catalog_sales]]): rowcount = 2.86549727E8, cumulative cost = {0}, id = 1136 HiveProject(cr_item_sk=[$2], cr_order_number=[$16], cr_return_amount=[$18], cr_net_loss=[$26]): rowcount = 2.8798881E7, cumulative cost = {0.0 rows, 0.0 cpu, 0.0 io}, id = 1327 HiveTableScan(table=[[tpcds_bin_orc_200.catalog_returns]]): rowcount = 2.8798881E7, cumulative cost = {0}, id = 1137 HiveProject(d_date_sk=[$0], d_date=[$2]): rowcount = 1.0, cumulative cost = {0.0 rows, 0.0 cpu, 0.0 io}, id = 1371 HiveFilter(condition=[between(false, $2, CAST('1998-08-04'):DATE, CAST('1998-09-04'):DATE)]): rowcount = 1.0, cumulative cost = {0.0 rows, 0.0 cpu, 0.0 io}, id = 1369 HiveTableScan(table=[[tpcds_bin_orc_200.date_dim]]): rowcount = 73049.0, cumulative cost = {0}, id = 1138 HiveProject(cp_catalog_page_sk=[$0], cp_catalog_page_id=[$1]): rowcount = 11718.0, cumulative cost = {0.0 rows, 0.0 cpu, 0.0 io}, id = 1375 HiveTableScan(table=[[tpcds_bin_orc_200.catalog_page]]): rowcount = 11718.0, cumulative cost = {0}, id = 1139 HiveProject(i_item_sk=[$0], i_current_price=[$5]): rowcount = 16000.0, cumulative cost = {0.0 rows, 0.0 cpu, 0.0 io}, id = 1381 HiveFilter(condition=[>($5, 5E1)]): rowcount = 16000.0, cumulative cost = {0.0 rows, 0.0 cpu, 0.0 io}, id = 1379 HiveTableScan(table=[[tpcds_bin_orc_200.item]]): rowcount = 48000.0, cumulative cost = {0}, id = 1140 HiveProject(p_promo_sk=[$0], p_channel_tv=[$11]): rowcount = 225.0, cumulative cost = {0.0 rows, 0.0 cpu, 0.0 io}, id = 1387 HiveFilter(condition=[=($11, 'N')]): rowcount = 225.0, cumulative cost = {0.0 rows, 0.0 cpu, 0.0 io}, id = 1385 HiveTableScan(table=[[tpcds_bin_orc_200.promotion]]): rowcount = 450.0, cumulative cost = {0}, id = 1141 {code} Explain plan {code} STAGE DEPENDENCIES: Stage-1 is a root stage Stage-0 depends on stages: Stage-1 STAGE PLANS: Stage: Stage-1 Tez Edges: Map 1 <- Map 2 (BROADCAST_EDGE) Map 3 <- Map 1 (BROADCAST_EDGE) Map 4 <- Map 3 (BROADCAST_EDGE), Map 6 (BROADCAST_EDGE), Map 7 (BROADCAST_EDGE) Reducer 5 <- Map 4 (SIMPLE_EDGE) DagName: mmokhtar_20150217223434_d0ab6fa9-a1a3-47a5-8138-ba7435d9aea5:4 Vertices: Map 1 Map Operator Tree: TableScan alias: catalog_sales filterExpr: (((cs_sold_date_sk is not null and cs_catalog_page_sk is not null) and cs_item_sk is not null) and cs_promo_sk is not null) (type: boolean) Statistics: Num rows: 286549727 Data size: 65825832570 Basic stats: COMPLETE Column stats: COMPLETE Filter Operator predicate: (((cs_sold_date_sk is not null and cs_catalog_page_sk is not null) and cs_item_sk is not null) and cs_promo_sk is not null) (type: boolean) Statistics: Num rows: 285112475 Data size: 7974560516 Basic stats: COMPLETE Column stats: COMPLETE Select Operator expressions: cs_sold_date_sk (type: int), cs_catalog_page_sk (type: int), cs_item_sk (type: int), cs_promo_sk (type: int), cs_order_number (type: int), cs_ext_sales_price (type: float), cs_net_profit (type: float) outputColumnNames: _col0, _col1, _col2, _col3, _col4, _col5, _col6 Statistics: Num rows: 285112475 Data size: 7974560516 Basic stats: COMPLETE Column stats: COMPLETE Map Join Operator condition map: Left Outer Join0 to 1 keys: 0 _col2 (type: int), _col4 (type: int) 1 _col0 (type: int), _col1 (type: int) outputColumnNames: _col0, _col1, _col2, _col3, _col5, _col6, _col9, _col10 input vertices: 1 Map 2 Statistics: Num rows: 2911 Data size: 93152 Basic stats: COMPLETE Column stats: COMPLETE Reduce Output Operator key expressions: _col0 (type: int) sort order: + Map-reduce partition columns: _col0 (type: int) Statistics: Num rows: 2911 Data size: 93152 Basic stats: COMPLETE Column stats: COMPLETE value expressions: _col1 (type: int), _col2 (type: int), _col3 (type: int), _col5 (type: float), _col6 (type: float), _col9 (type: float), _col10 (type: float) Execution mode: vectorized Map 2 Map Operator Tree: TableScan alias: catalog_returns filterExpr: cr_item_sk is not null (type: boolean) Statistics: Num rows: 28798881 Data size: 5764329494 Basic stats: COMPLETE Column stats: COMPLETE Filter Operator predicate: cr_item_sk is not null (type: boolean) Statistics: Num rows: 28798881 Data size: 456171072 Basic stats: COMPLETE Column stats: COMPLETE Select Operator expressions: cr_item_sk (type: int), cr_order_number (type: int), cr_return_amount (type: float), cr_net_loss (type: float) outputColumnNames: _col0, _col1, _col2, _col3 Statistics: Num rows: 28798881 Data size: 456171072 Basic stats: COMPLETE Column stats: COMPLETE Reduce Output Operator key expressions: _col0 (type: int), _col1 (type: int) sort order: ++ Map-reduce partition columns: _col0 (type: int), _col1 (type: int) Statistics: Num rows: 28798881 Data size: 456171072 Basic stats: COMPLETE Column stats: COMPLETE value expressions: _col2 (type: float), _col3 (type: float) Execution mode: vectorized Map 3 Map Operator Tree: TableScan alias: date_dim filterExpr: (d_date BETWEEN 1998-08-04 AND 1998-09-04 and d_date_sk is not null) (type: boolean) Statistics: Num rows: 73049 Data size: 81741831 Basic stats: COMPLETE Column stats: COMPLETE Filter Operator predicate: (d_date BETWEEN 1998-08-04 AND 1998-09-04 and d_date_sk is not null) (type: boolean) Statistics: Num rows: 36524 Data size: 3579352 Basic stats: COMPLETE Column stats: COMPLETE Select Operator expressions: d_date_sk (type: int) outputColumnNames: _col0 Statistics: Num rows: 36524 Data size: 146096 Basic stats: COMPLETE Column stats: COMPLETE Map Join Operator condition map: Inner Join 0 to 1 keys: 0 _col0 (type: int) 1 _col0 (type: int) outputColumnNames: _col1, _col2, _col3, _col5, _col6, _col9, _col10 input vertices: 0 Map 1 Statistics: Num rows: 1456 Data size: 40768 Basic stats: COMPLETE Column stats: COMPLETE Reduce Output Operator key expressions: _col1 (type: int) sort order: + Map-reduce partition columns: _col1 (type: int) Statistics: Num rows: 1456 Data size: 40768 Basic stats: COMPLETE Column stats: COMPLETE value expressions: _col2 (type: int), _col3 (type: int), _col5 (type: float), _col6 (type: float), _col9 (type: float), _col10 (type: float) Execution mode: vectorized Map 4 Map Operator Tree: TableScan alias: catalog_page filterExpr: cp_catalog_page_sk is not null (type: boolean) Statistics: Num rows: 11718 Data size: 5400282 Basic stats: COMPLETE Column stats: COMPLETE Filter Operator predicate: cp_catalog_page_sk is not null (type: boolean) Statistics: Num rows: 11718 Data size: 1218672 Basic stats: COMPLETE Column stats: COMPLETE Select Operator expressions: cp_catalog_page_sk (type: int), cp_catalog_page_id (type: string) outputColumnNames: _col0, _col1 Statistics: Num rows: 11718 Data size: 1218672 Basic stats: COMPLETE Column stats: COMPLETE Map Join Operator condition map: Inner Join 0 to 1 keys: 0 _col1 (type: int) 1 _col0 (type: int) outputColumnNames: _col2, _col3, _col5, _col6, _col9, _col10, _col14 input vertices: 0 Map 3 Statistics: Num rows: 1456 Data size: 180544 Basic stats: COMPLETE Column stats: COMPLETE Map Join Operator condition map: Inner Join 0 to 1 keys: 0 _col2 (type: int) 1 _col0 (type: int) outputColumnNames: _col3, _col5, _col6, _col9, _col10, _col14 input vertices: 1 Map 6 Statistics: Num rows: 486 Data size: 58320 Basic stats: COMPLETE Column stats: COMPLETE Map Join Operator condition map: Inner Join 0 to 1 keys: 0 _col3 (type: int) 1 _col0 (type: int) outputColumnNames: _col5, _col6, _col9, _col10, _col14 input vertices: 1 Map 7 Statistics: Num rows: 243 Data size: 28188 Basic stats: COMPLETE Column stats: COMPLETE Select Operator expressions: _col14 (type: string), _col5 (type: float), COALESCE(_col9,0) (type: float), (_col6 - COALESCE(_col10,0)) (type: float) outputColumnNames: _col0, _col1, _col2, _col3 Statistics: Num rows: 243 Data size: 28188 Basic stats: COMPLETE Column stats: COMPLETE Group By Operator aggregations: sum(_col1), sum(_col2), sum(_col3) keys: _col0 (type: string) mode: hash outputColumnNames: _col0, _col1, _col2, _col3 Statistics: Num rows: 121 Data size: 15004 Basic stats: COMPLETE Column stats: COMPLETE Reduce Output Operator key expressions: _col0 (type: string) sort order: + Map-reduce partition columns: _col0 (type: string) Statistics: Num rows: 121 Data size: 15004 Basic stats: COMPLETE Column stats: COMPLETE value expressions: _col1 (type: double), _col2 (type: double), _col3 (type: double) Execution mode: vectorized Map 6 Map Operator Tree: TableScan alias: item filterExpr: ((i_current_price > 50.0) and i_item_sk is not null) (type: boolean) Statistics: Num rows: 48000 Data size: 68732712 Basic stats: COMPLETE Column stats: COMPLETE Filter Operator predicate: ((i_current_price > 50.0) and i_item_sk is not null) (type: boolean) Statistics: Num rows: 16000 Data size: 127832 Basic stats: COMPLETE Column stats: COMPLETE Select Operator expressions: i_item_sk (type: int) outputColumnNames: _col0 Statistics: Num rows: 16000 Data size: 64000 Basic stats: COMPLETE Column stats: COMPLETE Reduce Output Operator key expressions: _col0 (type: int) sort order: + Map-reduce partition columns: _col0 (type: int) Statistics: Num rows: 16000 Data size: 64000 Basic stats: COMPLETE Column stats: COMPLETE Execution mode: vectorized Map 7 Map Operator Tree: TableScan alias: promotion filterExpr: ((p_channel_tv = 'N') and p_promo_sk is not null) (type: boolean) Statistics: Num rows: 450 Data size: 530848 Basic stats: COMPLETE Column stats: COMPLETE Filter Operator predicate: ((p_channel_tv = 'N') and p_promo_sk is not null) (type: boolean) Statistics: Num rows: 225 Data size: 20025 Basic stats: COMPLETE Column stats: COMPLETE Select Operator expressions: p_promo_sk (type: int) outputColumnNames: _col0 Statistics: Num rows: 225 Data size: 900 Basic stats: COMPLETE Column stats: COMPLETE Reduce Output Operator key expressions: _col0 (type: int) sort order: + Map-reduce partition columns: _col0 (type: int) Statistics: Num rows: 225 Data size: 900 Basic stats: COMPLETE Column stats: COMPLETE Execution mode: vectorized Reducer 5 Reduce Operator Tree: Group By Operator aggregations: sum(VALUE._col0), sum(VALUE._col1), sum(VALUE._col2) keys: KEY._col0 (type: string) mode: mergepartial outputColumnNames: _col0, _col1, _col2, _col3 Statistics: Num rows: 121 Data size: 15004 Basic stats: COMPLETE Column stats: COMPLETE File Output Operator compressed: false Statistics: Num rows: 121 Data size: 15004 Basic stats: COMPLETE Column stats: COMPLETE table: input format: org.apache.hadoop.mapred.TextInputFormat output format: org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat serde: org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe Stage: Stage-0 Fetch Operator limit: -1 Processor Tree: ListSink {code} > CBO : inefficient join order created for left join outer condition > ------------------------------------------------------------------ > > Key: HIVE-9713 > URL: https://issues.apache.org/jira/browse/HIVE-9713 > Project: Hive > Issue Type: Bug > Components: CBO > Affects Versions: 0.14.0 > Reporter: Mostafa Mokhtar > Assignee: Laljo John Pullokkaran > Fix For: 1.2.0 > > > For the query below which is a subset of TPC-DS Query 80, CBO joins > catalog_sales with catalog_returns first although the CE of the join is > relatively high. > catalog_sales should be joined with the selective dimension tables first. > {code} > select count(*) > from > catalog_sales > ,warehouse > ,date_dim > ,time_dim > ,ship_mode > where > catalog_sales.cs_warehouse_sk = warehouse.w_warehouse_sk > and catalog_sales.cs_sold_date_sk = date_dim.d_date_sk > and catalog_sales.cs_sold_time_sk = time_dim.t_time_sk > and catalog_sales.cs_ship_mode_sk = ship_mode.sm_ship_mode_sk > and d_year = 2002 > and t_time between 49530 AND 49530+28800 > and sm_carrier in ('DIAMOND','AIRBORNE') > group by > w_warehouse_name > ,w_warehouse_sq_ft > ,w_city > ,w_county > ,w_state > ,w_country > ,d_year > {code} > Logical plan from CBO debug logs > {code} > 2015-02-17 22:34:04,577 DEBUG [main]: parse.CalcitePlanner > (CalcitePlanner.java:apply(743)) - Plan After Join Reordering: > HiveProject(catalog_page_id=[$0], sales=[$1], returns=[$2], profit=[$3]): > rowcount = 10590.0, cumulative cost = {8.25242586823495E15 rows, 0.0 cpu, 0.0 > io}, id = 1395 > HiveAggregate(group=[{0}], agg#0=[sum($1)], agg#1=[sum($2)], > agg#2=[sum($3)]): rowcount = 10590.0, cumulative cost = {8.25242586823495E15 > rows, 0.0 cpu, 0.0 io}, id = 1393 > HiveProject($f0=[$14], $f1=[$5], $f2=[coalesce($9, 0)], $f3=[-($6, > coalesce($10, 0))]): rowcount = 1.368586152225262E8, cumulative cost = > {8.25242586823495E15 rows, 0.0 cpu, 0.0 io}, id = 1391 > HiveJoin(condition=[=($3, $17)], joinType=[inner]): rowcount = > 1.368586152225262E8, cumulative cost = {8.25242586823495E15 rows, 0.0 cpu, > 0.0 io}, id = 1508 > HiveJoin(condition=[=($2, $15)], joinType=[inner]): rowcount = > 2.737172304450524E8, cumulative cost = {8.252425594517495E15 rows, 0.0 cpu, > 0.0 io}, id = 1506 > HiveJoin(condition=[=($1, $13)], joinType=[inner]): rowcount = > 8.211516913351573E8, cumulative cost = {8.252424773349804E15 rows, 0.0 cpu, > 0.0 io}, id = 1504 > HiveJoin(condition=[=($0, $11)], joinType=[inner]): rowcount = > 1.1296953399027347E11, cumulative cost = {8.252311803804096E15 rows, 0.0 cpu, > 0.0 io}, id = 1418 > HiveJoin(condition=[AND(=($2, $7), =($4, $8))], > joinType=[left]): rowcount = 8.252311488455487E15, cumulative cost = > {3.15348608E8 rows, 0.0 cpu, 0.0 io}, id = 1413 > HiveProject(cs_sold_date_sk=[$0], cs_catalog_page_sk=[$12], > cs_item_sk=[$15], cs_promo_sk=[$16], cs_order_number=[$17], > cs_ext_sales_price=[$23], cs_net_profit=[$33]): rowcount = 2.86549727E8, > cumulative cost = {0.0 rows, 0.0 cpu, 0.0 io}, id = 1324 > HiveTableScan(table=[[tpcds_bin_orc_200.catalog_sales]]): > rowcount = 2.86549727E8, cumulative cost = {0}, id = 1136 > HiveProject(cr_item_sk=[$2], cr_order_number=[$16], > cr_return_amount=[$18], cr_net_loss=[$26]): rowcount = 2.8798881E7, > cumulative cost = {0.0 rows, 0.0 cpu, 0.0 io}, id = 1327 > HiveTableScan(table=[[tpcds_bin_orc_200.catalog_returns]]): > rowcount = 2.8798881E7, cumulative cost = {0}, id = 1137 > HiveProject(d_date_sk=[$0], d_date=[$2]): rowcount = 1.0, > cumulative cost = {0.0 rows, 0.0 cpu, 0.0 io}, id = 1371 > HiveFilter(condition=[between(false, $2, > CAST('1998-08-04'):DATE, CAST('1998-09-04'):DATE)]): rowcount = 1.0, > cumulative cost = {0.0 rows, 0.0 cpu, 0.0 io}, id = 1369 > HiveTableScan(table=[[tpcds_bin_orc_200.date_dim]]): > rowcount = 73049.0, cumulative cost = {0}, id = 1138 > HiveProject(cp_catalog_page_sk=[$0], cp_catalog_page_id=[$1]): > rowcount = 11718.0, cumulative cost = {0.0 rows, 0.0 cpu, 0.0 io}, id = 1375 > HiveTableScan(table=[[tpcds_bin_orc_200.catalog_page]]): > rowcount = 11718.0, cumulative cost = {0}, id = 1139 > HiveProject(i_item_sk=[$0], i_current_price=[$5]): rowcount = > 16000.0, cumulative cost = {0.0 rows, 0.0 cpu, 0.0 io}, id = 1381 > HiveFilter(condition=[>($5, 5E1)]): rowcount = 16000.0, > cumulative cost = {0.0 rows, 0.0 cpu, 0.0 io}, id = 1379 > HiveTableScan(table=[[tpcds_bin_orc_200.item]]): rowcount = > 48000.0, cumulative cost = {0}, id = 1140 > HiveProject(p_promo_sk=[$0], p_channel_tv=[$11]): rowcount = 225.0, > cumulative cost = {0.0 rows, 0.0 cpu, 0.0 io}, id = 1387 > HiveFilter(condition=[=($11, 'N')]): rowcount = 225.0, cumulative > cost = {0.0 rows, 0.0 cpu, 0.0 io}, id = 1385 > HiveTableScan(table=[[tpcds_bin_orc_200.promotion]]): rowcount = > 450.0, cumulative cost = {0}, id = 1141 > {code} > Explain plan > {code} > STAGE DEPENDENCIES: > Stage-1 is a root stage > Stage-0 depends on stages: Stage-1 > STAGE PLANS: > Stage: Stage-1 > Tez > Edges: > Map 1 <- Map 2 (BROADCAST_EDGE) > Map 3 <- Map 1 (BROADCAST_EDGE) > Map 4 <- Map 3 (BROADCAST_EDGE), Map 6 (BROADCAST_EDGE), Map 7 > (BROADCAST_EDGE) > Reducer 5 <- Map 4 (SIMPLE_EDGE) > DagName: mmokhtar_20150217223434_d0ab6fa9-a1a3-47a5-8138-ba7435d9aea5:4 > Vertices: > Map 1 > Map Operator Tree: > TableScan > alias: catalog_sales > filterExpr: (((cs_sold_date_sk is not null and > cs_catalog_page_sk is not null) and cs_item_sk is not null) and cs_promo_sk > is not null) (type: boolean) > Statistics: Num rows: 286549727 Data size: 65825832570 > Basic stats: COMPLETE Column stats: COMPLETE > Filter Operator > predicate: (((cs_sold_date_sk is not null and > cs_catalog_page_sk is not null) and cs_item_sk is not null) and cs_promo_sk > is not null) (type: boolean) > Statistics: Num rows: 285112475 Data size: 7974560516 > Basic stats: COMPLETE Column stats: COMPLETE > Select Operator > expressions: cs_sold_date_sk (type: int), > cs_catalog_page_sk (type: int), cs_item_sk (type: int), cs_promo_sk (type: > int), cs_order_number (type: int), cs_ext_sales_price (type: float), > cs_net_profit (type: float) > outputColumnNames: _col0, _col1, _col2, _col3, _col4, > _col5, _col6 > Statistics: Num rows: 285112475 Data size: 7974560516 > Basic stats: COMPLETE Column stats: COMPLETE > Map Join Operator > condition map: > Left Outer Join0 to 1 > keys: > 0 _col2 (type: int), _col4 (type: int) > 1 _col0 (type: int), _col1 (type: int) > outputColumnNames: _col0, _col1, _col2, _col3, _col5, > _col6, _col9, _col10 > input vertices: > 1 Map 2 > Statistics: Num rows: 2911 Data size: 93152 Basic > stats: COMPLETE Column stats: COMPLETE > Reduce Output Operator > key expressions: _col0 (type: int) > sort order: + > Map-reduce partition columns: _col0 (type: int) > Statistics: Num rows: 2911 Data size: 93152 Basic > stats: COMPLETE Column stats: COMPLETE > value expressions: _col1 (type: int), _col2 (type: > int), _col3 (type: int), _col5 (type: float), _col6 (type: float), _col9 > (type: float), _col10 (type: float) > Execution mode: vectorized > Map 2 > Map Operator Tree: > TableScan > alias: catalog_returns > filterExpr: cr_item_sk is not null (type: boolean) > Statistics: Num rows: 28798881 Data size: 5764329494 Basic > stats: COMPLETE Column stats: COMPLETE > Filter Operator > predicate: cr_item_sk is not null (type: boolean) > Statistics: Num rows: 28798881 Data size: 456171072 Basic > stats: COMPLETE Column stats: COMPLETE > Select Operator > expressions: cr_item_sk (type: int), cr_order_number > (type: int), cr_return_amount (type: float), cr_net_loss (type: float) > outputColumnNames: _col0, _col1, _col2, _col3 > Statistics: Num rows: 28798881 Data size: 456171072 > Basic stats: COMPLETE Column stats: COMPLETE > Reduce Output Operator > key expressions: _col0 (type: int), _col1 (type: int) > sort order: ++ > Map-reduce partition columns: _col0 (type: int), > _col1 (type: int) > Statistics: Num rows: 28798881 Data size: 456171072 > Basic stats: COMPLETE Column stats: COMPLETE > value expressions: _col2 (type: float), _col3 (type: > float) > Execution mode: vectorized > Map 3 > Map Operator Tree: > TableScan > alias: date_dim > filterExpr: (d_date BETWEEN 1998-08-04 AND 1998-09-04 and > d_date_sk is not null) (type: boolean) > Statistics: Num rows: 73049 Data size: 81741831 Basic > stats: COMPLETE Column stats: COMPLETE > Filter Operator > predicate: (d_date BETWEEN 1998-08-04 AND 1998-09-04 and > d_date_sk is not null) (type: boolean) > Statistics: Num rows: 36524 Data size: 3579352 Basic > stats: COMPLETE Column stats: COMPLETE > Select Operator > expressions: d_date_sk (type: int) > outputColumnNames: _col0 > Statistics: Num rows: 36524 Data size: 146096 Basic > stats: COMPLETE Column stats: COMPLETE > Map Join Operator > condition map: > Inner Join 0 to 1 > keys: > 0 _col0 (type: int) > 1 _col0 (type: int) > outputColumnNames: _col1, _col2, _col3, _col5, _col6, > _col9, _col10 > input vertices: > 0 Map 1 > Statistics: Num rows: 1456 Data size: 40768 Basic > stats: COMPLETE Column stats: COMPLETE > Reduce Output Operator > key expressions: _col1 (type: int) > sort order: + > Map-reduce partition columns: _col1 (type: int) > Statistics: Num rows: 1456 Data size: 40768 Basic > stats: COMPLETE Column stats: COMPLETE > value expressions: _col2 (type: int), _col3 (type: > int), _col5 (type: float), _col6 (type: float), _col9 (type: float), _col10 > (type: float) > Execution mode: vectorized > Map 4 > Map Operator Tree: > TableScan > alias: catalog_page > filterExpr: cp_catalog_page_sk is not null (type: boolean) > Statistics: Num rows: 11718 Data size: 5400282 Basic stats: > COMPLETE Column stats: COMPLETE > Filter Operator > predicate: cp_catalog_page_sk is not null (type: boolean) > Statistics: Num rows: 11718 Data size: 1218672 Basic > stats: COMPLETE Column stats: COMPLETE > Select Operator > expressions: cp_catalog_page_sk (type: int), > cp_catalog_page_id (type: string) > outputColumnNames: _col0, _col1 > Statistics: Num rows: 11718 Data size: 1218672 Basic > stats: COMPLETE Column stats: COMPLETE > Map Join Operator > condition map: > Inner Join 0 to 1 > keys: > 0 _col1 (type: int) > 1 _col0 (type: int) > outputColumnNames: _col2, _col3, _col5, _col6, _col9, > _col10, _col14 > input vertices: > 0 Map 3 > Statistics: Num rows: 1456 Data size: 180544 Basic > stats: COMPLETE Column stats: COMPLETE > Map Join Operator > condition map: > Inner Join 0 to 1 > keys: > 0 _col2 (type: int) > 1 _col0 (type: int) > outputColumnNames: _col3, _col5, _col6, _col9, > _col10, _col14 > input vertices: > 1 Map 6 > Statistics: Num rows: 486 Data size: 58320 Basic > stats: COMPLETE Column stats: COMPLETE > Map Join Operator > condition map: > Inner Join 0 to 1 > keys: > 0 _col3 (type: int) > 1 _col0 (type: int) > outputColumnNames: _col5, _col6, _col9, _col10, > _col14 > input vertices: > 1 Map 7 > Statistics: Num rows: 243 Data size: 28188 Basic > stats: COMPLETE Column stats: COMPLETE > Select Operator > expressions: _col14 (type: string), _col5 > (type: float), COALESCE(_col9,0) (type: float), (_col6 - COALESCE(_col10,0)) > (type: float) > outputColumnNames: _col0, _col1, _col2, _col3 > Statistics: Num rows: 243 Data size: 28188 > Basic stats: COMPLETE Column stats: COMPLETE > Group By Operator > aggregations: sum(_col1), sum(_col2), > sum(_col3) > keys: _col0 (type: string) > mode: hash > outputColumnNames: _col0, _col1, _col2, _col3 > Statistics: Num rows: 121 Data size: 15004 > Basic stats: COMPLETE Column stats: COMPLETE > Reduce Output Operator > key expressions: _col0 (type: string) > sort order: + > Map-reduce partition columns: _col0 (type: > string) > Statistics: Num rows: 121 Data size: 15004 > Basic stats: COMPLETE Column stats: COMPLETE > value expressions: _col1 (type: double), > _col2 (type: double), _col3 (type: double) > Execution mode: vectorized > Map 6 > Map Operator Tree: > TableScan > alias: item > filterExpr: ((i_current_price > 50.0) and i_item_sk is not > null) (type: boolean) > Statistics: Num rows: 48000 Data size: 68732712 Basic > stats: COMPLETE Column stats: COMPLETE > Filter Operator > predicate: ((i_current_price > 50.0) and i_item_sk is not > null) (type: boolean) > Statistics: Num rows: 16000 Data size: 127832 Basic > stats: COMPLETE Column stats: COMPLETE > Select Operator > expressions: i_item_sk (type: int) > outputColumnNames: _col0 > Statistics: Num rows: 16000 Data size: 64000 Basic > stats: COMPLETE Column stats: COMPLETE > Reduce Output Operator > key expressions: _col0 (type: int) > sort order: + > Map-reduce partition columns: _col0 (type: int) > Statistics: Num rows: 16000 Data size: 64000 Basic > stats: COMPLETE Column stats: COMPLETE > Execution mode: vectorized > Map 7 > Map Operator Tree: > TableScan > alias: promotion > filterExpr: ((p_channel_tv = 'N') and p_promo_sk is not > null) (type: boolean) > Statistics: Num rows: 450 Data size: 530848 Basic stats: > COMPLETE Column stats: COMPLETE > Filter Operator > predicate: ((p_channel_tv = 'N') and p_promo_sk is not > null) (type: boolean) > Statistics: Num rows: 225 Data size: 20025 Basic stats: > COMPLETE Column stats: COMPLETE > Select Operator > expressions: p_promo_sk (type: int) > outputColumnNames: _col0 > Statistics: Num rows: 225 Data size: 900 Basic stats: > COMPLETE Column stats: COMPLETE > Reduce Output Operator > key expressions: _col0 (type: int) > sort order: + > Map-reduce partition columns: _col0 (type: int) > Statistics: Num rows: 225 Data size: 900 Basic stats: > COMPLETE Column stats: COMPLETE > Execution mode: vectorized > Reducer 5 > Reduce Operator Tree: > Group By Operator > aggregations: sum(VALUE._col0), sum(VALUE._col1), > sum(VALUE._col2) > keys: KEY._col0 (type: string) > mode: mergepartial > outputColumnNames: _col0, _col1, _col2, _col3 > Statistics: Num rows: 121 Data size: 15004 Basic stats: > COMPLETE Column stats: COMPLETE > File Output Operator > compressed: false > Statistics: Num rows: 121 Data size: 15004 Basic stats: > COMPLETE Column stats: COMPLETE > table: > input format: org.apache.hadoop.mapred.TextInputFormat > output format: > org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat > serde: > org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe > Stage: Stage-0 > Fetch Operator > limit: -1 > Processor Tree: > ListSink > {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)