STAGE DEPENDENCIES:
  Stage-1 is a root stage
  Stage-0 depends on stages: Stage-1

STAGE PLANS:
  Stage: Stage-1
    Tez
      Edges:
        Reducer 2 <- Map 1 (SIMPLE_EDGE)
        Reducer 3 <- Reducer 2 (SIMPLE_EDGE), Reducer 6 (SIMPLE_EDGE), Reducer 8 (SIMPLE_EDGE)
        Reducer 4 <- Map 11 (SIMPLE_EDGE), Map 12 (BROADCAST_EDGE), Reducer 3 (SIMPLE_EDGE)
        Reducer 6 <- Map 5 (SIMPLE_EDGE)
        Reducer 8 <- Map 10 (SIMPLE_EDGE), Map 13 (SIMPLE_EDGE), Map 7 (SIMPLE_EDGE), Map 9 (SIMPLE_EDGE)
      DagName: zhoushugang_20150505155151_51a804b5-2e2d-49f6-a02a-5c7e626d33ea:1
      Vertices:
        Map 1
            Map Operator Tree:
                TableScan
                  alias: yhd_send_message_blacklist
                  Statistics: Num rows: 870860 Data size: 46859629 Basic stats: COMPLETE Column stats: NONE
                  Select Operator
                    expressions: send_to (type: string)
                    outputColumnNames: send_to
                    Statistics: Num rows: 870860 Data size: 46859629 Basic stats: COMPLETE Column stats: NONE
                    Group By Operator
                      keys: lower(send_to) (type: string)
                      mode: hash
                      outputColumnNames: _col0
                      Statistics: Num rows: 870860 Data size: 46859629 Basic stats: COMPLETE Column stats: NONE
                      Reduce Output Operator
                        key expressions: _col0 (type: string)
                        sort order: +
                        Map-reduce partition columns: _col0 (type: string)
                        Statistics: Num rows: 870860 Data size: 46859629 Basic stats: COMPLETE Column stats: NONE
        Map 10
            Map Operator Tree:
                TableScan
                  alias: t5
                  Statistics: Num rows: 4414357 Data size: 67807378 Basic stats: COMPLETE Column stats: NONE
                  Reduce Output Operator
                    key expressions: end_usr_id (type: bigint)
                    sort order: +
                    Map-reduce partition columns: end_usr_id (type: bigint)
                    Statistics: Num rows: 4414357 Data size: 67807378 Basic stats: COMPLETE Column stats: NONE
                    value expressions: car (type: string), shopping_habit (type: string)
        Map 11
            Map Operator Tree:
                TableScan
                  alias: t6
                  Statistics: Num rows: 38930100 Data size: 4515891712 Basic stats: COMPLETE Column stats: NONE
                  Filter Operator
                    predicate: user_id is not null (type: boolean)
                    Statistics: Num rows: 19465050 Data size: 2257945856 Basic stats: COMPLETE Column stats: NONE
                    Reduce Output Operator
                      key expressions: email (type: string)
                      sort order: +
                      Map-reduce partition columns: email (type: string)
                      Statistics: Num rows: 19465050 Data size: 2257945856 Basic stats: COMPLETE Column stats: NONE
                      value expressions: mail_level (type: double)
        Map 12
            Map Operator Tree:
                TableScan
                  alias: t7
                  Statistics: Num rows: 16 Data size: 1752 Basic stats: COMPLETE Column stats: NONE
                  Filter Operator
                    predicate: usr_union_logon_code is not null (type: boolean)
                    Statistics: Num rows: 8 Data size: 876 Basic stats: COMPLETE Column stats: NONE
                    Reduce Output Operator
                      key expressions: usr_union_logon_code (type: string)
                      sort order: +
                      Map-reduce partition columns: usr_union_logon_code (type: string)
                      Statistics: Num rows: 8 Data size: 876 Basic stats: COMPLETE Column stats: NONE
                      value expressions: usr_union_logon_id (type: bigint)
        Map 13
            Map Operator Tree:
                TableScan
                  alias: t8
                  Statistics: Num rows: 25660729 Data size: 3178607508 Basic stats: COMPLETE Column stats: NONE
                  Filter Operator
                    predicate: (to_date(update_time) = '2015-05-04') (type: boolean)
                    Statistics: Num rows: 12830364 Data size: 1589303692 Basic stats: COMPLETE Column stats: NONE
                    Reduce Output Operator
                      key expressions: end_user_id (type: bigint)
                      sort order: +
                      Map-reduce partition columns: end_user_id (type: bigint)
                      Statistics: Num rows: 12830364 Data size: 1589303692 Basic stats: COMPLETE Column stats: NONE
                      value expressions: exp (type: int)
        Map 5
            Map Operator Tree:
                TableScan
                  alias: edm_error_email
                  Statistics: Num rows: 5137907 Data size: 513790784 Basic stats: COMPLETE Column stats: NONE
                  Select Operator
                    expressions: email (type: string)
                    outputColumnNames: email
                    Statistics: Num rows: 5137907 Data size: 513790784 Basic stats: COMPLETE Column stats: NONE
                    Group By Operator
                      keys: lower(email) (type: string)
                      mode: hash
                      outputColumnNames: _col0
                      Statistics: Num rows: 5137907 Data size: 513790784 Basic stats: COMPLETE Column stats: NONE
                      Reduce Output Operator
                        key expressions: _col0 (type: string)
                        sort order: +
                        Map-reduce partition columns: _col0 (type: string)
                        Statistics: Num rows: 5137907 Data size: 513790784 Basic stats: COMPLETE Column stats: NONE
        Map 7
            Map Operator Tree:
                TableScan
                  alias: t
                  Statistics: Num rows: 108659208 Data size: 49700768672 Basic stats: COMPLETE Column stats: NONE
                  Reduce Output Operator
                    key expressions: id (type: bigint)
                    sort order: +
                    Map-reduce partition columns: id (type: bigint)
                    Statistics: Num rows: 108659208 Data size: 49700768672 Basic stats: COMPLETE Column stats: NONE
                    value expressions: end_user_name (type: string), end_user_password (type: string), end_user_real_name (type: string), end_user_birthday (type: string), end_user_last_login_date (type: string), end_user_last_bought_date (type: string), end_user_login_times (type: double), end_user_bought_amount (type: double), end_user_bought_times (type: double), end_user_sex (type: double), end_user_create_time (type: string), end_user_type (type: double), ip (type: string), end_user_points (type: double), co_code (type: string), is_email_activate (type: double), mc_site_id (type: bigint), update_time (type: string), member_grade (type: double), end_user_email (type: string), mobile (type: string), phone (type: string), valid_mobile_phone_num (type: int), id_card (type: string)
        Map 9
            Map Operator Tree:
                TableScan
                  alias: t2
                  Statistics: Num rows: 11871989 Data size: 422420127 Basic stats: COMPLETE Column stats: NONE
                  Reduce Output Operator
                    key expressions: end_user_id (type: bigint)
                    sort order: +
                    Map-reduce partition columns: end_user_id (type: bigint)
                    Statistics: Num rows: 11871989 Data size: 422420127 Basic stats: COMPLETE Column stats: NONE
                    value expressions: user_grade (type: double), user_grade_type (type: double)
        Reducer 2
            Reduce Operator Tree:
              Group By Operator
                keys: KEY._col0 (type: string)
                mode: mergepartial
                outputColumnNames: _col0
                Statistics: Num rows: 435430 Data size: 23429814 Basic stats: COMPLETE Column stats: NONE
                Reduce Output Operator
                  key expressions: _col0 (type: string)
                  sort order: +
                  Map-reduce partition columns: _col0 (type: string)
                  Statistics: Num rows: 435430 Data size: 23429814 Basic stats: COMPLETE Column stats: NONE
        Reducer 3
            Reduce Operator Tree:
              Merge Join Operator
                condition map:
                     Left Outer Join0 to 1
                     Left Outer Join0 to 2
                keys:
                  0 lower(_col107) (type: string)
                  1 _col0 (type: string)
                  2 _col0 (type: string)
                outputColumnNames: _col0, _col1, _col2, _col3, _col4, _col22, _col24, _col30, _col31, _col32, _col34, _col44, _col67, _col68, _col69, _col72, _col78, _col83, _col84, _col103, _col107, _col108, _col109, _col110, _col111, _col117, _col118, _col125, _col126, _col131, _col133, _col145, _col146
                Statistics: Num rows: 788865883 Data size: 360827596199 Basic stats: COMPLETE Column stats: NONE
                Reduce Output Operator
                  key expressions: _col107 (type: string)
                  sort order: +
                  Map-reduce partition columns: _col107 (type: string)
                  Statistics: Num rows: 788865883 Data size: 360827596199 Basic stats: COMPLETE Column stats: NONE
                  value expressions: _col0 (type: bigint), _col1 (type: string), _col2 (type: string), _col3 (type: string), _col4 (type: string), _col22 (type: string), _col24 (type: string), _col30 (type: double), _col31 (type: double), _col32 (type: double), _col34 (type: double), _col44 (type: string), _col67 (type: double), _col68 (type: string), _col69 (type: double), _col72 (type: string), _col78 (type: double), _col83 (type: bigint), _col84 (type: string), _col103 (type: double), _col108 (type: string), _col109 (type: string), _col110 (type: int), _col111 (type: string), _col117 (type: double), _col118 (type: double), _col125 (type: string), _col126 (type: string), _col131 (type: bigint), _col133 (type: int), _col145 (type: string), _col146 (type: string)
        Reducer 4
            Reduce Operator Tree:
              Merge Join Operator
                condition map:
                     Left Outer Join0 to 1
                keys:
                  0 _col107 (type: string)
                  1 email (type: string)
                outputColumnNames: _col0, _col1, _col2, _col3, _col4, _col22, _col24, _col30, _col31, _col32, _col34, _col44, _col67, _col68, _col69, _col72, _col78, _col83, _col84, _col103, _col107, _col108, _col109, _col110, _col111, _col117, _col118, _col125, _col126, _col131, _col133, _col145, _col146, _col157
                Statistics: Num rows: 867752490 Data size: 396910364421 Basic stats: COMPLETE Column stats: NONE
                Map Join Operator
                  condition map:
                       Left Outer Join0 to 1
                  keys:
                    0 _col72 (type: string)
                    1 usr_union_logon_code (type: string)
                  outputColumnNames: _col0, _col1, _col2, _col3, _col4, _col22, _col24, _col30, _col31, _col32, _col34, _col44, _col67, _col68, _col69, _col72, _col78, _col83, _col84, _col103, _col107, _col108, _col109, _col110, _col111, _col117, _col118, _col125, _col126, _col131, _col133, _col145, _col146, _col157, _col163
                  input vertices:
                    1 Map 12
                  Statistics: Num rows: 954527759 Data size: 436601410326 Basic stats: COMPLETE Column stats: NONE
                  Filter Operator
                    predicate: ((_col84 >= '2015-05-04') or (length(_col131) > 0)) (type: boolean)
                    Statistics: Num rows: 636351838 Data size: 291067606274 Basic stats: COMPLETE Column stats: NONE
                    Select Operator
                      expressions: _col0 (type: bigint), _col1 (type: string), _col3 (type: string), _col2 (type: string), _col4 (type: string), _col108 (type: string), _col109 (type: string), _col110 (type: int), _col107 (type: string), substr(_col107, instr(_col107'@'), (instr(_col107'.') - instr(_col107'@'))) (type: string), _col44 (type: string), _col22 (type: string), _col24 (type: string), _col30 (type: double), _col32 (type: double), _col31 (type: double), _col67 (type: double), _col68 (type: string), _col69 (type: double), _col111 (type: string), _col34 (type: double), _col83 (type: bigint), CASE WHEN ((_col78 = 1.0)) THEN (1) ELSE (0) END (type: int), CASE WHEN (_col110 is not null) THEN (1) ELSE (0) END (type: int), CASE WHEN ((_col1 like '%@sina%')) THEN (1) WHEN ((_col1 like '%@pingan%')) THEN (2) WHEN (((_col1 like '%@alipay%') and (not ((_col117) IN (1, 2) and (_col118 = 1.0))))) THEN (3) WHEN ((((_col1 like '%@alipay%') and (_col117) IN (1, 2)) and (_col118 = 1.0))) THEN (4) WHEN ((_col1 like '%@163%')) THEN (5) WHEN ((_col1 like '%@kaixin001%')) THEN (6) WHEN ((_col1 like '%@139%')) THEN (7) WHEN ((_col1 like '%@msn.com%')) THEN (8) WHEN ((_col1 like '%@anyue%')) THEN (9) WHEN ((_col1 like '%@qq%')) THEN (10) ELSE (0) END (type: int), CASE WHEN (((_col1 like '%@b2b%') or (_col1 = 'yuxiaolan'))) THEN (1) ELSE (0) END (type: int), CASE WHEN (_col145 is not null) THEN (1) ELSE (0) END (type: int), CASE WHEN (_col146 is not null) THEN (1) ELSE (0) END (type: int), CASE WHEN ((_col125 = '1')) THEN (1) ELSE (0) END (type: int), _col126 (type: string), _col157 (type: double), '' (type: string), '' (type: string), '' (type: string), '' (type: string), _col103 (type: double), CASE WHEN (_col72 is null) THEN (null) WHEN ((_col72 is not null and _col163 is null)) THEN (-999999) ELSE (_col163) END (type: bigint), _col72 (type: string), _col133 (type: int)
                      outputColumnNames: _col0, _col1, _col2, _col3, _col4, _col5, _col6, _col7, _col8, _col9, _col10, _col11, _col12, _col13, _col14, _col15, _col16, _col17, _col18, _col19, _col20, _col21, _col22, _col23, _col24, _col25, _col26, _col27, _col28, _col29, _col30, _col31, _col32, _col33, _col34, _col35, _col36, _col37, _col38
                      Statistics: Num rows: 636351838 Data size: 291067606274 Basic stats: COMPLETE Column stats: NONE
                      File Output Operator
                        compressed: false
                        Statistics: Num rows: 636351838 Data size: 291067606274 Basic stats: COMPLETE Column stats: NONE
                        table:
                            input format: org.apache.hadoop.mapred.TextInputFormat
                            output format: org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat
                            serde: org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe
        Reducer 6
            Reduce Operator Tree:
              Group By Operator
                keys: KEY._col0 (type: string)
                mode: mergepartial
                outputColumnNames: _col0
                Statistics: Num rows: 2568953 Data size: 256895341 Basic stats: COMPLETE Column stats: NONE
                Reduce Output Operator
                  key expressions: _col0 (type: string)
                  sort order: +
                  Map-reduce partition columns: _col0 (type: string)
                  Statistics: Num rows: 2568953 Data size: 256895341 Basic stats: COMPLETE Column stats: NONE
        Reducer 8
            Reduce Operator Tree:
              Merge Join Operator
                condition map:
                     Left Outer Join0 to 1
                     Left Outer Join0 to 2
                     Left Outer Join0 to 3
                keys:
                  0 id (type: bigint)
                  1 end_user_id (type: bigint)
                  2 end_usr_id (type: bigint)
                  3 end_user_id (type: bigint)
                outputColumnNames: _col0, _col1, _col2, _col3, _col4, _col22, _col24, _col30, _col31, _col32, _col34, _col44, _col67, _col68, _col69, _col72, _col78, _col83, _col84, _col103, _col107, _col108, _col109, _col110, _col111, _col117, _col118, _col125, _col126, _col131, _col133
                Statistics: Num rows: 358575394 Data size: 164012540172 Basic stats: COMPLETE Column stats: NONE
                Reduce Output Operator
                  key expressions: lower(_col107) (type: string)
                  sort order: +
                  Map-reduce partition columns: lower(_col107) (type: string)
                  Statistics: Num rows: 358575394 Data size: 164012540172 Basic stats: COMPLETE Column stats: NONE
                  value expressions: _col0 (type: bigint), _col1 (type: string), _col2 (type: string), _col3 (type: string), _col4 (type: string), _col22 (type: string), _col24 (type: string), _col30 (type: double), _col31 (type: double), _col32 (type: double), _col34 (type: double), _col44 (type: string), _col67 (type: double), _col68 (type: string), _col69 (type: double), _col72 (type: string), _col78 (type: double), _col83 (type: bigint), _col84 (type: string), _col103 (type: double), _col107 (type: string), _col108 (type: string), _col109 (type: string), _col110 (type: int), _col111 (type: string), _col117 (type: double), _col118 (type: double), _col125 (type: string), _col126 (type: string), _col131 (type: bigint), _col133 (type: int)

  Stage: Stage-0
    Fetch Operator
      limit: -1
      Processor Tree:
        ListSink
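
The Tez Edges section of the plan above is easier to follow as a dependency graph. The following is a minimal sketch, not part of the original output, that parses those edge lines with Python's standard library and prints one valid execution order of the vertices. The names `EDGES_TEXT` and `parse_edges` are illustrative assumptions; only the edge lines themselves are copied from the plan.

```python
import re
from graphlib import TopologicalSorter  # standard library, Python 3.9+

# Edge lines copied from the "Edges:" section of the plan.
EDGES_TEXT = """\
Reducer 2 <- Map 1 (SIMPLE_EDGE)
Reducer 3 <- Reducer 2 (SIMPLE_EDGE), Reducer 6 (SIMPLE_EDGE), Reducer 8 (SIMPLE_EDGE)
Reducer 4 <- Map 11 (SIMPLE_EDGE), Map 12 (BROADCAST_EDGE), Reducer 3 (SIMPLE_EDGE)
Reducer 6 <- Map 5 (SIMPLE_EDGE)
Reducer 8 <- Map 10 (SIMPLE_EDGE), Map 13 (SIMPLE_EDGE), Map 7 (SIMPLE_EDGE), Map 9 (SIMPLE_EDGE)
"""

# Matches "Map 12 (BROADCAST_EDGE)" and captures the vertex name and edge type.
VERTEX_RE = re.compile(r"((?:Map|Reducer) \d+) \((\w+)\)")


def parse_edges(text):
    """Return {consumer vertex: {producer vertex: edge type}} (hypothetical helper)."""
    deps = {}
    for line in text.splitlines():
        if "<-" not in line:
            continue
        target, sources = line.split("<-")
        deps[target.strip()] = dict(VERTEX_RE.findall(sources))
    return deps


deps = parse_edges(EDGES_TEXT)

# A topological order lists every producer before the vertices that consume it.
order = list(TopologicalSorter({v: set(s) for v, s in deps.items()}).static_order())

for vertex in order:
    inputs = ", ".join(f"{src} [{kind}]" for src, kind in deps.get(vertex, {}).items())
    print(f"{vertex:<10} <- {inputs or '(table scan)'}")
```

Read this way, Reducer 8 joins the four end-user tables first, Reducer 3 joins that result with the two deduplicated email lists, and Reducer 4 finishes with the merge join against Map 11 and the broadcast map join against Map 12.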