*Some properties on hive-site.xml is* <property> <name>hive.ignore.mapjoin.hint</name> <value>false</value> </property> <property> <name>hive.auto.convert.join</name> <value>true</value> </property> <property> <name>hive.auto.convert.join.noconditionaltask</name> <value>true</value>
*If more information is required,please let us know.* *Thanks.* 2016-01-27 15:20 GMT+08:00 Jone Zhang <joyoungzh...@gmail.com>: > *I have run a query many times, there will be two results without regular.* > *One is 36834699 and other is 18464706.* > > *The query is * > set spark.yarn.queue=soft.high; > set hive.execution.engine=spark; > select /*+mapjoin(t3,t4,t5)*/ > count(1) > from > ( > select > coalesce(t11.qua,t12.qua,t13.qua) qua, > coalesce(t11.scene,t12.lanmu_id,t13.lanmu_id) scene, > coalesce(t11.app_id,t12.appid,t13.app_id) app_id, > expos_pv, > expos_uv, > dload_pv, > dload_uv, > dload_cnt, > dload_user, > evil_dload_cnt, > evil_dload_user, > update_dcnt, > update_duser, > hand_suc_incnt, > hand_suc_inuser, > day_hand_suc_incnt, > day_hand_suc_inuser > from > (select * from t_ed_soft_assist_useraction_stat where ds=20160126)t11 > full outer join > (select * from t_md_soft_lanmu_app_dload_detail where ds=20160126)t12 > on t11.qua=t12.qua and t11.app_id=t12.appid and t11.scene=t12.lanmu_id > full outer join > (select * from t_md_soft_client_install_lanmu where ds=20160126)t13 > on t11.qua=t13.qua and t11.app_id=t13.app_id and t11.scene=t13.lanmu_id > )t1 > left outer join t_rd_qua t3 on t3.ds=20160126 and t1.qua=t3.qua > left outer join t_rd_soft_appnew_last t4 on t4.ds=20160126 and > t1.app_id=t4.app_id > left outer join t_rd_soft_page_conf t5 on t5.ds=20160126 and > t1.scene=t5.pageid and t3.client_type_id=t5.ismtt; > > > *Explain query is* > STAGE DEPENDENCIES: > Stage-2 is a root stage > Stage-1 depends on stages: Stage-2 > Stage-0 depends on stages: Stage-1 > > STAGE PLANS: > Stage: Stage-2 > Spark > DagName: mqq_20160127151826_e8197f40-18d7-430c-9fc8-993facb74534:2 > Vertices: > Map 6 > Map Operator Tree: > TableScan > alias: t3 > Statistics: Num rows: 1051 Data size: 113569 Basic > stats: COMPLETE Column stats: NONE > Spark HashTable Sink Operator > keys: > 0 _col0 (type: string) > 1 qua (type: string) > Local Work: > Map Reduce Local Work > Map 7 > Map Operator Tree: > TableScan > alias: t4 > Statistics: Num rows: 2542751 Data size: 220433659 Basic > stats: COMPLETE Column stats: NONE > Spark HashTable Sink Operator > keys: > 0 UDFToDouble(_col2) (type: double) > 1 UDFToDouble(app_id) (type: double) > Local Work: > Map Reduce Local Work > Map 8 > Map Operator Tree: > TableScan > alias: t5 > Statistics: Num rows: 143 Data size: 28605 Basic stats: > COMPLETE Column stats: NONE > Spark HashTable Sink Operator > keys: > 0 _col1 (type: string), UDFToDouble(_col20) (type: > double) > 1 pageid (type: string), UDFToDouble(ismtt) (type: > double) > Local Work: > Map Reduce Local Work > > Stage: Stage-1 > Spark > Edges: > Reducer 2 <- Map 1 (PARTITION-LEVEL SORT, 5), Map 4 > (PARTITION-LEVEL SORT, 5), Map 5 (PARTITION-LEVEL SORT, 5) > Reducer 3 <- Reducer 2 (GROUP, 1) > DagName: mqq_20160127151826_e8197f40-18d7-430c-9fc8-993facb74534:1 > Vertices: > Map 1 > Map Operator Tree: > TableScan > alias: t_ed_soft_assist_useraction_stat > Statistics: Num rows: 16368107 Data size: 651461220 > Basic stats: COMPLETE Column stats: NONE > Select Operator > expressions: qua (type: string), scene (type: string), > app_id (type: string) > outputColumnNames: _col0, _col1, _col2 > Statistics: Num rows: 16368107 Data size: 651461220 > Basic stats: COMPLETE Column stats: NONE > Reduce Output Operator > key expressions: _col0 (type: string), > UDFToDouble(_col2) (type: double), _col1 (type: string) > sort order: +++ > Map-reduce partition columns: _col0 (type: string), > UDFToDouble(_col2) (type: double), _col1 (type: string) > Statistics: Num rows: 16368107 Data size: 651461220 > Basic stats: COMPLETE Column stats: NONE > value expressions: _col2 (type: string) > Map 4 > Map Operator Tree: > TableScan > alias: t_md_soft_lanmu_app_dload_detail > Statistics: Num rows: 2503976 Data size: 203324640 Basic > stats: COMPLETE Column stats: NONE > Select Operator > expressions: qua (type: string), appid (type: bigint), > lanmu_id (type: string) > outputColumnNames: _col2, _col3, _col4 > Statistics: Num rows: 2503976 Data size: 203324640 > Basic stats: COMPLETE Column stats: NONE > Reduce Output Operator > key expressions: _col2 (type: string), > UDFToDouble(_col3) (type: double), _col4 (type: string) > sort order: +++ > Map-reduce partition columns: _col2 (type: string), > UDFToDouble(_col3) (type: double), _col4 (type: string) > Statistics: Num rows: 2503976 Data size: 203324640 > Basic stats: COMPLETE Column stats: NONE > value expressions: _col3 (type: bigint) > Map 5 > Map Operator Tree: > TableScan > alias: t_md_soft_client_install_lanmu > Statistics: Num rows: 2143605 Data size: 178502520 Basic > stats: COMPLETE Column stats: NONE > Select Operator > expressions: lanmu_id (type: string), qua (type: > string), app_id (type: int) > outputColumnNames: _col13, _col2, _col3 > Statistics: Num rows: 2143605 Data size: 178502520 > Basic stats: COMPLETE Column stats: NONE > Reduce Output Operator > key expressions: _col2 (type: string), > UDFToDouble(_col3) (type: double), _col13 (type: string) > sort order: +++ > Map-reduce partition columns: _col2 (type: string), > UDFToDouble(_col3) (type: double), _col13 (type: string) > Statistics: Num rows: 2143605 Data size: 178502520 > Basic stats: COMPLETE Column stats: NONE > value expressions: _col3 (type: int) > Reducer 2 > Local Work: > Map Reduce Local Work > Reduce Operator Tree: > Join Operator > condition map: > Outer Join 0 to 1 > Outer Join 0 to 2 > keys: > 0 _col0 (type: string), UDFToDouble(_col2) (type: > double), _col1 (type: string) > 1 _col2 (type: string), UDFToDouble(_col3) (type: > double), _col4 (type: string) > 2 _col2 (type: string), UDFToDouble(_col3) (type: > double), _col13 (type: string) > outputColumnNames: _col0, _col1, _col2, _col10, _col11, > _col12, _col26, _col27, _col37 > Statistics: Num rows: 36009836 Data size: 1433214715 Basic > stats: COMPLETE Column stats: NONE > Select Operator > expressions: COALESCE(_col0,_col10,_col26) (type: > string), COALESCE(_col1,_col12,_col37) (type: string), > COALESCE(_col2,_col11,_col27) (type: string) > outputColumnNames: _col0, _col1, _col2 > Statistics: Num rows: 36009836 Data size: 1433214715 > Basic stats: COMPLETE Column stats: NONE > Map Join Operator > condition map: > Left Outer Join0 to 1 > keys: > 0 _col0 (type: string) > 1 qua (type: string) > outputColumnNames: _col1, _col2, _col20 > input vertices: > 1 Map 6 > Statistics: Num rows: 39610820 Data size: 1576536220 > Basic stats: COMPLETE Column stats: NONE > Map Join Operator > condition map: > Left Outer Join0 to 1 > keys: > 0 UDFToDouble(_col2) (type: double) > 1 UDFToDouble(app_id) (type: double) > outputColumnNames: _col1, _col20 > input vertices: > 1 Map 7 > Statistics: Num rows: 43571902 Data size: 1734189879 > Basic stats: COMPLETE Column stats: NONE > Map Join Operator > condition map: > Left Outer Join0 to 1 > keys: > 0 _col1 (type: string), UDFToDouble(_col20) > (type: double) > 1 pageid (type: string), UDFToDouble(ismtt) > (type: double) > input vertices: > 1 Map 8 > Statistics: Num rows: 47929093 Data size: > 1907608908 Basic stats: COMPLETE Column stats: NONE > Group By Operator > aggregations: count(1) > mode: hash > outputColumnNames: _col0 > Statistics: Num rows: 1 Data size: 8 Basic > stats: COMPLETE Column stats: NONE > Reduce Output Operator > sort order: > Statistics: Num rows: 1 Data size: 8 Basic > stats: COMPLETE Column stats: NONE > value expressions: _col0 (type: bigint) > Reducer 3 > Reduce Operator Tree: > Group By Operator > aggregations: count(VALUE._col0) > mode: mergepartial > outputColumnNames: _col0 > Statistics: Num rows: 1 Data size: 8 Basic stats: COMPLETE > Column stats: NONE > File Output Operator > compressed: false > Statistics: Num rows: 1 Data size: 8 Basic stats: > COMPLETE Column stats: NONE > table: > input format: > org.apache.hadoop.mapred.TextInputFormat > output format: > org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat > serde: > org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe > > Stage: Stage-0 > Fetch Operator > limit: -1 > Processor Tree: > ListSink > > Time taken: 3.723 seconds, Fetched: 173 row(s) > > > > *Besh wishes.* > *Thank for any help.* >