*Some properties on hive-site.xml is*

<property>
<name>hive.ignore.mapjoin.hint</name>
<value>false</value>
</property>
<property>
<name>hive.auto.convert.join</name>
<value>true</value>
</property>
<property>
<name>hive.auto.convert.join.noconditionaltask</name>
<value>true</value>


*If more information is required,please let us know.*

*Thanks.*

2016-01-27 15:20 GMT+08:00 Jone Zhang <joyoungzh...@gmail.com>:

> *I have run a query many times, there will be two results without regular.*
> *One is 36834699 and other is 18464706.*
>
> *The query is *
> set spark.yarn.queue=soft.high;
> set hive.execution.engine=spark;
> select /*+mapjoin(t3,t4,t5)*/
>   count(1)
> from
>   (
>   select
>     coalesce(t11.qua,t12.qua,t13.qua) qua,
>     coalesce(t11.scene,t12.lanmu_id,t13.lanmu_id) scene,
>     coalesce(t11.app_id,t12.appid,t13.app_id) app_id,
>     expos_pv,
>     expos_uv,
>     dload_pv,
>     dload_uv,
>     dload_cnt,
>     dload_user,
>     evil_dload_cnt,
>     evil_dload_user,
>     update_dcnt,
>     update_duser,
>     hand_suc_incnt,
>     hand_suc_inuser,
>     day_hand_suc_incnt,
>     day_hand_suc_inuser
>   from
>     (select * from t_ed_soft_assist_useraction_stat where ds=20160126)t11
>     full outer join
>     (select * from t_md_soft_lanmu_app_dload_detail where ds=20160126)t12
>     on t11.qua=t12.qua and t11.app_id=t12.appid and t11.scene=t12.lanmu_id
>     full outer join
>     (select * from t_md_soft_client_install_lanmu  where ds=20160126)t13
>     on t11.qua=t13.qua and t11.app_id=t13.app_id and t11.scene=t13.lanmu_id
>   )t1
>   left outer join t_rd_qua t3 on t3.ds=20160126 and t1.qua=t3.qua
>   left outer join t_rd_soft_appnew_last t4 on t4.ds=20160126 and
> t1.app_id=t4.app_id
>   left outer join t_rd_soft_page_conf t5 on t5.ds=20160126 and
> t1.scene=t5.pageid and t3.client_type_id=t5.ismtt;
>
>
> *Explain query is*
> STAGE DEPENDENCIES:
>   Stage-2 is a root stage
>   Stage-1 depends on stages: Stage-2
>   Stage-0 depends on stages: Stage-1
>
> STAGE PLANS:
>   Stage: Stage-2
>     Spark
>       DagName: mqq_20160127151826_e8197f40-18d7-430c-9fc8-993facb74534:2
>       Vertices:
>         Map 6
>             Map Operator Tree:
>                 TableScan
>                   alias: t3
>                   Statistics: Num rows: 1051 Data size: 113569 Basic
> stats: COMPLETE Column stats: NONE
>                   Spark HashTable Sink Operator
>                     keys:
>                       0 _col0 (type: string)
>                       1 qua (type: string)
>             Local Work:
>               Map Reduce Local Work
>         Map 7
>             Map Operator Tree:
>                 TableScan
>                   alias: t4
>                   Statistics: Num rows: 2542751 Data size: 220433659 Basic
> stats: COMPLETE Column stats: NONE
>                   Spark HashTable Sink Operator
>                     keys:
>                       0 UDFToDouble(_col2) (type: double)
>                       1 UDFToDouble(app_id) (type: double)
>             Local Work:
>               Map Reduce Local Work
>         Map 8
>             Map Operator Tree:
>                 TableScan
>                   alias: t5
>                   Statistics: Num rows: 143 Data size: 28605 Basic stats:
> COMPLETE Column stats: NONE
>                   Spark HashTable Sink Operator
>                     keys:
>                       0 _col1 (type: string), UDFToDouble(_col20) (type:
> double)
>                       1 pageid (type: string), UDFToDouble(ismtt) (type:
> double)
>             Local Work:
>               Map Reduce Local Work
>
>   Stage: Stage-1
>     Spark
>       Edges:
>         Reducer 2 <- Map 1 (PARTITION-LEVEL SORT, 5), Map 4
> (PARTITION-LEVEL SORT, 5), Map 5 (PARTITION-LEVEL SORT, 5)
>         Reducer 3 <- Reducer 2 (GROUP, 1)
>       DagName: mqq_20160127151826_e8197f40-18d7-430c-9fc8-993facb74534:1
>       Vertices:
>         Map 1
>             Map Operator Tree:
>                 TableScan
>                   alias: t_ed_soft_assist_useraction_stat
>                   Statistics: Num rows: 16368107 Data size: 651461220
> Basic stats: COMPLETE Column stats: NONE
>                   Select Operator
>                     expressions: qua (type: string), scene (type: string),
> app_id (type: string)
>                     outputColumnNames: _col0, _col1, _col2
>                     Statistics: Num rows: 16368107 Data size: 651461220
> Basic stats: COMPLETE Column stats: NONE
>                     Reduce Output Operator
>                       key expressions: _col0 (type: string),
> UDFToDouble(_col2) (type: double), _col1 (type: string)
>                       sort order: +++
>                       Map-reduce partition columns: _col0 (type: string),
> UDFToDouble(_col2) (type: double), _col1 (type: string)
>                       Statistics: Num rows: 16368107 Data size: 651461220
> Basic stats: COMPLETE Column stats: NONE
>                       value expressions: _col2 (type: string)
>         Map 4
>             Map Operator Tree:
>                 TableScan
>                   alias: t_md_soft_lanmu_app_dload_detail
>                   Statistics: Num rows: 2503976 Data size: 203324640 Basic
> stats: COMPLETE Column stats: NONE
>                   Select Operator
>                     expressions: qua (type: string), appid (type: bigint),
> lanmu_id (type: string)
>                     outputColumnNames: _col2, _col3, _col4
>                     Statistics: Num rows: 2503976 Data size: 203324640
> Basic stats: COMPLETE Column stats: NONE
>                     Reduce Output Operator
>                       key expressions: _col2 (type: string),
> UDFToDouble(_col3) (type: double), _col4 (type: string)
>                       sort order: +++
>                       Map-reduce partition columns: _col2 (type: string),
> UDFToDouble(_col3) (type: double), _col4 (type: string)
>                       Statistics: Num rows: 2503976 Data size: 203324640
> Basic stats: COMPLETE Column stats: NONE
>                       value expressions: _col3 (type: bigint)
>         Map 5
>             Map Operator Tree:
>                 TableScan
>                   alias: t_md_soft_client_install_lanmu
>                   Statistics: Num rows: 2143605 Data size: 178502520 Basic
> stats: COMPLETE Column stats: NONE
>                   Select Operator
>                     expressions: lanmu_id (type: string), qua (type:
> string), app_id (type: int)
>                     outputColumnNames: _col13, _col2, _col3
>                     Statistics: Num rows: 2143605 Data size: 178502520
> Basic stats: COMPLETE Column stats: NONE
>                     Reduce Output Operator
>                       key expressions: _col2 (type: string),
> UDFToDouble(_col3) (type: double), _col13 (type: string)
>                       sort order: +++
>                       Map-reduce partition columns: _col2 (type: string),
> UDFToDouble(_col3) (type: double), _col13 (type: string)
>                       Statistics: Num rows: 2143605 Data size: 178502520
> Basic stats: COMPLETE Column stats: NONE
>                       value expressions: _col3 (type: int)
>         Reducer 2
>             Local Work:
>               Map Reduce Local Work
>             Reduce Operator Tree:
>               Join Operator
>                 condition map:
>                      Outer Join 0 to 1
>                      Outer Join 0 to 2
>                 keys:
>                   0 _col0 (type: string), UDFToDouble(_col2) (type:
> double), _col1 (type: string)
>                   1 _col2 (type: string), UDFToDouble(_col3) (type:
> double), _col4 (type: string)
>                   2 _col2 (type: string), UDFToDouble(_col3) (type:
> double), _col13 (type: string)
>                 outputColumnNames: _col0, _col1, _col2, _col10, _col11,
> _col12, _col26, _col27, _col37
>                 Statistics: Num rows: 36009836 Data size: 1433214715 Basic
> stats: COMPLETE Column stats: NONE
>                 Select Operator
>                   expressions: COALESCE(_col0,_col10,_col26) (type:
> string), COALESCE(_col1,_col12,_col37) (type: string),
> COALESCE(_col2,_col11,_col27) (type: string)
>                   outputColumnNames: _col0, _col1, _col2
>                   Statistics: Num rows: 36009836 Data size: 1433214715
> Basic stats: COMPLETE Column stats: NONE
>                   Map Join Operator
>                     condition map:
>                          Left Outer Join0 to 1
>                     keys:
>                       0 _col0 (type: string)
>                       1 qua (type: string)
>                     outputColumnNames: _col1, _col2, _col20
>                     input vertices:
>                       1 Map 6
>                     Statistics: Num rows: 39610820 Data size: 1576536220
> Basic stats: COMPLETE Column stats: NONE
>                     Map Join Operator
>                       condition map:
>                            Left Outer Join0 to 1
>                       keys:
>                         0 UDFToDouble(_col2) (type: double)
>                         1 UDFToDouble(app_id) (type: double)
>                       outputColumnNames: _col1, _col20
>                       input vertices:
>                         1 Map 7
>                       Statistics: Num rows: 43571902 Data size: 1734189879
> Basic stats: COMPLETE Column stats: NONE
>                       Map Join Operator
>                         condition map:
>                              Left Outer Join0 to 1
>                         keys:
>                           0 _col1 (type: string), UDFToDouble(_col20)
> (type: double)
>                           1 pageid (type: string), UDFToDouble(ismtt)
> (type: double)
>                         input vertices:
>                           1 Map 8
>                         Statistics: Num rows: 47929093 Data size:
> 1907608908 Basic stats: COMPLETE Column stats: NONE
>                         Group By Operator
>                           aggregations: count(1)
>                           mode: hash
>                           outputColumnNames: _col0
>                           Statistics: Num rows: 1 Data size: 8 Basic
> stats: COMPLETE Column stats: NONE
>                           Reduce Output Operator
>                             sort order:
>                             Statistics: Num rows: 1 Data size: 8 Basic
> stats: COMPLETE Column stats: NONE
>                             value expressions: _col0 (type: bigint)
>         Reducer 3
>             Reduce Operator Tree:
>               Group By Operator
>                 aggregations: count(VALUE._col0)
>                 mode: mergepartial
>                 outputColumnNames: _col0
>                 Statistics: Num rows: 1 Data size: 8 Basic stats: COMPLETE
> Column stats: NONE
>                 File Output Operator
>                   compressed: false
>                   Statistics: Num rows: 1 Data size: 8 Basic stats:
> COMPLETE Column stats: NONE
>                   table:
>                       input format:
> org.apache.hadoop.mapred.TextInputFormat
>                       output format:
> org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat
>                       serde:
> org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe
>
>   Stage: Stage-0
>     Fetch Operator
>       limit: -1
>       Processor Tree:
>         ListSink
>
> Time taken: 3.723 seconds, Fetched: 173 row(s)
>
>
>
> *Besh wishes.*
> *Thank for any help.*
>

Reply via email to