[ 
https://issues.apache.org/jira/browse/HIVE-10671?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14539287#comment-14539287
 ] 

Rui Li commented on HIVE-10671:
-------------------------------

Why does each table have 2 sizes? The following is the output of the same 
command on my cluster:
{code}
[root@node13-1 ~]# hadoop fs -du -h /user/hive/warehouse/tpch_flat_orc_320.db
2.4 G    /user/hive/warehouse/tpch_flat_orc_320.db/customer
53.8 G   /user/hive/warehouse/tpch_flat_orc_320.db/lineitem
1.7 K    /user/hive/warehouse/tpch_flat_orc_320.db/nation
12.6 G   /user/hive/warehouse/tpch_flat_orc_320.db/orders
1.2 G    /user/hive/warehouse/tpch_flat_orc_320.db/part
9.2 G    /user/hive/warehouse/tpch_flat_orc_320.db/partsupp
980      /user/hive/warehouse/tpch_flat_orc_320.db/region
156.8 M  /user/hive/warehouse/tpch_flat_orc_320.db/supplier
{code}
Q22 runs for about 57s in both yarn-client and yarn-cluster modes on my side.
I'll try other cases.

> yarn-cluster mode offers a degraded performance from yarn-client [Spark 
> Branch]
> -------------------------------------------------------------------------------
>
>                 Key: HIVE-10671
>                 URL: https://issues.apache.org/jira/browse/HIVE-10671
>             Project: Hive
>          Issue Type: Bug
>          Components: Spark
>            Reporter: Xuefu Zhang
>            Assignee: Rui Li
>
> With Hive on Spark, users noticed that in certain cases 
> spark.master=yarn-client offers 2x or 3x better performance than 
> spark.master=yarn-cluster. However, yarn-cluster is what we recommend and 
> support. Thus, we should investigate and fix the problem. One such 
> query is TPC-H Q22.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
