[jira] [Commented] (HIVE-9153) Evaluate CombineHiveInputFormat versus HiveInputFormat [Spark Branch]

Xuefu Zhang (JIRA) Thu, 25 Dec 2014 11:44:50 -0800

    [ https://issues.apache.org/jira/browse/HIVE-9153?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14258841#comment-14258841 ]


Xuefu Zhang commented on HIVE-9153:
-----------------------------------

Re: the Utilities.getBaseWork() changes: I suppose Rui is trying to clean up 
some redundant code. The changed code is equivalent to the old code only if 
"name" is the full path of the plan file on HDFS in non-local mode. That is 
very likely the case, but it needs to be confirmed.

> Evaluate CombineHiveInputFormat versus HiveInputFormat [Spark Branch]
> ---------------------------------------------------------------------
>
>                 Key: HIVE-9153
>                 URL: https://issues.apache.org/jira/browse/HIVE-9153
>             Project: Hive
>          Issue Type: Sub-task
>          Components: Spark
>    Affects Versions: spark-branch
>            Reporter: Brock Noland
>            Assignee: Rui Li
>         Attachments: HIVE-9153.1-spark.patch, HIVE-9153.1-spark.patch, 
> screenshot.PNG
>
>
> The default InputFormat is {{CombineHiveInputFormat}}, so Hive on Spark 
> (HOS) uses it. However, Tez uses {{HiveInputFormat}}. Since tasks are 
> relatively cheap in Spark, it might make sense for us to use 
> {{HiveInputFormat}} as well. We should evaluate this on a query that has 
> many input splits, such as 
> {{select count(\*) from store_sales where something is not null}}.
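
For reference, a minimal sketch of how the comparison described above could be run from a Hive session. The property names hive.execution.engine and hive.input.format and the fully qualified class names come from Hive's configuration, not from this message, and "something" is the placeholder column used in the issue description, so substitute a real nullable column of store_sales. Treat this as one possible setup rather than the procedure actually used for this issue.

```sql
-- Assumption: a Hive build with Spark support (the spark-branch in this issue).
set hive.execution.engine=spark;

-- Baseline: the default input format, which combines small splits.
set hive.input.format=org.apache.hadoop.hive.ql.io.CombineHiveInputFormat;
select count(*) from store_sales where something is not null;

-- Candidate: one split per underlying input split, as Tez uses.
set hive.input.format=org.apache.hadoop.hive.ql.io.HiveInputFormat;
select count(*) from store_sales where something is not null;
```

Comparing the number of tasks and the wall-clock time of the two runs on the same data would show whether the extra tasks created by HiveInputFormat cost more than the combining saves.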



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
