[ 
https://issues.apache.org/jira/browse/HIVE-2925?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13423792#comment-13423792
 ] 

Namit Jain commented on HIVE-2925:
----------------------------------

It would be very difficult to deploy it this way.

In general, I can think of the following:

1. For queries with limits, this optimization should be enabled.
2. Ideally, there should be a threshold on the limit.
3. For queries without limits, given that we don't have a cost-based
   optimizer, it might be good to have a threshold on the total input data.

In other words, non-MR fetch generally makes sense in the following cases:
1. select from a small table (where small is configurable)
2. select from a big table is OK if there is a limit
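
The rules above can be sketched as a small decision function. This is only an
illustration in Python, not Hive code: the function name, the parameter names,
and the default thresholds are all hypothetical.

```python
from typing import Optional

# Hypothetical thresholds; neither the names nor the values come from Hive.
MAX_INPUT_BYTES = 1 << 30   # "small table" cutoff (1 GiB, assumed)
MAX_LIMIT = 10_000          # largest LIMIT still served without MR (assumed)


def use_fetch_task(input_bytes: int, limit: Optional[int],
                   max_input_bytes: int = MAX_INPUT_BYTES,
                   max_limit: int = MAX_LIMIT) -> bool:
    """Return True if a simple select should skip MR and use a fetch task."""
    # Case 1: the table is small (where "small" is configurable).
    if input_bytes <= max_input_bytes:
        return True
    # Case 2: the table is big, but the query has a limit under the threshold.
    return limit is not None and limit <= max_limit
```

With these assumed defaults, a 1 TB table with "limit 10" would use the fetch
task, while the same table without a limit would still go through MR.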

Note that it is still possible to get a plan where this optimization might 
not make sense.

For example: select col1 from T where col2 = 10 limit 10;

It is possible that very few rows have col2 equal to 10, so running without an 
MR job may really slow this query down, since a single fetch task would have 
to scan most of the table.
Solving that would be more difficult without more statistics. 
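
To make the cost concrete, here is a toy simulation in Python. It is not Hive
code; the function name and the table contents are made up for illustration.
It counts how many rows a single serial fetch has to read before the limit is
satisfied.

```python
from typing import Callable, Iterable, Tuple

Row = Tuple[int, int]  # (col1, col2)


def rows_scanned_for_limit(rows: Iterable[Row],
                           predicate: Callable[[Row], bool],
                           limit: int) -> int:
    """Count rows a serial fetch reads before `limit` rows pass the filter."""
    scanned = 0
    matched = 0
    for row in rows:
        scanned += 1
        if predicate(row):
            matched += 1
            if matched == limit:
                break
    return scanned


# Common value: every row has col2 == 10, so the scan stops after 10 rows.
dense = [(i, 10) for i in range(1_000_000)]
# Rare value: only 5 rows have col2 == 10, so the limit is never reached
# and the whole table is read by one process.
sparse = [(i, 10 if i % 200_000 == 0 else 0) for i in range(1_000_000)]

dense_cost = rows_scanned_for_limit(dense, lambda r: r[1] == 10, 10)
sparse_cost = rows_scanned_for_limit(sparse, lambda r: r[1] == 10, 10)
```

In the dense case the fetch reads 10 rows; in the sparse case it reads all
1,000,000, which is the situation where an MR job would likely be faster.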

But it may be a good idea to add more config parameters to tune 
hive.aggresive.fetch.task.conversion appropriately.
That can also be done in a follow-up patch, and is independent of this one.
                
> Support non-MR fetching for simple queries with select/limit/filter 
> operations only
> -----------------------------------------------------------------------------------
>
>                 Key: HIVE-2925
>                 URL: https://issues.apache.org/jira/browse/HIVE-2925
>             Project: Hive
>          Issue Type: Improvement
>    Affects Versions: 0.10.0
>            Reporter: Navis
>            Assignee: Navis
>            Priority: Trivial
>         Attachments: HIVE-2925.D2607.1.patch, HIVE-2925.D2607.2.patch, 
> HIVE-2925.D2607.3.patch, HIVE-2925.D2607.4.patch
>
>
> It's trivial but frequently asked for by end users. Currently, select queries 
> with simple conditions or a limit have to run an MR job, which takes some 
> time, especially for big tables, and irritates users.
> For such simple queries, using a fetch task would make them happy.


        
